Re: -stable from today dumps core with drm-510-kmod and some graphical clients

From: Mathias Picker <Mathias.Picker_at_virtual-earth.de>
Date: Tue, 11 Apr 2023 15:03:06 UTC
On 11 April 2023 at 16:17:13 CEST, "Ulrich Spörlein" <uqs@freebsd.org> wrote:
>On Thu, Mar 30, 2023 at 3:29 PM Mathias Picker <
>Mathias.Picker@virtual-earth.de> wrote:
>
>>
>> Cy Schubert <Cy.Schubert@cschubert.com> writes:
>>
>> > On Mon, 27 Mar 2023 23:43:35 +0200
>> > Mathias Picker <Mathias.Picker@virtual-earth.de> wrote:
>> >
>> >> On 27 March 2023 at 23:05:35 CEST, Cy Schubert
>> >> <Cy.Schubert@cschubert.com> wrote:
>> >> >In message
>> >> ><8b47d0a4-a8f1-1841-ee59-3949fe69cbd7@ShaneWare.Biz>, Shane
>> >> >Ambler writes:
>> >> >> On 26/3/23 01:37, Mathias Picker wrote:
>> >> >> >
>> >> >> > Starting sddm works fine, starting my normal session
>> >> >> > crashes or freezes
>> >> >> > FreeBSD.
>> >> >> >
>> >> >> > I can find no error messages after a reboot.
>> >> >> >
>> >> >> > I found out that I can start xterm or emacs (exwm)
>> >> >> > without problems,
>> >> >> > xrandr works with external screen, but once I start
>> >> >> > anything more
>> >> >> > demanding (I guess demanding of the GPU) everything
>> >> >> > freezes or FreeBSD
>> >> >> > even reboots.
>> >> >> >
>> >> >> > “Demanding” means even simple things like
>> >> >> > qterminal. I tried firefox and
>> >> >> > blender and then I had it with the reboots and
>> >> >> > didn’t try anything else.
>> >> >> > xedit works fine :)
>> >> >> >
>> >> >> > I have nothing in the logs, I have no idea where to look
>> >> >> > or how to debug
>> >> >> > this.
>> >> >> >
>> >> >> > Any ideas, tips, or help greatly appreciated.
>> >> >>
>> >> >>
>> >> >> FreeBSD Developers Handbook Chapter 10: Kernel Debugging
>> >> >>
>> >> >> https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/
>> >> >>
>> >> >> Running stable, kernel dumps may already be enabled, look in
>> >> >> /var/crash
>> >> >>
>> >> >> By enabling a kernel dump when it panics (dumpdev="AUTO" in
>> >> >> rc.conf) the
>> >> >> kernel core is saved to swap space, then on reboot gets
>> >> >> copied to
>> >> >> dumpdir (/var/crash) where you can then use kgdb (from
>> >> >> devel/gdb) to get
>> >> >> a stack trace to find where the panic happened.
>> >> >
>> >> >drm-*-kmod probably needs a rebuild. Likely a data structure
>> >> >changed. In my
>> >> >experience a simple rebuild of the port solves 90% of
>> >> >drm-*-kmod crash
>> >> >problems.
>> >> >
>> >> Hi Cy,
>> >>
>> >> sorry I didn't mention that, but I did rebuild drm-kmod, I
>> >> actually do it after every new kernel build, just to be on the
>> >> safe side.
>> >>
>> >> I switched my swap to non-encrypted and will look if I can get
>> >> any information from the kernel dump tomorrow.
>> >>
>> >> Oh, and it's on a Thinkpad X1 Yoga 3rd gen, I just noticed I
>> >> didn't mention this.
>> >
>> > It may be worth trying drm-515-kmod as some MFC that works with
>> > 515 and
>> > not 510 may have been committed. Linux-KPI commits are the usual
>> > suspects.
>> >
>> > I use drm-515 with 14-CURRENT.
>>
>> Finally I found the time for a kernel crash dump.
>> This is what kgdb says
>>
>> mathiasp:amd64.amd64/sys/GENERIC% sudo kgdb kernel /var/crash/vmcore.2
>> GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
>> Copyright (C) 2023 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later
>> <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.
>> Type "show copying" and "show warranty" for details.
>> This GDB was configured as "x86_64-portbld-freebsd13.1".
>> Type "show configuration" for configuration details.
>> For bug reporting instructions, please see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>> Find the GDB manual and other documentation resources online at:
>>     <http://www.gnu.org/software/gdb/documentation/>.
>>
>> For help, type "help".
>> Type "apropos word" to search for commands related to "word"...
>> Reading symbols from kernel...
>> Reading symbols from
>> /usr/obj/usr/src/amd64.amd64/sys/GENERIC/kernel.debug...
>>
>> Unread portion of the kernel message buffer:
>>
>>
>> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>> 55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
>> (offsetof(struct pcpu,
>> (kgdb) backtrace
>> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>> #1  doadump (textdump=<optimized out>) at
>>  /usr/src/sys/kern/kern_shutdown.c:396
>> #2  0xffffffff80c07c2a in kern_reboot (howto=260) at
>>  /usr/src/sys/kern/kern_shutdown.c:484
>> #3  0xffffffff80c080ce in vpanic (fmt=<optimized out>,
>>  ap=ap@entry=0xfffffe01341fab50) at
>>  /usr/src/sys/kern/kern_shutdown.c:923
>> #4  0xffffffff80c07f03 in panic (fmt=<unavailable>) at
>>  /usr/src/sys/kern/kern_shutdown.c:847
>> #5  0xffffffff810c1fa7 in trap_fatal (frame=0xfffffe01341fac40,
>>  eva=0) at /usr/src/sys/amd64/amd64/trap.c:942
>> #6  0xffffffff810c1fff in trap_pfault (frame=0xfffffe01341fac40,
>>  usermode=false, signo=<optimized out>, ucode=<optimized out>)
>>     at /usr/src/sys/amd64/amd64/trap.c:761
>> #7  <signal handler called>
>> #8  0xffffffff84a07067 in shmem_get_pages () from
>>  /boot/modules/i915kms.ko
>> #9  0x0000000300000015 in ?? ()
>> #10 0x0000000000000060 in ?? ()
>> #11 0x0000000000000060 in ?? ()
>> #12 0x0000000000060000 in ?? ()
>> #13 0xfffffe00dc365a80 in ?? ()
>> #14 0xfffff00100000060 in ?? ()
>> #15 0xfffff8003e270c00 in ?? ()
>> #16 0x00000000fffff000 in ?? ()
>> #17 0xfffff8002138fc20 in ?? ()
>> #18 0xfffffe00dc365a80 in ?? ()
>> #19 0x0000000000000060 in ?? ()
>> #20 0xfffff8003e270c00 in ?? ()
>> #21 0x0000000000000060 in ?? ()
>> #22 0xfffffe0131e0fc80 in ?? ()
>> #23 0xfffffe01341fade0 in ?? ()
>> #24 0xffffffff84a07596 in shmem_pwrite () from
>>  /boot/modules/i915kms.ko
>> #25 0x0000000000000000 in ?? ()
>> (kgdb)
>>
>>
>> Anything else I can do to help?
>>
>> I’m now building drm-515-kmod, let’s see how that works in
>> -stable.
>>
>> /Mathias
>>
>>
>Any updates here? I just ran into this myself and am very close to just
>installing Linux on my laptop, tbh.

drm-515-kmod does not build, but RC6 works fine.

I have not tried -stable again; too much work at the moment.

Good luck,

Mathias


>
>I've rebuilt stable/13 today, then rebuilt the 510-kmod (because the
>515-kmod doesn't even build) and pretty much anything that's not an XTerm
>will panic/reboot the machine (a Thinkpad T490 with Intel GPU).
>
>dmesg got this to say:
>
>Fatal trap 12: page fault while in kernel mode
>cpuid = 1; apic id = 02
>fault virtual address   = 0x0
>fault code              = supervisor read data, page not present
>instruction pointer     = 0x20:0xffffffff84430626
>stack pointer           = 0x28:0xfffffe0140c83cf0
>frame pointer           = 0x28:0xfffffe0140c83d70
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, long 1, def32 0, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 0 (i915-userptr-acquir)
>trap number             = 12
>panic: page fault
>cpuid = 1
>time = 1681221523
>KDB: stack backtrace:
>#0 0xffffffff80c5fc15 at kdb_backtrace+0x65
>#1 0xffffffff80c12e02 at vpanic+0x152
>#2 0xffffffff80c12ca3 at panic+0x43
>#3 0xffffffff810d1577 at trap_fatal+0x387
>#4 0xffffffff810d15cf at trap_pfault+0x4f
>#5 0xffffffff810a8568 at calltrap+0x8
>#6 0xffffffff84430c02 at __i915_gem_userptr_get_pages_worker+0x1f2
>#7 0xffffffff80e80883 at linux_work_fn+0xe3
>#8 0xffffffff80c746f1 at taskqueue_run_locked+0x181
>#9 0xffffffff80c759b3 at taskqueue_thread_loop+0xc3
>#10 0xffffffff80bcf55d at fork_exit+0x7d
>#11 0xffffffff810a95de at fork_trampoline+0xe
>
>It apparently dumps core, will have to reacquaint myself with how to poke
>at this some more...
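
For anyone comparing panics like the one quoted above, here is a minimal sketch (not something posted in this thread) that parses the "KDB: stack backtrace" lines from dmesg into structured frames and picks out the first frame listed after calltrap, which is where the driver code shows up in this trace. The names Frame, parse_backtrace and first_frame_after are made up for illustration, and the line format is assumed to match the dmesg output pasted here, optionally with mail quote markers in front.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    index: int     # frame number, e.g. 6 for "#6"
    address: int   # address printed for the frame
    symbol: str    # function name, e.g. "calltrap"
    offset: int    # offset into the function


# Matches lines such as
#   "#6 0xffffffff84430c02 at __i915_gem_userptr_get_pages_worker+0x1f2"
# and tolerates leading mail quote markers (">") and whitespace.
FRAME_RE = re.compile(
    r"^[>\s]*#(\d+)\s+(0x[0-9a-fA-F]+)\s+at\s+([A-Za-z_][\w.]*)\+(0x[0-9a-fA-F]+)\s*$"
)


def parse_backtrace(text: str) -> List[Frame]:
    """Return every backtrace frame found in text, in listing order."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append(
                Frame(int(m.group(1)), int(m.group(2), 16),
                      m.group(3), int(m.group(4), 16))
            )
    return frames


def first_frame_after(frames: List[Frame], marker: str = "calltrap") -> Optional[Frame]:
    """Return the frame listed directly after the marker symbol, if any."""
    for i, frame in enumerate(frames[:-1]):
        if frame.symbol == marker:
            return frames[i + 1]
    return None
```

Fed the dmesg excerpt above, first_frame_after(parse_backtrace(text)) should return the __i915_gem_userptr_get_pages_worker frame. That is only a starting point for a bug report; the kgdb session on the vmcore remains the more reliable source.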



Mathias Picker
Geschäftsführer
virtual earth Gesellschaft für Wissens re/prä sentation mbH
Westendstr. 142
80339 München
+4915256178344