Kernel crash during video transcoding
Alexandre Levy
a13xlevy at gmail.com
Sat Aug 15 19:35:19 UTC 2020
Hi,
I could finally generate a crash dump even with a black screen, I had to
guess I was in the crash handler and I type "dump" and enter which worked.
The driver logs "[drm] Cannot find any crtc or sizes" which I guess is the
reason why I couldn't see anything on my screen.
Back to the initial problem, I could start a kgdb session, loaded the
i915kms.ko symbols and here are the results :
(kgdb) bt
#0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:394
#2 0xffffffff8049c26a in db_dump (dummy=<optimized out>,
dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>) at
/usr/src/sys/ddb/db_command.c:575
#3 0xffffffff8049c02c in db_command (last_cmdp=<optimized out>,
cmd_table=<optimized out>, dopager=1) at /usr/src/sys/ddb/db_command.c:482
#4 0xffffffff8049bd9d in db_command_loop () at
/usr/src/sys/ddb/db_command.c:535
#5 0xffffffff8049f048 in db_trap (type=<optimized out>, code=<optimized
out>) at /usr/src/sys/ddb/db_main.c:270
#6 0xffffffff80c1b374 in kdb_trap (type=3, code=0, tf=<optimized out>) at
/usr/src/sys/kern/subr_kdb.c:699
#7 0xffffffff8100ca98 in trap (frame=0xfffffe00d7567300) at
/usr/src/sys/amd64/amd64/trap.c:576
#8 <signal handler called>
#9 kdb_enter (why=0xffffffff811d5de0 "panic", msg=<optimized out>) at
/usr/src/sys/kern/subr_kdb.c:486
#10 0xffffffff80bd00be in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:902
#11 0xffffffff80bcfe53 in panic (fmt=0xffffffff81c8c7c8 <cnputs_mtx>
"\b\214\031\201\377\377\377\377") at /usr/src/sys/kern/kern_shutdown.c:839
#12 0xffffffff8100cee7 in trap_fatal (frame=0xfffffe00d7567600, eva=0) at
/usr/src/sys/amd64/amd64/trap.c:915
#13 0xffffffff8100c360 in trap (frame=0xfffffe00d7567600) at
/usr/src/sys/amd64/amd64/trap.c:212
#14 <signal handler called>
#15 _rw_wowned (c=0x2659c92217d5aa52) at /usr/src/sys/kern/kern_rwlock.c:270
#16 0xffffffff80ec23ed in vm_page_busy_acquire (m=0xfffffe00040ff9e8,
allocflags=16) at /usr/src/sys/vm/vm_page.c:884
#17 0xffffffff82b4e980 in remap_io_mapping (vma=0xfffff80315148300,
addr=<optimized out>, pfn=<optimized out>, size=<optimized out>,
iomap=<optimized out>)
at
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.3_4/drivers/gpu/drm/i915/intel_freebsd.c:193
#18 0xffffffff82be1c5f in i915_gem_fault (dummy=<optimized out>,
vmf=<optimized out>)
at
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.3_4/drivers/gpu/drm/i915/gem/i915_gem_mman.c:367
#19 0xffffffff82cb5ddf in linux_cdev_pager_populate
(vm_obj=0xfffff80368501420, pidx=<optimized out>, fault_type=<optimized
out>, max_prot=<optimized out>,
first=0xfffffe00d7567868, last=0xfffffe00d7567888) at
/usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:554
#20 0xffffffff80ea9e8f in vm_pager_populate (object=0x2659c92217d5aa52,
pidx=18446741874754451944, fault_type=0, max_prot=0 '\000',
first=<optimized out>, last=<optimized out>)
at /usr/src/sys/vm/vm_pager.h:172
#21 vm_fault_populate (fs=<optimized out>) at /usr/src/sys/vm/vm_fault.c:444
#22 vm_fault_allocate (fs=<optimized out>) at
/usr/src/sys/vm/vm_fault.c:1028
#23 vm_fault (map=<optimized out>, vaddr=<optimized out>,
fault_type=<optimized out>, fault_flags=<optimized out>, m_hold=<optimized
out>) at /usr/src/sys/vm/vm_fault.c:1338
#24 0xffffffff80ea98ee in vm_fault_trap (map=0xfffffe00c0f539e8,
vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=0,
signo=0xfffffe00d7567ac4,
ucode=0xfffffe00d7567ac0) at /usr/src/sys/vm/vm_fault.c:585
#25 0xffffffff8100d0de in trap_pfault (frame=0xfffffe00d7567b00,
usermode=<optimized out>, signo=<optimized out>, ucode=0xffffffff81d1de80
<w_locklistdata+160624>)
at /usr/src/sys/amd64/amd64/trap.c:817
#26 0xffffffff8100c72c in trap (frame=0xfffffe00d7567b00) at
/usr/src/sys/amd64/amd64/trap.c:340
#27 <signal handler called>
#28 0x000000080296659a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffbf38
(kgdb) list *0xffffffff82be1c5f
0xffffffff82be1c5f is in i915_gem_fault
(/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.3_4/drivers/gpu/drm/i915/gem/i915_gem_mman.c:367).
362 ret = i915_vma_pin_fence(vma);
363 if (ret)
364 goto err_unpin;
365
366 /* Finally, remap it using the new GTT offset */
367 ret = remap_io_mapping(area,
368 area->vm_start +
(vma->ggtt_view.partial.offset << PAGE_SHIFT),
369 (ggtt->gmadr.start +
vma->node.start) >> PAGE_SHIFT,
370 min_t(u64, vma->size, area->vm_end -
area->vm_start),
371 &ggtt->iomap);
(kgdb) list *0xffffffff82b4e980
0xffffffff82b4e980 is in remap_io_mapping
(/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.3_4/drivers/gpu/drm/i915/intel_freebsd.c:193).
188 pidx++, pa += PAGE_SIZE) {
189 retry:
190 m = vm_page_grab(vm_obj, pidx, VM_ALLOC_NOCREAT);
191 if (m == NULL) {
192 m = PHYS_TO_VM_PAGE(pa);
193 if (!vm_page_busy_acquire(m,
VM_ALLOC_WAITFAIL))
194 goto retry;
195 if (vm_page_insert(m, vm_obj, pidx)) {
196 vm_page_xunbusy(m);
197 VM_OBJECT_WUNLOCK(vm_obj);
(kgdb) list *0xffffffff80ec23ed
0xffffffff80ec23ed is in vm_page_busy_acquire
(/usr/src/sys/vm/vm_page.c:884).
879 if (vm_page_tryacquire(m, allocflags))
880 return (true);
881 if ((allocflags & VM_ALLOC_NOWAIT) != 0)
882 return (false);
883 if (obj != NULL)
884 locked = VM_OBJECT_WOWNED(obj);
885 else
886 locked = false;
887 MPASS(locked || vm_page_wired(m));
888 if (_vm_page_busy_sleep(obj, m, m->pindex, "vmpba",
allocflags,
It seems like the problem occured when calling vm_page_busy_acquire(m,
VM_ALLOC_WAITFAIL) where m might be a NULL pointer ? I am very new to
kernel debugging so not sure where to go from there.
Thanks.
Le lun. 10 août 2020 à 12:04, Hans Petter Selasky <hps at selasky.org> a
écrit :
> Hi,
>
> On 2020-08-10 12:59, Alexandre Levy wrote:
> > Looking at the code, the error happens during the call to VM_OBJECT_WLOCK
> > (memory page locking for write ?) in the intel_freebsd.c (see [1] below).
> > I'm out for a few days but I'll try to dig more into it when I'm back
> next
> > weekend although I have no experience in the drm-devel-kmod codebase. In
> > the meantime if you have any suggestions on debugging this further I'm
> > happy to follow them.
>
> The problem is likely that the vm_obj is NULL.
>
> I think I recall that this function is special and can only be called
> from a certain context, unlike in Linux. Will need the full backtrace
> with line numbers in order to debug this.
>
> --HPS
>
More information about the freebsd-current
mailing list