Re: Patched gpsd and /dev/pps0 results in "sleeping thread" kernel panic
- In reply to: Warner Losh : "Re: Patched gpsd and /dev/pps0 results in "sleeping thread" kernel panic"
Date: Wed, 01 Sep 2021 06:04:07 UTC
On 8/31/21 9:35 PM, Warner Losh wrote:
> Either I'm missing something (likely am), or this might fix it up,
> or at least get away from the warning:
>
> https://reviews.freebsd.org/D31763
>
> Note: I can't recall why ppbus has to be locked for this call.
> This code dates from the very earliest days of locking and
> so may do things simply because it seemed like a good idea
> without a specific notion as to what that lock is protecting. If
> so, the real fix may be to not take the lock in pps_ioctl at
> all and maybe instead use a reference count (the most
> often reason for 'a good idea' was to keep the device
> from going away, though this is a parent lock, not a
> child one so I'm less sure about that being the reason).

The crash looks the same or at least very similar to the unpatched
kernel.

If you'd like to experiment with switching from the lock to a
reference count I am able to test that too (as well as testing that
it doesn't break with the ntpd's normal use of /dev/pps0).

(Do you prefer comments/traces/feedback in this thread or in the
review?)

Thanks!

		Craig

toc2 1 # kgdb /boot/kernel/kernel /var/crash/vmcore.2
GNU gdb (GDB) 10.2 [GDB v10.2 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.2".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /boot/kernel.LBLNET/kernel.debug...

Unread portion of the kernel message buffer:
Sleeping thread (tid 101007, pid 1805) owns a non-sleepable lock
KDB: stack backtrace of thread 101007:
sched_switch() at sched_switch+0x630/frame 0xfffffe0070e3b760
mi_switch() at mi_switch+0xd4/frame 0xfffffe0070e3b790
sleepq_catch_signals() at sleepq_catch_signals+0x403/frame 0xfffffe0070e3b7e0
sleepq_timedwait_sig() at sleepq_timedwait_sig+0x14/frame 0xfffffe0070e3b820
_sleep() at _sleep+0x1b3/frame 0xfffffe0070e3b8a0
pps_ioctl() at pps_ioctl+0x298/frame 0xfffffe0070e3b8f0
ppsioctl() at ppsioctl+0x48/frame 0xfffffe0070e3b920
devfs_ioctl() at devfs_ioctl+0xb0/frame 0xfffffe0070e3b970
VOP_IOCTL_APV() at VOP_IOCTL_APV+0x7b/frame 0xfffffe0070e3b9a0
vn_ioctl() at vn_ioctl+0x16a/frame 0xfffffe0070e3bab0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe0070e3bad0
kern_ioctl() at kern_ioctl+0x2b7/frame 0xfffffe0070e3bb30
sys_ioctl() at sys_ioctl+0xfa/frame 0xfffffe0070e3bc00
amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe0070e3bd30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0070e3bd30
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8004c899a, rsp = 0x7fffdfdfc6a8, rbp = 0x7fffdfdfc730 ---
panic: sleeping thread
cpuid = 8
time = 1630475518
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe005ab73ab0
vpanic() at vpanic+0x17b/frame 0xfffffe005ab73b00
panic() at panic+0x43/frame 0xfffffe005ab73b60
propagate_priority() at propagate_priority+0x282/frame 0xfffffe005ab73b90
turnstile_wait() at turnstile_wait+0x30c/frame 0xfffffe005ab73be0
__mtx_lock_sleep() at __mtx_lock_sleep+0x199/frame 0xfffffe005ab73c70
ppcintr() at ppcintr+0x2a0/frame 0xfffffe005ab73c90
ithread_loop() at ithread_loop+0x23c/frame 0xfffffe005ab73cf0
fork_exit() at fork_exit+0x7e/frame 0xfffffe005ab73d30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe005ab73d30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 2m39s
Dumping 593 out of 12240 MB:..3%..11%..22%..33%..41%..52%..63%..71%..81%..92%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55	/usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at ../../../kern/kern_shutdown.c:371
#2  0xffffffff80b83b2a in kern_reboot (howto=260)
    at ../../../kern/kern_shutdown.c:451
#3  0xffffffff80b83f83 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at ../../../kern/kern_shutdown.c:880
#4  0xffffffff80b83da3 in panic (fmt=<unavailable>)
    at ../../../kern/kern_shutdown.c:807
#5  0xffffffff80be71a2 in propagate_priority (td=0xfffff801c4418000)
    at ../../../kern/subr_turnstile.c:228
#6  0xffffffff80be7d6c in turnstile_wait (ts=0xfffff800039bae40,
    owner=<optimized out>, queue=0) at ../../../kern/subr_turnstile.c:785
#7  0xffffffff80b62cf9 in __mtx_lock_sleep (c=0xfffff80003932ad0,
    v=<optimized out>) at ../../../kern/kern_mutex.c:654
#8  0xffffffff8086fd10 in ppcintr (arg=0xfffff80003932a00)
    at ../../../dev/ppc/ppc.c:1546
#9  0xffffffff80b463cc in intr_event_execute_handlers (p=<optimized out>,
    ie=0xfffff800030d9d00) at ../../../kern/kern_intr.c:1143
#10 ithread_execute_handlers (p=<optimized out>, ie=0xfffff800030d9d00)
    at ../../../kern/kern_intr.c:1156
#11 ithread_loop (arg=0xfffff800039aea00) at ../../../kern/kern_intr.c:1236
#12 0xffffffff80b42e6e in fork_exit (
    callout=0xffffffff80b46190 <ithread_loop>, arg=0xfffff800039aea00,
    frame=0xfffffe005ab73d40) at ../../../kern/kern_fork.c:1080
#13 <signal handler called>
(kgdb)
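
For anyone reading the traces above: thread 101007 is asleep in _sleep()
under pps_ioctl() while owning a non-sleepable lock, and ppcintr() then
blocks on a mutex in __mtx_lock_sleep() and panics in propagate_priority().
The reference-count idea from the quoted message might look roughly like
the following. This is only a userland analogy using pthreads, not kernel
code and not the contents of D31763; struct fake_dev, dev_hold(),
dev_rele(), fake_intr() and fake_ioctl_wait() are hypothetical names made
up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a device; not the real ppbus/pps layout. */
struct fake_dev {
    pthread_mutex_t lock;   /* plays the role of the non-sleepable lock */
    atomic_int      refs;   /* keeps the device alive while a caller sleeps */
    int             events; /* state updated by the "interrupt" path */
};

/* Take a reference instead of holding the lock for the whole call. */
static void dev_hold(struct fake_dev *d)
{
    atomic_fetch_add(&d->refs, 1);
}

/* Drop a reference; the count must have been positive. */
static void dev_rele(struct fake_dev *d)
{
    int old = atomic_fetch_sub(&d->refs, 1);
    assert(old > 0);
    (void)old;
}

/* "Interrupt" path: a short critical section that never sleeps. */
static bool fake_intr(struct fake_dev *d)
{
    if (pthread_mutex_trylock(&d->lock) != 0)
        return false;       /* lock was held by someone else */
    d->events++;
    pthread_mutex_unlock(&d->lock);
    return true;
}

/*
 * "ioctl" path: hold a reference, sleep with NO mutex held, and only
 * take the lock briefly afterwards to read the shared state.
 */
static int fake_ioctl_wait(struct fake_dev *d, long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    int seen;

    dev_hold(d);
    nanosleep(&ts, NULL);   /* the sleep happens outside the lock */
    pthread_mutex_lock(&d->lock);
    seen = d->events;
    pthread_mutex_unlock(&d->lock);
    dev_rele(d);
    return seen;
}
```

The point of the sketch is only the ordering: the reference is what keeps
the device around during the wait, so the interrupt path never finds its
lock owned by a sleeping thread. Whether that is safe for ppbus depends on
what the lock was actually protecting, which the quoted message leaves
open.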