Assertion lock == sq->sq_lock failed at
/usr/src/sys/kern/subr_sleepqueue.c:327
Kostik Belousov
kostikbel at gmail.com
Mon Jun 8 18:17:40 UTC 2009
On Mon, Jun 08, 2009 at 08:25:50PM +0300, Mikolaj Golub wrote:
> After recent upgrade (on June 3) of my 7-stable host (WITNESS is enabled, the
> previous build was Apr 26), I have been experienced panics when starting some
> our home made application:
>
> Architecture: i386
> Architecture Version: 2
> Dump Length: 73797632B (70 MB)
> Blocksize: 512
> Dumptime: Mon Jun 8 15:29:14 2009
> Hostname: zhuzha.ua1
> Magic: FreeBSD Kernel Dump
> Version String: FreeBSD 7.2-STABLE #18: Wed Jun 3 14:28:49 EEST 2009
> root at zhuzha.ua1:/usr/obj/usr/src/sys/DEBUG
> Panic String: Assertion lock == sq->sq_lock failed at /usr/src/sys/kern/subr_sleepqueue.c:327
> Dump Parity: 1277108796
> Bounds: 18
>
> Narrowing down the problem I have written a simple test program (in attaches)
> that models in simplified way the behaviour or our application and reproduces
> the crash. There are two threads. One of the threads is doing vfork/exec to
> start child process while the other is monitoring the status of the child
> calling wait4().
>
> At the moment of the crash the parent vfork thread is sleeping on
> wchan=0xc4d17ae0 (td->td_proc) with lock=0xc4d178b8 waiting for the child to
> exec. The parent wait4 thread with lock=0xc4d17b70 and the same wchan calls
> sleepq_lookup(wchan) to check if the wait channel already have a sleep queue
> associated with it, finds the queue of the vfork thread, tries to insert the
> current thread into this sleep queue and fails on assertion lock==sq->sq_lock.
>
> wait4 treead:
>
> (kgdb) thr 128
> [Switching to thread 128 (Thread 100164)]#0 doadump () at pcpu.h:196
> 196 in pcpu.h
> (kgdb) bt
> #0 doadump () at pcpu.h:196
> #1 0xc0496509 in db_fncall (dummy1=-1061924641, dummy2=0, dummy3=-1, dummy4=0xe69a089c "")
> at /usr/src/sys/ddb/db_command.c:516
> #2 0xc0496abf in db_command (last_cmdp=0xc0c9ab54, cmd_table=0x0, dopager=0)
> at /usr/src/sys/ddb/db_command.c:413
> #3 0xc0496b34 in db_command_script (command=0xc0c9baa4 "call doadump")
> at /usr/src/sys/ddb/db_command.c:484
> #4 0xc049a3a0 in db_script_exec (scriptname=0xe69a09a8 "kdb.enter.panic", warnifnotfound=Variable "warnifnotfound" is not available.
> )
> at /usr/src/sys/ddb/db_script.c:302
> #5 0xc049a47d in db_script_kdbenter (eventname=0xc0b8fa85 "panic") at /usr/src/sys/ddb/db_script.c:324
> #6 0xc0498388 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:227
> #7 0xc0810d36 in kdb_trap (type=3, code=0, tf=0xe69a0ae0) at /usr/src/sys/kern/subr_kdb.c:524
> #8 0xc0add6ab in trap (frame=0xe69a0ae0) at /usr/src/sys/i386/i386/trap.c:688
> #9 0xc0ac159b in calltrap () at /usr/src/sys/i386/i386/exception.s:159
> #10 0xc0810eba in kdb_enter_why (why=0xc0b8fa85 "panic", msg=0xc0b8fa85 "panic") at cpufunc.h:60
> #11 0xc07e4216 in panic (fmt=0xc0b5ecd3 "Assertion %s failed at %s:%d")
> at /usr/src/sys/kern/kern_shutdown.c:557
> #12 0xc08194dc in sleepq_add (wchan=0xc4d17ae0, lock=0xc4d17b70, wmesg=0xc0b959c1 "wait", flags=256,
> queue=0) at /usr/src/sys/kern/subr_sleepqueue.c:327
> #13 0xc07ec94b in _sleep (ident=0xc4d17ae0, lock=Variable "lock" is not available.
> ) at /usr/src/sys/kern/kern_synch.c:203
> #14 0xc07bd8e6 in kern_wait (td=0xc41e8480, pid=-1, status=0xe69a0c2c, options=Variable "options" is not available.
> )
> at /usr/src/sys/kern/kern_exit.c:902
> #15 0xc07bd95b in wait4 (td=0xc41e8480, uap=0xe69a0cfc) at /usr/src/sys/kern/kern_exit.c:688
> #16 0xc0adce23 in syscall (frame=0xe69a0d38) at /usr/src/sys/i386/i386/trap.c:1090
> #17 0xc0ac1600 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
> #18 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) fr 12
> #12 0xc08194dc in sleepq_add (wchan=0xc4d17ae0, lock=0xc4d17b70, wmesg=0xc0b959c1 "wait", flags=256,
> queue=0) at /usr/src/sys/kern/subr_sleepqueue.c:327
> 327 MPASS(lock == sq->sq_lock);
> (kgdb) p lock
> $1 = (struct lock_object *) 0xc4d17b70
> (kgdb) p sq->sq_lock
> $2 = (struct lock_object *) 0xc4d178b8
> (kgdb) p *lock
> $3 = {lo_name = 0xc0b8e532 "process lock", lo_type = 0xc0b8e532 "process lock", lo_flags = 21168128,
> lo_witness_data = {lod_list = {stqe_next = 0xc0ce4840}, lod_witness = 0xc0ce4840}}
> (kgdb) p *sq->sq_lock
> $4 = {lo_name = 0xc0b8e532 "process lock", lo_type = 0xc0b8e532 "process lock", lo_flags = 21168128,
> lo_witness_data = {lod_list = {stqe_next = 0xc0ce4840}, lod_witness = 0xc0ce4840}}
>
> vfork thread of the parent process:
>
> (kgdb) thr 127
> [Switching to thread 127 (Thread 100162)]#0 sched_switch (td=0xc41e86c0, newtd=Variable "newtd" is not available.
> )
> at /usr/src/sys/kern/sched_ule.c:1944
> 1944 cpuid = PCPU_GET(cpuid);
> (kgdb) bt
> #0 sched_switch (td=0xc41e86c0, newtd=Variable "newtd" is not available.
> ) at /usr/src/sys/kern/sched_ule.c:1944
> #1 0xc07ec4c3 in mi_switch (flags=Variable "flags" is not available.
> ) at /usr/src/sys/kern/kern_synch.c:444
> #2 0xc0818bd2 in sleepq_switch (wchan=0xc4d17ae0) at /usr/src/sys/kern/subr_sleepqueue.c:497
> #3 0xc0819800 in sleepq_wait (wchan=0xc4d17ae0) at /usr/src/sys/kern/subr_sleepqueue.c:580
> #4 0xc07eca69 in _sleep (ident=0xc4d17ae0, lock=0xc4d178b8, priority=92, wmesg=0xc0b8bacb "ppwait",
> timo=0) at /usr/src/sys/kern/kern_synch.c:230
> #5 0xc07c0631 in fork1 (td=0xc41e86c0, flags=-2147483596, pages=0, procp=0xe699ac78)
> at /usr/src/sys/kern/kern_fork.c:747
> #6 0xc07c07c9 in vfork (td=0xc41e86c0, uap=0xe699acfc) at /usr/src/sys/kern/kern_fork.c:124
> #7 0xc0adce23 in syscall (frame=0xe699ad38) at /usr/src/sys/i386/i386/trap.c:1090
> #8 0xc0ac1600 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
> #9 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
> And here is the child process:
>
> (kgdb) thr 129
> [Switching to thread 129 (Thread 100163)]#0 sched_switch (td=0xc4d1d900, newtd=Variable "newtd" is not available.
> )
> at /usr/src/sys/kern/sched_ule.c:1944
> 1944 cpuid = PCPU_GET(cpuid);
> (kgdb) bt
> #0 sched_switch (td=0xc4d1d900, newtd=Variable "newtd" is not available.
> ) at /usr/src/sys/kern/sched_ule.c:1944
> #1 0xc07ec4c3 in mi_switch (flags=Variable "flags" is not available.
> ) at /usr/src/sys/kern/kern_synch.c:444
> #2 0xc0818bd2 in sleepq_switch (wchan=0xc0ccdac4) at /usr/src/sys/kern/subr_sleepqueue.c:497
> #3 0xc0818e6f in sleepq_catch_signals (wchan=0xc0ccdac4) at /usr/src/sys/kern/subr_sleepqueue.c:417
> #4 0xc0819647 in sleepq_timedwait_sig (wchan=0xc0ccdac4) at /usr/src/sys/kern/subr_sleepqueue.c:631
> #5 0xc07eca31 in _sleep (ident=0xc0ccdac4, lock=0x0, priority=348, wmesg=0xc0b90bcd "nanslp", timo=2)
> at /usr/src/sys/kern/kern_synch.c:224
> #6 0xc07f3f51 in kern_nanosleep (td=0xc4d1d900, rqt=0xe699dc64, rmt=0xe699dc6c)
> at /usr/src/sys/kern/kern_time.c:373
> #7 0xc07f594f in nanosleep (td=0xc4d1d900, uap=0xe699dcfc) at /usr/src/sys/kern/kern_time.c:415
> #8 0xc0adce23 in syscall (frame=0xe699dd38) at /usr/src/sys/i386/i386/trap.c:1090
> #9 0xc0ac1600 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
> #10 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) p &td->td_proc.p_mtx
> $5 = (struct mtx *) 0xc4d178b8
>
> Actually, I am not sure that the problem is observed only on the recent
> STABLE. It might have not been triggered by our application before the
> upgrade. Currently I don't have the system with some old STABLE to check this
> running the test program.
This is fixed in HEAD by r185647. I did not merged it to 7, because I
do not believe that somebody runs stable with INVARIANTS on :).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20090608/942fab5c/attachment.pgp
More information about the freebsd-stable
mailing list