7-STABLE panic: mtx_lock() of spin mutex %s @ %s:%d

Max Laier max at love2party.net
Mon Nov 26 16:30:10 PST 2007


On Tuesday 27 November 2007, Florian Smeets wrote:
> Hi
>
> i was able to reproduce a hang on a 7-STABLE (csuped just after Scotts
> critical section MFC) firewall which runs mpd4 from ports and uses pf
> for packet filtering. Sometimes when i restart mpd4 the box just hangs.
> I have a up-script which calls /sbin/pfctl -f /etc/pf.conf.

It's an ALTQ problem, really.  I don't know if there is a down-script that 
runs before the interface is destroyed, but if there is - adding 
pfctl -Fq there should work around your problem.  I did have some patches 
to have ALTQ working with dynamic interfaces, but I must have dropped 
them.  I'll see what I can dig up in the next few days.

In short:

altq on foo0 ...

ifconfig foo0 destroy -> BOOM!

mpd has the unfortunate habit of destroying it's interface on session 
restart - there might be ways to change that, though.

> After adding witness and invariants instead of the hang i get this LOR:
>
> Nov 26 18:06:39 fw-a kernel: lock order reversal:
> Nov 26 18:06:39 fw-a kernel: 1st 0xc0762a8c pf task mtx (pf task mtx) @
> /usr/src/sys/contrib/pf/net/pf_ioctl.c:1304
> Nov 26 18:06:39 fw-a kernel: 2nd 0xc07c470c ifnet (ifnet) @
> /usr/src/sys/net/if.c:1494
> Nov 26 18:06:39 fw-a kernel: KDB: stack backtrace:
> Nov 26 18:06:39 fw-a kernel:
> db_trace_self_wrapper(c0707544,ccdb7a3c,c054c0ae,c07098ff,c07c470c,...)
> at db_trace_self_wrapper+0x26
> Nov 26 18:06:39 fw-a kernel:
> kdb_backtrace(c07098ff,c07c470c,c0710de4,c0710de4,c0710c3d,...) at
> kdb_backtrace+0x29
> Nov 26 18:06:39 fw-a kernel:
> witness_checkorder(c07c470c,9,c0710c3d,5d6,0,...) at
> witness_checkorder+0x6de
> Nov 26 18:06:39 fw-a kernel:
> _mtx_lock_flags(c07c470c,0,c0710c3d,5d6,c205e520,...) at
> _mtx_lock_flags+0xbc
> Nov 26 18:06:39 fw-a kernel:
> ifunit(c205e520,0,c06f5a82,518,ccdb7ab8,...) at ifunit+0x2f
> Nov 26 18:06:39 fw-a kernel:
> pfioctl(c1d82600,c0104414,c205e520,3,c224da50,...) at pfioctl+0x2341
> Nov 26 18:06:39 fw-a kernel:
> devfs_ioctl_f(c1e55ca8,c0104414,c205e520,c1c97c00,c224da50,...) at
> devfs_ioctl_f+0xc9
> Nov 26 18:06:39 fw-a kernel:
> kern_ioctl(c224da50,3,c0104414,c205e520,1000000,...) at
> kern_ioctl+0x243 Nov 26 18:06:39 fw-a kernel:
> ioctl(c224da50,ccdb7cfc,c,c072bdfe,c0739810,...) at ioctl+0x134
> Nov 26 18:06:39 fw-a kernel: syscall(ccdb7d38) at syscall+0x2b3
> Nov 26 18:06:39 fw-a kernel: Xint0x80_syscall() at
> Xint0x80_syscall+0x20 Nov 26 18:06:39 fw-a kernel: --- syscall (54,
> FreeBSD ELF32, ioctl), eip = 0x281a6d43, esp = 0xbfbfde4c, ebp =
> 0xbfbfde78 ---
>
>
> and shortly after that this page fault:
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address	= 0xdeadc0de
> fault code		= supervisor read, page not present
> instruction pointer	= 0x20:0xc059e4a8
> stack pointer	        = 0x28:0xccd41890
> frame pointer	        = 0x28:0xccd41890
> code segment		= base 0x0, limit 0xfffff, type 0x1b
> 			= DPL 0, pres 1, def32 1, gran 1
> processor eflags	= interrupt enabled, resume, IOPL = 0
> current process		= 1474 (pfctl)
> Physical memory: 245 MB
> Dumping 60 MB: 45 29 13
>
> #0  doadump () at pcpu.h:195
> 195	pcpu.h: No such file or directory.
> 	in pcpu.h
> (kgdb) list *0xc059e4a8
> 0xc059e4a8 is in strlen (/usr/src/sys/libkern/strlen.c:41).
> 36	strlen(str)
> 37		const char *str;
> 38	{
> 39		register const char *s;
> 40
> 41		for (s = str; *s; ++s);
> 42		return(s - str);
> 43	}
> 44
> (kgdb) bt
> #0  doadump () at pcpu.h:195
> #1  0xc046dc59 in db_fncall (dummy1=-858515768, dummy2=0, dummy3=12,
> dummy4=0xccd41634 "") at /usr/src/sys/ddb/db_command.c:486
> #2  0xc046e1c5 in db_command_loop () at
> /usr/src/sys/ddb/db_command.c:401 #3  0xc046f935 in db_trap (type=12,
> code=0) at
> /usr/src/sys/ddb/db_main.c:222
> #4  0xc053bbe4 in kdb_trap (type=12, code=0, tf=0xccd41850) at
> /usr/src/sys/kern/subr_kdb.c:502
> #5  0xc06d02df in trap_fatal (frame=0xccd41850, eva=3735929054) at
> /usr/src/sys/i386/i386/trap.c:863
> #6  0xc06d0513 in trap_pfault (frame=0xccd41850, usermode=0,
> eva=3735929054) at /usr/src/sys/i386/i386/trap.c:785
> #7  0xc06d0eb5 in trap (frame=0xccd41850) at
> /usr/src/sys/i386/i386/trap.c:463
> #8  0xc06b970b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #9  0xc059e4a8 in strlen (str=0xdeadc0de <Address 0xdeadc0de out of
> bounds>) at /usr/src/sys/libkern/strlen.c:41
> #10 0xc053e4e5 in kvprintf (fmt=0xc0703a1c " @ %s:%d", func=0xc053d9a0
> <snprintf_func>, arg=0xccd419d0, radix=10, ap=0xccd41a0c " Mo??\001")
>      at /usr/src/sys/kern/subr_prf.c:750
> #11 0xc053ed2b in vsnprintf (str=0xc0779380 "mtx_lock() of spin mutex
> ", size=256, format=0xc0703a01 "mtx_lock() of spin mutex %s @ %s:%d",
> ap=0xccd41a08 "???? Mo??\001") at /usr/src/sys/kern/subr_prf.c:483 #12
> 0xc05152f3 in panic (fmt=0xc0703a01 "mtx_lock() of spin mutex %s @
> %s:%d") at /usr/src/sys/kern/kern_shutdown.c:530
> #13 0xc0509eba in _mtx_lock_flags (m=0xc1d3ad0c, opts=0,
> file=0xc06f4d20 "/usr/src/sys/contrib/altq/altq/altq_priq.c", line=416)
> at
> /usr/src/sys/kern/kern_mutex.c:180
> #14 0xc0437b83 in priq_class_destroy (cl=0xc22138c0) at
> /usr/src/sys/contrib/altq/altq/altq_priq.c:416
> #15 0xc0437c7d in priq_remove_altq (a=0xc2048a80) at
> /usr/src/sys/contrib/altq/altq/altq_priq.c:252
> #16 0xc0438635 in altq_remove (a=0xc2048a80) at
> /usr/src/sys/contrib/altq/altq/altq_subr.c:650
> #17 0xc04508ee in pf_commit_altq (ticket=5) at
> /usr/src/sys/contrib/pf/net/pf_ioctl.c:867
> #18 0xc0456341 in pfioctl (dev=0xc1d82600, cmd=3222029394,
> addr=0xc22d32e0 "\a", flags=3, td=0xc2014a50) at
> /usr/src/sys/contrib/pf/net/pf_ioctl.c:3170
> #19 0xc04bcfe9 in devfs_ioctl_f (fp=0xc20270d8, com=3222029394,
> data=0xc22d32e0, cred=0xc200bd00, td=0xc2014a50) at
> /usr/src/sys/fs/devfs/devfs_vnops.c:494
> #20 0xc054d563 in kern_ioctl (td=0xc2014a50, fd=3, com=3222029394,
> data=0xc22d32e0 "\a") at file.h:266
> #21 0xc054d6c4 in ioctl (td=0xc2014a50, uap=0xccd41cfc) at
> /usr/src/sys/kern/sys_generic.c:570
> #22 0xc06d07f3 in syscall (frame=0xccd41d38) at
> /usr/src/sys/i386/i386/trap.c:1008
> #23 0xc06b9770 in Xint0x80_syscall () at
> /usr/src/sys/i386/i386/exception.s:196
> #24 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) frame 13
> #13 0xc0509eba in _mtx_lock_flags (m=0xc1d3ad0c, opts=0,
> file=0xc06f4d20 "/usr/src/sys/contrib/altq/altq/altq_priq.c", line=416)
> at
> /usr/src/sys/kern/kern_mutex.c:180
> 180		KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_sleep,
> (kgdb) list
> 175	{
> 176
> 177		MPASS(curthread != NULL);
> 178		KASSERT(m->mtx_lock != MTX_DESTROYED,
> 179		    ("mtx_lock() of destroyed mutex @ %s:%d", file, line));
> 180		KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_sleep,
> 181		    ("mtx_lock() of spin mutex %s @ %s:%d",
> m->lock_object.lo_name, 182		    file, line));
> 183		WITNESS_CHECKORDER(&m->lock_object, opts | LOP_NEWORDER |
> LOP_EXCLUSIVE,
> 184		    file, line);
> (kgdb)
>
>
> As the panic/page fault seems to be connected to the altq/priq rules i
> have added the definition of the altq part
>
> altq on $ext_if priq bandwidth 960Kb qlimit 250 queue { std_out,
> smtp_out, ssh_out, tcp_ack_out }
>
> queue smtp_out     priq(default) qlimit 250
> queue std_out    priority 2 qlimit 250
> queue ssh_out     priority 5 qlimit 250
> queue tcp_ack_out priority 6 qlimit 250
>
> And of course there are some pass rules which add packets to the
> queues.
>
> Is there anything else needed from the core dump?


-- 
/"\  Best regards,                      | mlaier at freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier at EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: This is a digitally signed message part.
Url : http://lists.freebsd.org/pipermail/freebsd-pf/attachments/20071127/68bf15b4/attachment.pgp


More information about the freebsd-pf mailing list