Lock Order Reversal on 7.0-STABLE with pf and ipfw / dummynet
ian j hart
ianjhart at ntlworld.com
Sun Mar 16 15:37:06 PDT 2008
On Sunday 16 March 2008 21:16:16 Alex Popa wrote:
> This is a mixed reply to both the previous mails, bear with me please.
>
> On Sat, Mar 15, 2008 at 10:16:54PM +0100, Max Laier wrote:
> > On Saturday 15 March 2008, Robert Watson wrote:
> > > On Fri, 14 Mar 2008, Alex Popa wrote:
> > > > [snip]
> > > > The LOR messages from dmesg of 7.0-STABLE are as follows:
> > > >
> > > > lock order reversal:
> > > >  1st 0xffffffffb19e0680 pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6729
> > > >  2nd 0xffffff00042ea0f0 radix node head (radix node head) @ /usr/src/sys/net/route.c:147
> >
> > I haven't seen this one before, can you obtain the trace for this,
> > please? You might need KDB & DDB for that - not sure.
>
> I'll do my best (see below for my questions about getting a trace).
>
> > > > lock order reversal:
> > > >  1st 0xffffffff80938508 PFil hook read/write mutex (PFil hook read/write mutex) @ /usr/src/sys/net/pfil.c:73
> > > >  2nd 0xffffffff80938c48 tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:400
> >
> > This one is most certainly harmless and can be ignored. It is caused by
> > user/group rules, but a LOR with the read instance of a rwlock will not
> > lead to a deadlock.
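For anyone following along who hasn't met a LOR before, here is a toy sketch of the idea in Python. It is only an illustration of order tracking, not how WITNESS is actually implemented, and "code path A" is a hypothetical path assumed for the example (the thread does not yet have a trace showing where the original order was established).

```python
class LockOrderChecker:
    """Toy lock-order tracker; a loose illustration of the idea behind WITNESS."""

    def __init__(self):
        self.seen = set()      # (first, second) pairs observed so far
        self.held = []         # locks currently held, in acquisition order
        self.reversals = []    # (first, second) pairs that contradict self.seen

    def acquire(self, name):
        for earlier in self.held:
            if (name, earlier) in self.seen:
                # Earlier we saw `name` taken before `earlier`; now it is the opposite.
                self.reversals.append((earlier, name))
            self.seen.add((earlier, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


checker = LockOrderChecker()

# Hypothetical code path A: radix node head first, then pf task mtx.
checker.acquire("radix node head")
checker.acquire("pf task mtx")
checker.release("pf task mtx")
checker.release("radix node head")

# Code path B: the opposite order, as in the first report above.
checker.acquire("pf task mtx")
checker.acquire("radix node head")

print(checker.reversals)
```

A reversal is only a warning that two paths disagree about ordering. Whether it can actually deadlock depends on the lock types and on whether both paths can run concurrently.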
>
> I'm not using uid/gid/jail rules as far as I remember. I'll add another
> reply with pf.conf and the script I use to generate and reload the ipfw
> rules (but I'll anonymize them).
>
> > > Dear Alex,
> > >
> > > Thanks for this report, and sorry about the problem. It could well be
> > > that the lock order warning from WITNESS is related to the hang, and
> > > might reflect a recursion-related bug in the pf policy routing code.
> > > I'm not sure to what extent you can tolerate further downtime, but it
> > > would be useful to gather some more information about the hang itself
> > > to try and confirm the involvement of lock order. In particular, if
> > > it's feasible, it would be very helpful if you could boot back to the
> > > 7-STABLE kernel (keeping the 6.2-STABLE userspace should be fine, I
> >
> > you'll need at least a new pfctl, because the ioctl interface to /dev/pf
> > has changed.
>
> Switching between 6.2-RELEASE-p7 (not STABLE, because as I said 6.3
> exhibited the lockups too) and 7-STABLE isn't that much of a problem.
> The upgrade path was "buy a new hard drive, set up everything and then
> adapt the old config files"... actually we bought 2 hard drives, and I
> set them up one with amd64 and another with i386. I think /etc and
> /usr/local/etc are perfectly identical on these 2 (I adapted the configs
> from 6.2 to 7.0, but I just copied them from amd64 to i386).
>
> So, actions needed to switch: Backup the database on 6.2 (with IP/MAC
> mappings and a bit more), put in the 7.0 hard drive, boot off 7.0,
> restore DB, let it run. Total downtime should be around 7 minutes tops.
>
> > > think), and when the hang occurs, use the console debugger (ideally
> > > hooked up to serial or firewire) to run the following debugging
> > > commands:
> > >
> > > show pcpu
> > > show allpcpu
> > > trace
> > > alltrace
> > > show alllocks
> > > show witness
> > > show lockedvnods
> > > show uma
> > > show malloc
>
> This is where things get a bit tricky, and I need advice.
>
> As I said, my observation is that the keyboard seems to stop working
> when the lockup occurs, that is, pressing Num Lock won't toggle the
> state of the LED. Thus I have some doubts that trying the good-old
> Control-Alt-ESC would have the desired effect (dropping me into the
> debugger). However, I'm not that familiar with the FreeBSD
> architecture, and wouldn't be surprised if the LED toggling would be in
> another thread and the machine will actually respond to the keyboard
> interrupt and drop me into ddb. Also, judging by the lack of NumLock
> activity (it works fine when the system's up), would serial console or
> firewire be functional during the lockup?
Keyboard LEDs are broken for me on 6.3 amd64 (kbdmux).
I'd double check they work before you rely on this as a diagnostic tool.
>
> Also, a bit of explanation:
>
> Why I'm asking the above: The current motherboard has a serial port
> (and it works, we've used it), but not a firewire port. The other
> motherboard we tried has firewire, but no serial. As a console
> workstation, I can get a few with serials, but not so easy with
> firewire. The null modem cable might be a problem too, depending on
> length.
>
> Also, since the lockup isn't easily reproducible, I'll probably need to
> spend some hours on location and if I'm going to do that, I'd like a
> degree of hope that either keyboard, serial console or firewire will
> work. Also, firewire will require me to switch motherboards, but that
> can be done together with the hard drive swapping, during the night.
>
> After a bit of studying NOTES, I was wondering if a combination of
> serial console (or just plain console) with "options WITNESS_KDB" would
> help get a "good enough" trace. The upside of this is that both LORs
> usually occur early (not much later than the login prompt, usually
> earlier) as opposed to after 12...18 hours, and I can either force a
> panic after each and get 2 core dumps, or run the debug commands
> suggested (either as debug LOR1 / continue / debug LOR2, or debug LOR1 /
> reboot / "continue" LOR1 / debug LOR2 - whichever is more appropriate).
>
> For the moment I have both hard drives (7.0-STABLE/amd64 and
> 7.0-RELEASE/i386) and the new motherboard (no serial, but with firewire)
> as a working computer under my desk. I can prepare for the night-time
> switch and debug by compiling kernel and/or world and doing some
> preliminary testing here. If I really need to test null modem console,
> I can put the hdd in my own desktop and test with another machine.
>
> > > A shot-in-the-dark guess is that something about pf's interactions with
> > > the protocol stack is involved here, but unfortunately I suspect we'll
> > > need some more information to track it down.
> > >
> > > Also, could you confirm if you're using any credential-related firewall
> > > rules with either ipfw or pf? These would be uid/gid/jail matching
> > > rules.
>
> As I said above, I don't use any uid/gid/jail rules. Mail with pf.conf
> and ipfw config incoming shortly after this one.
>
> > > Robert N M Watson
> > > Computer Laboratory
> > > University of Cambridge
>
> [snip]
>
> > That's quite a complex setup. It would really be interesting to get the
> > trace for the first LOR in order to figure out which code path we are
> > looking at. I have a feeling that it might be the dummynet entry point,
> > but w/o the trace this is only speculation.
>
> Working on it.
>
> > --
> > /"\  Best regards,                      | mlaier at freebsd.org
> > \ /  Max Laier                          | ICQ #67774661
> >  X   http://pf4freebsd.love2party.net/  | mlaier at EFnet
> > / \  ASCII Ribbon Campaign              | Against HTML Mail and News
>
> I'd like suggestions / comments about the kernel config I'm thinking
> about for debugging purposes:
>
> - take my KERNEL (GENERIC + IPFW - IPv6 and SCTP and wireless), and add:
>
> options WITNESS
> options WITNESS_KDB # only if debug-on-first-warn is wanted
> options WITNESS_SKIPSPIN
> options KDB
> #options KDB_TRACE # not needed since I'll trace anyway?
> options DDB
> #options BREAK_TO_DEBUGGER # would that work for my kind of lockup?
> options MSGBUF_SIZE=409600
>
>
> Ideally I would like to hear that the manual tracing and debugging with
> a keyboard console would provide enough info. I'll increase the kernel
> buffer size to 400k as above, so I don't lose info when I continue and
> dmesg > log.txt.
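If you do end up with a big log.txt, something like the following might save some eyeballing. It is a rough sketch that assumes the LOR entries look like the two quoted earlier in this thread; the names `ENTRY` and `parse_lor` are made up for the example, and mail quote prefixes ("> ") would need stripping first.

```python
import re

# Matches one "1st"/"2nd" entry of a LOR report in the format quoted in this thread.
ENTRY = re.compile(
    r"(1st|2nd) (0x[0-9a-f]+) (.+?) \((.+?)\) @ (\S+):(\d+)"
)


def parse_lor(text):
    """Return a list of (position, address, lock name, file, line) tuples."""
    flat = " ".join(text.split())   # undo line wrapping from mail or terminal
    return [
        (m.group(1), m.group(2), m.group(3), m.group(5), int(m.group(6)))
        for m in ENTRY.finditer(flat)
    ]


sample = """lock order reversal:
1st 0xffffffffb19e0680 pf task mtx (pf task mtx) @
/usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6729 2nd
0xffffff00042ea0f0 radix node head (radix node head) @
/usr/src/sys/net/route.c:147
"""

entries = parse_lor(sample)
for entry in entries:
    print(entry)
```

To run it over the saved log, pass `open("log.txt").read()` to `parse_lor` instead of the sample string.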
>
> Just as easily, I can try forcing a panic at the LORs and keeping the
> kernel dumps (with optional debugging in ddb like above). The advantage
> is that this might answer supplementary questions after the deed is
> done.
>
> Both the above options should be possible this week.
>
> The serial console part may or may not happen this week, and I'm quite
> positive it will take another week before I find the time to spend 16+
> hours on location, waiting for a lockup (which might happen at a busy
> time and therefore I'll have very little time to do all the debugging).
>
> Tips / suggestions are most welcome!
>
> Thanks for the help!
> Alex
--
ian j hart
More information about the freebsd-stable mailing list