panic: bufwrite: buffer is not busy???
John Baldwin
jhb at freebsd.org
Mon Jan 31 20:43:47 UTC 2011
On Monday, January 31, 2011 1:31:54 am Eugene Grosbein wrote:
> On 15.01.2011 01:37, John Baldwin wrote:
> > On Friday, January 14, 2011 1:44:19 pm Eugene Grosbein wrote:
> >> On 14.01.2011 18:46, Mike Tancsa wrote:
> >>
> >>>> I'm using mpd 5.5 on three PPPoE routers, each servicing about 300 PPPoE
> >>>> concurrent sessions. Routers are based on Intel SR1630GP hardware platforms and
> >>>> runs FreeBSD 7.3-RELEASE.
> >>>>
> >>>> I'm experiencing stability issues related to Netgraph. None of above routers can
> >>>> survive more than 20-30 days of uptime under typical load. There are different
> >>>> flavors of kernel panics, but all are somehow related to netgraph. Typical
> >>>> backtraces follow
> >>>
> >>> I also have stability issues on RELENG_8.
> >>>
> >>> http://www.freebsd.org/cgi/query-pr.cgi?pr=153497
> >>
> >> And for one of my servers (8.2-PRERELEASE/amd64 with 4GB RAM) I just cannot obtain crashdump,
> >> it cannot finish to write it. For example, it happened an hour ago:
> >>
> >> Fatal trap 12: page fault while in kernel mode
> >> cpuid = 2; apic id = 04
> >> fault virtual address = 0x200000040
> >> fault code = supervisor read data, page not present
> >> instruction pointer = 0x20:0xffffffff803cc979
> >
> > Assuming your kernel is built with debug symbols (which is the default), one
> > thing you can do to aid in debugging is this:
> >
> > gdb /boot/kernel/kernel
> > (gdb) l *0xffffffff803cc979
> >
> > Where the 0xfff<blah> bit is the part of the 'instruction pointer' value
> > above after the colon (:) and then send the output of that in your e-mail to
> > the list. This allows us to see the source line at which the fault occurred.
> >
>
> Yesterday I've got another kernel panic of this kind with RELENG_8 updated 20 January
> and it still could not finish writing of crashdump:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 02
> fault virtual address = 0x200000030
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff803c1315
> stack pointer = 0x28:0xffffff8000130780
> frame pointer = 0x28:0xffffff80001307a0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 12 (irq259: em1:rx 0)
> trap number = 12
> panic: page fault
> cpuid = 1
> Uptime: 19h41m8s
> Dumping 4087 MB (3 chunks)
> chunk 0: 1MB (150 pages) ... ok
> chunk 1: 3575MB (915088 pages) 3559 3543panic: bufwrite: buffer is not busy???
> cpuid = 1
> Uptime: 19h41m9s
> Automatic reboot in 15 seconds - press a key on the console to abort
>
> This time I have all debug symbols handy:
>
>
> # gdb kernel
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
> (gdb) l *0xffffffff803c1315
> 0xffffffff803c1315 is in ng_address_hook (/home/src/sys/netgraph/ng_base.c:3504).
> 3499 * Quick sanity check..
> 3500 * Since a hook holds a reference on it's node, once we know
> 3501 * that the peer is still connected (even if invalid,) we know
> 3502 * that the peer node is present, though maybe invalid.
> 3503 */
> 3504 if ((hook == NULL) ||
> 3505 NG_HOOK_NOT_VALID(hook) ||
> 3506 NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
> 3507 NG_NODE_NOT_VALID(peernode = NG_PEER_NODE(hook))) {
> 3508 NG_FREE_ITEM(item);
Hmmm. I think you might have a hardware problem. Notice the fault address:
it is 0x200000030. Can you do 'x/i <instruction pointer>'? I suspect it is
doing a memory access that has a constant offset of 0x30, in which case
the original pointer was 0x200000000, meaning it would be NULL except that it
has a single-bit error (bit 33 is set). That would likely be caused by a
hardware issue such as failing RAM, etc.
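
The arithmetic behind that guess can be checked with a small sketch. It assumes the faulting access is a small constant offset from a pointer that should have been NULL; the function name and the 0x1000 offset limit are hypothetical choices for illustration, not anything taken from the kernel. It lists every way a fault address can be written as "one bit set, plus a small offset":

```python
def single_bit_flip_candidates(fault_addr, max_offset=0x1000, width=64):
    """Return (bit, offset) pairs with fault_addr == (1 << bit) + offset.

    max_offset is an assumed upper bound on a plausible structure-member
    offset; it is not derived from the kernel sources.
    """
    candidates = []
    for bit in range(width):
        offset = fault_addr - (1 << bit)
        if 0 <= offset < max_offset:
            candidates.append((bit, offset))
    return candidates


if __name__ == "__main__":
    # The two fault virtual addresses quoted in this thread.
    for addr in (0x200000030, 0x200000040):
        for bit, offset in single_bit_flip_candidates(addr):
            print(f"{addr:#x} = NULL with bit {bit} set, plus offset {offset:#x}")
```

For both fault addresses quoted in this thread the only candidate is bit 33, with offsets 0x30 and 0x40. That is consistent with a flipped bit in a NULL pointer, but it does not prove it; the 'x/i' output is still needed to confirm the offset the instruction actually uses.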
--
John Baldwin
More information about the freebsd-net mailing list