7.0-RELEASE && panic after ~4 hours

Kris Kennaway kris at FreeBSD.org
Tue Apr 1 02:57:41 PDT 2008


Matthias Apitz wrote:
> On Monday, March 31, 2008 at 03:04:04PM +0200, Matthias Apitz wrote:
> 
>>> You should unmount (or boot to single-user mode) and run a full fsck 
>>> (fsck -fy).
>> Thanks for the hint. I've done what you advised and I'm
>> connected through Wifi again now (until the next panic :-))
>> The fsck did indeed correct something: a block count that should have
>> been zero but was some decimal number of 20 digits, I think (I don't
>> remember the exact number).
>>
>> I'll copy this e-mail into the TT.
> 
> While the laptop worked all night at home (and only with clean shutdowns
> since the last 'fsck -fy' yesterday afternoon), it crashed after around
> 20 minutes in my office this morning; kgdb says:
> 
> # kgdb /boot/kernel/kernel vmcore.3
> [GDB will not be able to debug user-mode threads:
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
> 
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0xffff1a18
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc07fa3ee
> stack pointer           = 0x28:0xe6904aa4
> frame pointer           = 0x28:0xe6904ad0
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 1546 (gkrellm)
> trap number             = 12
> panic: page fault
> cpuid = 0
> Uptime: 16m21s
> Physical memory: 1009 MB
> Dumping 156 MB: (CTRL-C to abort)  141 125 109 93 (CTRL-C to abort)  77 61 45 29 (CTRL-C to abort)  13
> 
> #0  doadump () at pcpu.h:195
> 195     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) bt
> #0  doadump () at pcpu.h:195
> #1  0xc0754457 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2  0xc0754719 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:563
> #3  0xc0a4905c in trap_fatal (frame=0xe6904a64, eva=4294908440)
>     at /usr/src/sys/i386/i386/trap.c:899
> #4  0xc0a492e0 in trap_pfault (frame=0xe6904a64, usermode=0, eva=4294908440)
>     at /usr/src/sys/i386/i386/trap.c:812
> #5  0xc0a49c8c in trap (frame=0xe6904a64) at /usr/src/sys/i386/i386/trap.c:490
> #6  0xc0a2fc0b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #7  0xc07fa3ee in rt_msg2 (type=12, rtinfo=0xe6904b04, cp=0x0, w=0xe6904b34)
>     at /usr/src/sys/net/rtsock.c:784
> #8  0xc07fb1a5 in sysctl_rtsock (oidp=0xc0b84ac0, arg1=0xe6904c1c, arg2=4, req=0xe6904ba4)
>     at /usr/src/sys/net/rtsock.c:1153
> #9  0xc075dc97 in sysctl_root (oidp=Variable "oidp" is not available.
> ) at /usr/src/sys/kern/kern_sysctl.c:1306
> #10 0xc075dde4 in userland_sysctl (td=0xc472fc60, name=0xe6904c14, namelen=6, old=0x0,
>     oldlenp=0xbfbfe478, inkernel=0, new=0x0, newlen=0, retval=0xe6904c10, flags=0)
>     at /usr/src/sys/kern/kern_sysctl.c:1401
> #11 0xc075eb7e in __sysctl (td=0xc472fc60, uap=0xe6904cfc)
>     at /usr/src/sys/kern/kern_sysctl.c:1336
> #12 0xc0a49635 in syscall (frame=0xe6904d38) at /usr/src/sys/i386/i386/trap.c:1035
> #13 0xc0a2fc70 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:196
> #14 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) 
> 
> Any advice on how to track this down? Thanks in advance.
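
A quick cross-check of the numbers in the trap output, as a small Python
sketch: the eva=4294908440 that kgdb prints in frames #3 and #4 should be
the same value as the "fault virtual address" 0xffff1a18 from the message
buffer. The helper name decode_eva is made up for this illustration, and
the 32-bit width is assumed from the i386 kernel in the trace. It only
does arithmetic and says nothing about where the bad address came from.

```python
def decode_eva(eva, bits=32):
    """Return (hex string, signed value) for a faulting address.

    Hypothetical helper for illustration; 'bits' is assumed to be 32
    because the backtrace comes from an i386 kernel.
    """
    mask = (1 << bits) - 1
    unsigned = eva & mask
    # Reinterpret the same bit pattern as a two's-complement integer.
    if unsigned >> (bits - 1):
        signed = unsigned - (1 << bits)
    else:
        signed = unsigned
    return hex(unsigned), signed


print(decode_eva(4294908440))  # ('0xffff1a18', -58856)
```

So both outputs describe the same address, which lies within the last
64 KB of the 32-bit address space.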

OK, now this panic looks more reasonable :)  At a guess, sysctl_rtsock 
is not protecting against a structure changing in mid-operation (e.g. 
missing locking).  It might be a straightforward fix for someone 
familiar with the code.  Try raising it on net@ and filing a new PR 
(since it's not related to your other one).
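
As an illustration of the class of bug guessed at above, here is a toy
Python sketch of a list that is walked while other threads may change it.
It is not the rtsock code, and the names AddressList, add, remove and dump
are invented for the example. The only point it makes is that the reader
takes the same lock as the writers and holds it for the whole walk, so the
list cannot change between sizing the reply and filling it in.

```python
import threading


class AddressList:
    """Toy stand-in for a kernel list that a sysctl-style handler walks.

    Hypothetical illustration only; this is not how rtsock.c is written.
    """

    def __init__(self, addrs=()):
        self._addrs = list(addrs)
        self._lock = threading.Lock()

    def add(self, addr):
        # Writers take the same lock as the reader below.
        with self._lock:
            self._addrs.append(addr)

    def remove(self, addr):
        with self._lock:
            self._addrs.remove(addr)

    def dump(self):
        # Hold the lock for the whole walk, so no entry can be added or
        # removed between sizing the reply and filling it in.
        with self._lock:
            reply = [None] * len(self._addrs)        # size the reply
            for i, addr in enumerate(self._addrs):   # fill it in
                reply[i] = f"addr {addr}"
            return reply


ifaddrs = AddressList(["192.0.2.1"])
ifaddrs.add("192.0.2.2")
print(ifaddrs.dump())  # ['addr 192.0.2.1', 'addr 192.0.2.2']
```

Without the lock in dump(), a concurrent remove() could shrink the list
between the two steps. In Python that surfaces as a wrong or short reply;
in kernel C the analogous mistake can follow a stale pointer and fault.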

Kris


More information about the freebsd-mobile mailing list