amd64/125873: Repeated kernel panics, trap 12 page fault while in kernel mode (always with smbd).

Sean Cody sean at franticfilms.com
Tue Jul 22 15:30:03 UTC 2008


>Number:         125873
>Category:       amd64
>Synopsis:       Repeated kernel panics, trap 12 page fault while in kernel mode (always with smbd).
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-amd64
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jul 22 15:30:01 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator:     Sean Cody
>Release:        7.0-RELEASE
>Organization:
Frantic Films VFX Services Inc.
>Environment:
FreeBSD deadline-la.franticfilms.com 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 10:35:36 UTC 2008     root at driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
The machine panics almost daily under heavy Samba usage.  We recently converted it to a FreeBSD 7 box whose sole purpose is to serve a product that communicates through a disk-based queue shared with clients over CIFS.  The machine is heavily loaded with these requests, and shortly after we put it into production it began crashing daily (sometimes more than once a day), each time with the very same characteristics.

We've swapped the drives into another machine with the same results.  The panics have dropped a number of cores, and each one's backtrace shows a corrupted stack.  We have not tried booting without ACPI support, though I have ruled out memory and temperature issues by swapping hardware around and keeping an eye on the environment.

Here is a simple backtrace from one of the panic cores.

deadline-la# cat info.9
Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 170438656B (162 MB)
  Blocksize: 512
  Dumptime: Fri Jul 18 17:25:32 2008
  Hostname: deadline-la.franticfilms.com
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.0-RELEASE #0: Sun Feb 24 10:35:36 UTC 2008
    root at driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
  Panic String: page fault
  Dump Parity: 113022818
  Bounds: 9
  Dump Status: good


deadline-la# kgdb /boot/kernel/kernel /var/crash/vmcore.9 
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x0
fault code		= supervisor read data, page not present
instruction pointer	= 0x8:0xffffffff80468dff
stack pointer	        = 0x10:0xffffffffa2c339c0
frame pointer	        = 0x10:0xffffff0013e4f8d0
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= resume, IOPL = 0
current process		= 34088 (smbd)
trap number		= 12
panic: page fault
cpuid = 0
Uptime: 8h13m17s
Physical memory: 1011 MB
Dumping 162 MB: 147 131 115 99 83 67 51 35 19 3

#0  doadump () at pcpu.h:194
194	pcpu.h: No such file or directory.
	in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:194
#1  0x0000000000000004 in ?? ()
#2  0xffffffff80477699 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff80477a9d in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
#4  0xffffffff8072ec94 in trap_fatal (frame=0xffffff00152b9000, eva=18446742974531694592) at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff8072f90f in trap (frame=0xffffffffa2c33910) at /usr/src/sys/amd64/amd64/trap.c:251
#6  0xffffffff8071560e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#7  0xffffffff80468dff in lf_advlock (ap=Variable "ap" is not available.
) at /usr/src/sys/kern/kern_lockf.c:294
#8  0xffffffff8044ec5b in kern_fcntl (td=0xffffff00152b9000, fd=Variable "fd" is not available.
) at vnode_if.h:1036
#9  0xffffffff8044f01f in fcntl (td=0xffffff00152b9000, uap=0xffffffffa2c33be0) at /usr/src/sys/kern/kern_descrip.c:336
#10 0xffffffff8072f2e7 in syscall (frame=0xffffffffa2c33c70) at /usr/src/sys/amd64/amd64/trap.c:852
#11 0xffffffff8071581b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#12 0x0000000801af707c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) 
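
kgdb prints the eva argument in frame #4 in decimal, which makes it hard to compare with the hexadecimal addresses in the trap message. The following is a small sketch for converting it; the helper names to_hex64 and same_page are hypothetical, and the value may simply be stale, since kgdb can show unreliable arguments in optimized frames and the trap message itself reports a fault virtual address of 0x0.

```python
def to_hex64(value):
    """Format an integer as a zero-padded 64-bit hexadecimal address."""
    return "0x%016x" % (value & 0xFFFFFFFFFFFFFFFF)


def same_page(a, b, page_size=4096):
    """Return True if two addresses fall within the same page (4 KB by default)."""
    return a // page_size == b // page_size


eva = 18446742974531694592          # eva= argument printed in frame #4 (trap_fatal)
frame_pointer = 0xffffff0013e4f8d0  # "frame pointer" line of the trap message

print(to_hex64(eva))                   # 0xffffff0013e4f000
print(same_page(eva, frame_pointer))   # True
```

The converted value lies in the same 4 KB page as the reported frame pointer. That may be a coincidence of how the arguments were recovered, so it is a detail to check rather than a conclusion.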

dmesg of the machine:
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-RELEASE #0: Sun Feb 24 10:35:36 UTC 2008
    root at driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2793.20-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0xf41  Stepping = 1
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNXT-ID,CX16,xTPR>
  AMD Features=0x20000800<SYSCALL,LM>
usable memory = 1060724736 (1011 MB)
avail memory  = 1022083072 (974 MB)
ACPI APIC Table: <DELL   PE BKC  >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6
ioapic0: Changing APIC ID to 7
ioapic1: Changing APIC ID to 8
ioapic2: Changing APIC ID to 9
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 32-55 on motherboard
ioapic2 <Version 2.0> irqs 64-87 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Feb 24 2008 10:34:18)
acpi0: <DELL PE BKC> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
cpu0: <ACPI CPU> on acpi0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
cpu1: <ACPI CPU> on acpi0
p4tcc1: <CPU Frequency Thermal Control> on cpu1
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci1
pci3: <ACPI PCI bus> on pcib3
mpt0: <LSILogic 1030 Ultra4 Adapter> port 0xec00-0xecff mem 0xdfdf0000-0xdfdfffff,0xdfde0000-0xdfdeffff irq 34 at device 5.0 on pci3
mpt0: [ITHREAD]
mpt0: MPI Version=1.2.12.0
pcib4: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 5.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 0.0 on pci5
pci6: <ACPI PCI bus> on pcib6
em0: <Intel(R) PRO/1000 Network Connection Version - 6.7.3> port 0xdcc0-0xdcff mem 0xdfae0000-0xdfafffff irq 64 at device 7.0 on pci6
em0: Ethernet address: 00:11:43:d7:00:ac
em0: [FILTER]
pcib7: <ACPI PCI-PCI bridge> at device 0.2 on pci5
pci7: <ACPI PCI bus> on pcib7
em1: <Intel(R) PRO/1000 Network Connection Version - 6.7.3> port 0xccc0-0xccff mem 0xdf8e0000-0xdf8fffff irq 65 at device 8.0 on pci7
em1: Ethernet address: 00:11:43:d7:00:ad
em1: [FILTER]
pcib8: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci8: <ACPI PCI bus> on pcib8
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xace0-0xacff irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xacc0-0xacdf irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xaca0-0xacbf irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: <Intel 82801EB/R (ICH5) USB 2.0 controller> mem 0xdff00000-0xdff003ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: <Intel 82801EB/R (ICH5) USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb3
uhub3: 6 ports with 6 removable, self powered
uhub4: <vendor 0x413c product 0xa001, class 9/0, rev 2.00/0.00, addr 2> on uhub3
uhub4: multiple transaction translators
uhub4: 2 ports with 2 removable, self powered
pcib9: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci9: <ACPI PCI bus> on pcib9
vgapci0: <VGA-compatible display> port 0xbc00-0xbcff mem 0xd0000000-0xd7ffffff,0xdf5f0000-0xdf5fffff irq 18 at device 13.0 on pci9
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse Explorer, device ID 4
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcb000-0xcbfff,0xcc000-0xcffff,0xd0000-0xd0fff,0xec000-0xeffff on isa0
ppc0: cannot reserve I/O port range
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
hptrr: no controller detected.
Waiting 5 seconds for SCSI devices to settle
acd0: CDROM <TEAC CD-ROM CD-224E/K.9A> at ata0-master UDMA33
ses0 at mpt0 bus 0 target 6 lun 0
ses0: <PE/PV 1x2 SCSI BP 1.0> Fixed Processor SCSI-2 device 
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
SMP: AP CPU #1 Launched!
da0 at mpt0 bus 0 target 0 lun 0
da0: <FUJITSU MAT3073NC 5704> Fixed Direct Access SCSI-3 device 
da0: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit)
da0: Command Queueing Enabled
da0: 70007MB (143374650 512 byte sectors: 255H 63S/T 8924C)
Trying to mount root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
em0: link state changed to UP
rtfree: 0xffffff00017b14b0 has 1 refs
rtfree: 0xffffff000c3520f0 has 1 refs

I have 10 of these cores, all showing very similar backtraces, and I'm not sure what to try next (save for disabling ACPI, which may not be the problem).
>How-To-Repeat:
Just let the machine run and serve up a heavy load of SMB traffic.
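
The backtrace enters the kernel through fcntl() and faults in lf_advlock(), so it may be possible to exercise the same path without Samba by hammering POSIX byte-range locks from several processes at once. The following is a minimal sketch under that assumption; it has not been shown to reproduce the panic. The function names, region layout, worker count, and iteration count are all hypothetical choices, not taken from smbd.

```python
import fcntl
import os
import tempfile


def hammer_locks(path, iterations, region_size=64):
    """Repeatedly take and release exclusive byte-range locks on `path`."""
    locked = 0
    with open(path, "r+b") as f:
        for i in range(iterations):
            offset = (i % 16) * region_size
            # lockf() is a wrapper around the fcntl() record-locking calls.
            fcntl.lockf(f, fcntl.LOCK_EX, region_size, offset, os.SEEK_SET)
            locked += 1
            fcntl.lockf(f, fcntl.LOCK_UN, region_size, offset, os.SEEK_SET)
    return locked


def run_stress(path, workers=4, iterations=1000):
    """Fork `workers` processes contending for the same regions of one file.

    Returns the number of workers that exited cleanly.
    """
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            # Child: do the locking work, then exit without parent cleanup.
            status = 1
            try:
                hammer_locks(path, iterations)
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)
    ok = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            ok += 1
    return ok


if __name__ == "__main__":
    # Scratch file for illustration only; a file on the exported
    # filesystem would be closer to the production setup.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 1024)
    try:
        print(run_stress(tmp.name), "workers finished cleanly")
    finally:
        os.unlink(tmp.name)
```

Separate processes are used rather than threads because POSIX record locks are owned per process, so threads within one process would not contend with each other.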
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

