5.1-R-p2 crashes on SMP with AMI RAID and Intel 1000/Pro
Hartmann, O.
ohartman at klima.physik.uni-mainz.de
Wed Aug 13 01:59:22 PDT 2003
Dear Sirs.
It seems to me a never-ending story. We run a box with a TYAN Thunder
2500 dual SMP mainboard, 2GB of ECC Tyan-certified memory, an AMI Enterprise
1600 RAID adapter and an additional Intel 1000/Pro server-type (64-bit)
GBit LAN NIC. With FreeBSD 4.8 this was stable, but achieving this
state was really hard! The story is similar to what happened when
we moved this machine to FreeBSD 5.1-RELEASE-p2.
It seems to depend heavily on which PCI slot each card is installed
in, so I will report this here as well.
Phenomenon:
After the machine has been running for a while, the SMP kernel reboots
spontaneously. This happens under heavy I/O, when compiling, or in the
morning when our department gets up and our staff connect to the Samba
server.
Depending on which devices are switched on or off in the BIOS, the kernel
freezes at the stage when the amr0 RAID is recognized. I can avoid this
by enabling the built-in NIC (fxp0). I can provoke the freeze by putting
the em0 NIC into another slot, for instance the one remaining 64-bit/66MHz
slot (which should be a separate bus).
This 'game' is identical to the one I had with FreeBSD 4.X - 4.8. I
found out that putting an additional NIC into PCI slot No. 2 (counted
from the AGP slot) fixed things, but using both NICs together
(either the additional fxp0 or the new em0) leaves the system completely
unstable.
In FreeBSD 5.1-RELEASE-p2 and especially in FreeBSD 5.1-CURRENT this
'gambling' seems to reach its climax. My kernel is built with
SCHED_4BSD because a kernel with SCHED_ULE and ADAPTIVE_MUTEXES crashes
immediately in the same way as described (running for a while, then
dumping core or freezing at the stage after the amr0 RAID shows up in
the kernel boot messages; see the dmesg output below).
I'm not a hardware expert, but all this weird stuff looks to me like
an IRQ routing problem. I fiddled around with many hand-assigned IRQ configurations,
but nothing helped. Either the Intel 1000/Pro or the AMI RAID is causing
problems in the TYAN Thunder 2500 SMP environment.
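One way to check the IRQ suspicion is to tabulate which device reports which IRQ in the boot messages. The following is only a minimal Python sketch, not something that was run on this machine: the names `irq_table` and `shared_irqs` are made up for illustration, and it assumes probe lines of the form `name: <description> ... irq N ...` as they appear in the dmesg output included in this message.

```python
import re
from collections import defaultdict

# Matches device probe lines such as:
#   amr0: <LSILogic MegaRAID> mem 0xf0000000-0xf3ffffff irq 22 at device 0.0 on pci4
# Lines without a "<description>" part (e.g. "sio0: configured irq 4 ...") are skipped.
IRQ_LINE = re.compile(r"^(?P<dev>[a-z]+\d+): <.*> .*\birq (?P<irq>\d+)\b")


def irq_table(dmesg_text):
    """Map each IRQ number to the list of devices that report it."""
    table = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = IRQ_LINE.match(line.strip())
        if match:
            table[int(match.group("irq"))].append(match.group("dev"))
    return dict(table)


def shared_irqs(table):
    """Return only the IRQs that are reported by more than one device."""
    return {irq: devs for irq, devs in table.items() if len(devs) > 1}


# Three lines copied from the dmesg output in this message.
SAMPLE = """\
sym0: <896> port 0xf800-0xf8ff mem 0xfeafe000-0xfeafffff,0xfeafac00-0xfeafafff irq 2 at device 1.0 on pci0
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.5.31> port 0xfcc0-0xfcff mem 0xfeac0000-0xfeadffff irq 17 at device 4.0 on pci0
amr0: <LSILogic MegaRAID> mem 0xf0000000-0xf3ffffff irq 22 at device 0.0 on pci4
"""

if __name__ == "__main__":
    table = irq_table(SAMPLE)
    for irq in sorted(table):
        print(f"irq {irq:2d}: {', '.join(table[irq])}")
    print("shared:", shared_irqs(table))
```

A check like this can only show plain IRQ sharing between devices. It cannot tell whether an entry in the MP table or the $PIR table is wrong, so an empty result does not rule out a routing problem.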
We also have an SMP machine with similar hardware, based on an ASUS CV4X-D,
an AMI Elite 1600 RAID controller and the same Intel em0 1GBit NIC. Its OS is
FreeBSD 4.8 and this system never had any problem!
I feel a little helpless at the moment, because I think I have tried every trick
and something seems to be wrong with the combination of TYAN Thunder 2500 and FreeBSD
5.X SMP. It is also very curious that a kernel without SMP/APIC_IO freezes after
booting at the same place (amr0 RAID recognition).
Is there any help out there?
I attach the kernel config file and the dmesg output. Please note: I disabled both
serial ports, the parallel port, sound and USB to free additional IRQs. But I have to
enable the built-in NIC to get a bootable, but unstable, FreeBSD 5.1-R box.
====================================
DMESG output
====================================
Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.1-RELEASE-p2 #14: Wed Aug 13 09:47:00 CEST 2003
root at atmos.physik.uni-mainz.de:/usr/obj/usr/src/sys/ATMOS
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0458000.
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 868644793 Hz
CPU: Intel Pentium III (868.64-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x683 Stepping = 3
Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory = 2147483648 (2048 MB)
avail memory = 2085625856 (1989 MB)
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
io0 (APIC): apic id: 2, version: 0x000f0011, at 0xfec00000
io1 (APIC): apic id: 3, version: 0x000f0011, at 0xfec01000
netsmb_dev: loaded
Pentium Pro MTRR support enabled
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcibios: BIOS version 2.10
Using $PIR table, 12 entries at 0xc00fdf00
pcib0: <Host to PCI bridge> at pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
IOAPIC #1 intpin 13 -> irq 2
IOAPIC #1 intpin 12 -> irq 16
IOAPIC #1 intpin 2 -> irq 17
IOAPIC #1 intpin 7 -> irq 18
pcib1: <PCI-PCI bridge> at device 0.1 on pci0
pci1: <PCI bus> on pcib1
IOAPIC #1 intpin 1 -> irq 19
pci1: <display, VGA> at device 0.0 (no driver attached)
sym0: <896> port 0xf800-0xf8ff mem 0xfeafe000-0xfeafffff,0xfeafac00-0xfeafafff irq 2 at device 1.0 on pci0
sym0: Symbios NVRAM, ID 7, Fast-40, SE, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: handling phase mismatch from SCRIPTS.
sym1: <896> port 0xf400-0xf4ff mem 0xfeafc000-0xfeafdfff,0xfeafa800-0xfeafabff irq 16 at device 1.1 on pci0
sym1: Symbios NVRAM, ID 7, Fast-40, LVD, parity checking
sym1: open drain IRQ line driver, using on-chip SRAM
sym1: using LOAD/STORE-based firmware.
sym1: handling phase mismatch from SCRIPTS.
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.5.31> port 0xfcc0-0xfcff mem 0xfeac0000-0xfeadffff irq 17 at device 4.0 on pci0
em0: Speed:1000 Mbps Duplex:Full
fxp0: <Intel 82557/8/9 EtherExpress Pro/100(B) Ethernet> port 0xfc40-0xfc7f mem 0xfe900000-0xfe9fffff,0xfeaf9000-0xfeaf9fff irq 18 at device 7.0 on pci0
fxp0: Ethernet address 00:e0:81:00:f0:d7
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isab0: <PCI-ISA bridge> port 0x500-0x50f at device 15.0 on pci0
isa0: <ISA bus> on isab0
pci0: <mass storage, ATA> at device 15.1 (no driver attached)
pcib2: <ServerWorks host to PCI bridge> at pcibus 2 on motherboard
pci2: <PCI bus> on pcib2
pcib3: <PCI-PCI bridge> at device 2.0 on pci2
pci3: <PCI bus> on pcib3
IOAPIC #1 intpin 11 -> irq 20
IOAPIC #1 intpin 8 -> irq 21
pcib4: <PCI-PCI bridge> at device 0.0 on pci3
pci4: <PCI bus> on pcib4
IOAPIC #1 intpin 10 -> irq 22
amr0: <LSILogic MegaRAID> mem 0xf0000000-0xf3ffffff irq 22 at device 0.0 on pci4
amr0: <LSILogic MegaRAID Enterprise 1600> Firmware G170, BIOS F316, 64MB RAM
pci3: <mass storage, SCSI> at device 1.0 (no driver attached)
pci3: <mass storage, SCSI> at device 2.0 (no driver attached)
orm0: <Option ROMs> at iomem 0xca000-0xcdfff,0xc0000-0xc9fff on isa0
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <8 virtual consoles, flags=0x300>
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 8250 or not responding
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
ppc0: parallel port not found.
unknown: <PNP0303> can't assign resources (port)
psmcpnp0: irq resource info is missing; assuming irq 12
unknown: <PNP0700> can't assign resources (port)
ppc1: parallel port not found.
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpin 2
APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
Timecounters tick every 1.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, logging unlimited
DUMMYNET initialized (011031)
Waiting 5 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
amrd0: <LSILogic MegaRAID logical drive> on amr0
amrd0: 245014MB (501788672 sectors) RAID 5 (optimal)
===> freezing here!
sa0 at sym1 bus 0 target 5 lun 0
sa0: <HP C5713A H910> Removable Sequential Access SCSI-2 device
sa0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
ch0 at sym1 bus 0 target 5 lun 1
ch0: <HP C5713A H910> Removable Changer SCSI-2 device
ch0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
ch0: 6 slots, 1 drive, 0 pickers, 0 portals
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/amrd0s1a
cd0 at sym0 bus 0 target 3 lun 0
cd0: <TEAC CD-ROM CD-532S 1.0A> Removable CD-ROM SCSI-2 device
cd0: 20.000MB/s transfers (20.000MHz, offset 16)
cd0: Attempt to query device size failed: NOT READY, Medium not present
========================
KERNEL config file
========================
machine i386
cpu I686_CPU
ident ATMOS
options SMP # Symmetric MultiProcessor Kernel
options APIC_IO # Symmetric (APIC) I/O
maxusers 0
hints "ATMOS.hints" #Default places to look for devices.
#options COMPAT_FREEBSD4
options SCHED_4BSD #4BSD scheduler
#options SCHED_ULE
#options ADAPTIVE_MUTEXES
#options PQ_CACHESIZE=256
options CPU_ENABLE_SSE
options CLK_USE_TSC_CALIBRATION
#options HZ=1000
#makeoptions CONF_CFLAGS=-fno-builtin
#options MAXDSIZ=(1024UL*1024*1024)
#options MAXSSIZ=(128UL*1024*1024)
#options DFLDSIZ=(1024UL*1024*1024)
options GEOM_AES
options GEOM_APPLE
options GEOM_BDE
options GEOM_BSD
options GEOM_GPT
options GEOM_MBR
options GEOM_PC98
options GEOM_SUNLABEL
options GEOM_VOL
options ROOTDEVNAME=\"ufs:amrd0s1a\"
options INET #InterNETworking
#options INET6 #IPv6 communications protocols
options FFS #Berkeley Fast Filesystem
options SOFTUPDATES #Enable FFS soft updates support
options UFS_ACL #Support for access control lists
options UFS_DIRHASH #Improve performance on big directories
options NFSCLIENT #Network Filesystem Client
options NFSSERVER #Network Filesystem Server
options MSDOSFS #MSDOS Filesystem
options CD9660 #ISO 9660 Filesystem
options PROCFS #Process filesystem (requires PSEUDOFS)
options PSEUDOFS #Pseudo-filesystem framework
options COMPAT_43 #Compatible with BSD 4.3 [KEEP THIS!]
options SCSI_DELAY=5000 #Delay (in ms) before probing SCSI
options SYSVSHM #SYSV-style shared memory
options SYSVMSG #SYSV-style message queues
options SYSVSEM #SYSV-style semaphores
options NETSMB
options NETSMBCRYPTO
options LIBMCHAIN
options LIBICONV
#options WATCHDOG
options NETGRAPH
#options NETGRAPH_ASYNC
#options NETGRAPH_BPF
#options NETGRAPH_BRIDGE
#options NETGRAPH_CISCO
#options NETGRAPH_ECHO
#options NETGRAPH_ETHER
#options NETGRAPH_FRAME_RELAY
#options NETGRAPH_GIF
#options NETGRAPH_GIF_DEMUX
#options NETGRAPH_HOLE
#options NETGRAPH_IFACE
#options NETGRAPH_IP_INPUT
#options NETGRAPH_KSOCKET
#options NETGRAPH_L2TP
#options NETGRAPH_LMI
#options NETGRAPH_MPPC_ENCRYPTION
#options NETGRAPH_ONE2MANY
#options NETGRAPH_PPP
#options NETGRAPH_PPPOE
#options NETGRAPH_PPTPGRE
#options NETGRAPH_RFC1490
#options NETGRAPH_SOCKET
#options NETGRAPH_SPLIT
#options NETGRAPH_TEE
#options NETGRAPH_TTY
#options NETGRAPH_UI
#options NETGRAPH_VJC
options MROUTING
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_FORWARD
#options IPFIREWALL_VERBOSE_LIMIT=100
#options IPFIREWALL_DEFAULT_TO_ACCEPT
#options IPV6FIREWALL
#options IPV6FIREWALL_VERBOSE
#options IPV6FIREWALL_VERBOSE_LIMIT=100
#options IPV6FIREWALL_DEFAULT_TO_ACCEPT
options IPDIVERT
#options IPFILTER
#options IPFILTER_LOG
#options IPFILTER_DEFAULT_BLOCK
options IPSTEALTH
options RANDOM_IP_ID
options ACCEPT_FILTER_DATA
#options ACCEPT_FILTER_HTTP
options TCP_DROP_SYNFIN
options DUMMYNET
#options BRIDGE
options QUOTA
options _KPOSIX_PRIORITY_SCHEDULING
options P1003_1B_SEMAPHORES
#options MAC
#options MAC_BIBA
#options MAC_BSDEXTENDED
#options MAC_DEBUG
#options MAC_IFOFF
#options MAC_LOMAC
#options MAC_MLS
#options MAC_NONE
#options MAC_PARTITION
#options MAC_SEEOTHERUIDS
#options MAC_TEST
options KBD_INSTALL_CDEV # install a CDEV entry in /dev
device isa
#options AUTO_EOI_1
device pci
device agp
# Floppy drives
device fdc
# SCSI Controllers
device sym # NCR/Symbios Logic (newer chipsets + those of `ncr')
#device ahc
# SCSI peripherals
device scbus # SCSI bus (required)
device ch # SCSI media changers
device da # Direct Access (disks)
device sa # Sequential Access (tape etc)
device cd # CD
device pass # Passthrough device (direct SCSI access)
device ses # SCSI Environmental Services (and SAF-TE)
# RAID controllers
device amr # AMI MegaRAID
#options CHANGER_MIN_BUSY_SECONDS=2
#options CHANGER_MAX_BUSY_SECONDS=10
#options SA_IO_TIMEOUT=4
#options SA_SPACE_TIMEOUT=60
#options SA_REWIND_TIMEOUT=(2*60)
#options SA_ERASE_TIMEOUT=(4*60)
#options SA_1FM_AT_EOD
#options SCSI_PT_DEFAULT_TIMEOUT=60
options SES_ENABLE_PASSTHROUGH
# atkbdc0 controls both the keyboard and the PS/2 mouse
device atkbdc # AT keyboard controller
device atkbd # AT keyboard
options ATKBD_DFLT_KEYMAP
makeoptions ATKBD_DFLT_KEYMAP=us.iso
device psm # PS/2 mouse
device vga # VGA video card driver
device splash # Splash screen and screen saver support
# syscons is the default console driver, resembling an SCO console
device sc
options MAXCONS=8
#options SC_ALT_MOUSE_IMAGE
options SC_DFLT_FONT
makeoptions SC_DFLT_FONT=cp850
options SC_DISABLE_DDBKEY
options SC_DISABLE_REBOOT
options SC_HISTORY_SIZE=512
#options SC_MOUSE_CHAR=0x3
options SC_PIXEL_MODE
options SC_NORM_ATTR=(FG_GREEN|BG_BLACK)
options SC_NORM_REV_ATTR=(FG_YELLOW|BG_GREEN)
options SC_KERNEL_CONS_ATTR=(FG_RED|BG_BLACK)
options SC_KERNEL_CONS_REV_ATTR=(FG_BLACK|BG_RED)
#options SC_CUT_SPACES2TABS
#options SC_CUT_SEPCHARS=\"x09\"
#options SC_TWOBUTTON_MOUSE
#options SC_NO_CUTPASTE
#options SC_NO_FONT_LOADING
#options SC_NO_HISTORY
#options SC_NO_SYSMOUSE
#options SC_NO_SUSPEND_VTYSWITCH
device npx
#device pmtimer
#device sio # 8250, 16[45]50 based serial ports
# Parallel port
#device ppc
#device ppbus # Parallel port bus (required)
#device lpt # Printer
#device plip # TCP/IP over parallel
#device ppi # Parallel port interface device
#device vpo # Requires scbus and da
device miibus # MII bus support
device em
#device fxp # Intel EtherExpress PRO/100B (82557, 82558)
device random # Entropy device
device loop # Network loopback
device ether # Ethernet support
#device tun # Packet tunnel.
device pty # Pseudo-ttys (telnet etc)
#device gif # IPv6 and IPv4 tunneling
#device faith # IPv6-to-IPv4 relaying (translation)
device bpf # Berkeley packet filter
------------------
Thanks a lot for your help,
Oliver
--
Best regards
O. Hartmann
ohartman at mail.physik.uni-mainz.de
------------------------------------------------------------------
Systemadministration des Institutes fuer Physik der Atmosphaere (IPA)
------------------------------------------------------------------
Johannes Gutenberg Universitaet Mainz
Becherweg 21
55099 Mainz
Tel: +496131/3924662 (machine room)
Tel: +496131/3924144 (office)
FAX: +496131/3923532