HDD Lockups on 9.1-RC3 (MPS, LSI 2008, and ZFS)
Reed A. Cartwright
cartwright at asu.edu
Tue Nov 27 19:11:22 UTC 2012
I also posted about this on freebsd-stable earlier.
I recently upgraded my server from 9.0 to 9.1-RC3 and have started
experiencing HDD lockups. They don't happen immediately, but they do
appear to be happening during heavy read-write usage. (The only other
change I made was to disable atime on one of the pools.) The system
itself has not crashed, because I can sometimes log in and execute a few
commands (if the right files are cached in memory). The first time
this happened I was able to detect that many processes were stuck in
tx->tx state. I can't figure out what this means.
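For anyone trying to pin down a hang like this, one option is to tally sleeping processes by wait channel. Below is a minimal Python sketch, not something that was run on the affected machine. It assumes text produced by `ps -axo pid,state,wchan,comm` (header line first, one process per line); the function name `count_wait_channels` and the sample rows are made up for illustration.

```python
from collections import Counter


def count_wait_channels(ps_output):
    """Tally processes by wait channel from `ps -axo pid,state,wchan,comm` text."""
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)  # pid, state, wchan, command
        if len(fields) < 4:
            continue  # ignore blank or malformed lines
        wchan = fields[2]
        if wchan != "-":  # "-" means the process is not sleeping on a channel
            counts[wchan] += 1
    return counts


# Made-up sample rows for illustration only, not output from the affected server.
SAMPLE = """\
PID STAT WCHAN COMMAND
  1 Is   wait   init
 10 D    tx->tx rsync
 11 D    tx->tx cp
 12 R    -      ps
"""

if __name__ == "__main__":
    # Print the most common wait channels first.
    for wchan, n in count_wait_channels(SAMPLE).most_common():
        print(f"{n:4d}  {wchan}")
```

On a live system the same function could be fed the real `ps` output, and `procstat -kk <pid>` on one of the stuck processes would then show the kernel stack behind the wait channel.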
The lockups have occurred when I was reading from the storage pool and
writing back to either the storage pool or a ufs scratch drive.
The system reboots fine; no HDD corruption is apparent. I have yet to
find an error message associated with the lockups.
I upgraded my controller cards' firmware to match the new MPS driver
in 9.1 and the problem is still happening. It looks like my cache
drive might have out-of-date firmware, but according to OCZ it requires
Windows or Linux to upgrade.
pciconf and dmesg are attached.
SYSTEM INFO
64-core machine with 512GB memory, 9.1-RC3 kernel.
uname -a:
FreeBSD herschel.biodesign.asu.edu 9.1-RC3 FreeBSD 9.1-RC3 #0 r242324:
Tue Oct 30 00:58:57 UTC 2012
root at farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
camcontrol devlist:
<ATA Hitachi HUA72202 A3EA> at scbus0 target 0 lun 0 (pass0,da0)
<ATA Hitachi HUA72202 A3EA> at scbus0 target 1 lun 0 (pass1,da1)
<ATA Hitachi HUA72202 A3EA> at scbus0 target 2 lun 0 (pass2,da2)
<ATA Hitachi HUA72202 A3EA> at scbus0 target 3 lun 0 (pass3,da3)
<ATA Hitachi HUA72202 A3EA> at scbus0 target 4 lun 0 (pass4,da4)
<ATA Hitachi HUA72202 A3EA> at scbus0 target 5 lun 0 (pass5,da5)
<ATA Hitachi HUA72202 A3EA> at scbus0 target 6 lun 0 (pass6,da6)
<ATA Hitachi HUA72202 A3EA> at scbus0 target 7 lun 0 (pass7,da7)
<ATA D2CSTK251M11-048 2.15> at scbus7 target 0 lun 0 (pass8,da8)
<ATA WDC WD1003FBYX-0 1V02> at scbus7 target 1 lun 0 (pass9,da9)
<ATA WDC WD2503ABYX-0 1S02> at scbus7 target 2 lun 0 (pass10,da10)
<ATA WDC WD2503ABYX-0 1S02> at scbus7 target 3 lun 0 (pass11,da11)
<ATA INTEL SSDSA2CW30 0362> at scbus7 target 4 lun 0 (pass12,da12)
<KVM vmDisk-CD 0.01> at scbus9 target 0 lun 0 (cd0,pass13)
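The listing above shows the eight Hitachi pool disks on scbus0 and the remaining drives, including the cache SSD (da8), on scbus7. A short Python sketch for grouping devices by bus is given below; it assumes the `camcontrol devlist` line layout exactly as it appears in this post, and the names `DEVLIST_RE` and `devices_by_bus` are hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as:
# <ATA Hitachi HUA72202 A3EA> at scbus0 target 0 lun 0 (pass0,da0)
DEVLIST_RE = re.compile(
    r"<(?P<ident>[^>]*)>\s+at\s+scbus(?P<bus>\d+)\s+target\s+(?P<target>\d+)\s+"
    r"lun\s+(?P<lun>\d+)\s+\((?P<names>[^)]*)\)"
)


def devices_by_bus(devlist_text):
    """Map each SCSI bus number to a list of (device name, identity string)."""
    buses = defaultdict(list)
    for line in devlist_text.splitlines():
        m = DEVLIST_RE.search(line)
        if m is None:
            continue  # skip anything that is not a device line
        # Peripheral names appear in either order ("pass0,da0" or "cd0,pass13");
        # keep the first one that is not the passthrough device.
        names = [n for n in m.group("names").split(",") if not n.startswith("pass")]
        device = names[0] if names else m.group("names")
        buses[int(m.group("bus"))].append((device, m.group("ident").strip()))
    return dict(buses)


if __name__ == "__main__":
    # Three lines copied from the listing in this post.
    text = (
        "<ATA Hitachi HUA72202 A3EA> at scbus0 target 0 lun 0 (pass0,da0)\n"
        "<ATA D2CSTK251M11-048 2.15> at scbus7 target 0 lun 0 (pass8,da8)\n"
        "<KVM vmDisk-CD 0.01> at scbus9 target 0 lun 0 (cd0,pass13)\n"
    )
    for bus, devs in sorted(devices_by_bus(text).items()):
        print(f"scbus{bus}: {devs}")
```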
df -kh:
Filesystem            Size    Used   Avail Capacity  Mounted on
zroot                 199G     10G    189G       5%  /
devfs                 1.0k    1.0k      0B     100%  /dev
/dev/label/scratch    275G     15G    237G       6%  /scratch
fdescfs               1.0k    1.0k      0B     100%  /dev/fd
procfs                4.0k    4.0k      0B     100%  /proc
storage/home          8.1T    521G    7.6T       6%  /home
storage/jails         7.6T     63k    7.6T       0%  /jails
storage/storage       8.7T    1.1T    7.6T      13%  /storage
storage/storage/tt    8.1T    478G    7.6T       6%  /storage/tt
devfs                 1.0k    1.0k      0B     100%  /compat/linux/dev
linsysfs              4.0k    4.0k      0B     100%  /compat/linux/sys
linprocfs             4.0k    4.0k      0B     100%  /compat/linux/proc
zpool status:
  pool: storage
 state: ONLINE
  scan: scrub repaired 0 in 9h21m with 0 errors on Sat Nov 17 12:23:44 2012
config:
        NAME          STATE     READ WRITE CKSUM
        storage       ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            da0       ONLINE       0     0     0
            da1       ONLINE       0     0     0
            da2       ONLINE       0     0     0
            da3       ONLINE       0     0     0
            da4       ONLINE       0     0     0
            da5       ONLINE       0     0     0
            da6       ONLINE       0     0     0
            da7       ONLINE       0     0     0
        cache
          da8         ONLINE       0     0     0
errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 0h14m with 0 errors on Sat Nov 17 03:16:09 2012
config:
        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            da11p2    ONLINE       0     0     0
            da10p2    ONLINE       0     0     0
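Both pools report every device ONLINE with zero read, write, and checksum errors. That check can also be done mechanically, which is handy when comparing status output captured before and after a lockup. The Python sketch below is only an illustration: it assumes the five-column NAME STATE READ WRITE CKSUM layout shown in this post, and `unhealthy_vdevs` is a made-up name.

```python
def unhealthy_vdevs(status_text):
    """Return device rows whose state is not ONLINE or whose counters are nonzero."""
    problems = []
    for line in status_text.splitlines():
        fields = line.split()
        # Device rows have exactly five columns: NAME STATE READ WRITE CKSUM.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, state, read, write, cksum = fields
        counters = (read, write, cksum)
        # Counters start with a digit (large values may be abbreviated, e.g. "1.2K").
        # This filters out prose lines such as "errors: No known data errors".
        if not all(c[:1].isdigit() for c in counters):
            continue
        if state != "ONLINE" or any(c != "0" for c in counters):
            problems.append((name, state, read, write, cksum))
    return problems


if __name__ == "__main__":
    # Rows copied from the zroot pool in this post; an empty list means all clean.
    text = (
        "NAME STATE READ WRITE CKSUM\n"
        "zroot ONLINE 0 0 0\n"
        "mirror-0 ONLINE 0 0 0\n"
        "da11p2 ONLINE 0 0 0\n"
        "da10p2 ONLINE 0 0 0\n"
    )
    print(unhealthy_vdevs(text))
```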
cat /boot/loader.conf:
zfs_load="YES"
geom_eli_load="YES"
ahci_load="YES"
vfs.root.mountfrom="zfs:zroot"
debug.acpi.max_tasks="128"
#vboxdrv_load="YES"
kern.maxfiles="65536"
--
Reed A. Cartwright, PhD
Assistant Professor of Genomics, Evolution, and Bioinformatics
School of Life Sciences
Center for Evolutionary Medicine and Informatics
The Biodesign Institute
Arizona State University
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg.log
Type: application/octet-stream
Size: 18137 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-scsi/attachments/20121127/ff747777/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pciconf.log
Type: application/octet-stream
Size: 17091 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-scsi/attachments/20121127/ff747777/attachment-0001.obj>