Terrible disk performance with LSI / FreeBSD 9.2-RC1
J David
j.david.lists at gmail.com
Wed Aug 7 02:48:20 UTC 2013
We have a machine running 9.2-RC1 that's getting terrible disk I/O
performance. Its performance has always been pretty bad, but it
didn't become clear just how bad until we did a zpool replace on one
of the drives and realized it was going to take three weeks to rebuild
a <1TB drive.
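For a rough sanity check: at the ~800 KB/s the resilver is actually
moving (see the iostat numbers below), even a straight copy of the
whole disk would take

    931 GiB ≈ 976,224,256 KiB
    976,224,256 KiB / 800 KiB/s ≈ 1,220,280 s ≈ 14 days

and resilver throughput varies with load and fragmentation, so three
weeks is sadly believable.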
The hardware specs are:
- 2 x Xeon L5420
- 32 GiB RAM
- LSI Logic SAS 1068E
- 2 x 32 GB SSDs
- 6 x 1 TB Western Digital RE3 7200 RPM SATA drives
The LSI controller has the most recent firmware I'm aware of
(6.36.00.00 / 1.33.00.00 dated 2011.08.24), is in IT mode, and appears
to be working fine:
mpt0 Adapter:
       Board Name: USASLP-L8i
   Board Assembly: USASLP-L8i
        Chip Name: C1068E
    Chip Revision: B3
      RAID Levels: none

mpt0 Configuration: 0 volumes, 8 drives
    drive da0 ( 30G) ONLINE <FTM32GL25H 10> SATA
    drive da1 ( 29G) ONLINE <SSDSA2SH032G1GN 8860> SATA
    drive da2 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da3 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da4 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da5 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da6 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da7 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
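(That output was gathered with mptutil(8); if anyone wants to compare
against a working box, I believe these are the relevant invocations:

    mptutil -u 0 show adapter   # board, chip, and firmware details
    mptutil -u 0 show config    # volumes and attached drives

The -u 0 selects the mpt0 controller, which is the default anyway.)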
The eight drives are configured as a ZIL and L2ARC on the SSDs plus a
six-drive raidz2 on the spinning disks.
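For reference, the layout is roughly what you'd get from something
like the following (the pool name is made up, and which SSD serves as
log vs. cache is from memory):

    zpool create tank raidz2 da2 da3 da4 da5 da6 da7 \
        log da1 cache da0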
We did a zpool replace on the last drive in the set, and the resilver
is proceeding at less than 800 KB/sec:
                       extended device statistics
device     r/s    w/s    kr/s    kw/s  qlen  svc_t  %b
da0        0.0    0.0     0.0     0.1     0    0.9   0
da1        0.0    8.2     0.0    19.9     0    0.1   0
da2      125.6   23.0   768.2    40.5     4   33.0  88
da3      126.6   23.1   769.0    41.3     4   32.3  89
da4      126.0   24.0   768.5    42.7     4   32.1  88
da5      125.9   22.0   768.2    40.1     4   31.6  87
da6      124.0   22.0   766.6    39.9     5   31.4  84
da7        0.0  136.9     0.0   801.3     0    0.6   4
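(That's a snapshot from iostat -x; running something like "iostat -x 5"
while the resilver churns shows the same pattern continuously.)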
The system has plenty of free RAM, is 99.7% idle, has nothing else
going on, and runs like a one-legged dog.
There are no error messages or any sign of a problem anywhere, other
than the really terrible performance. (When not rebuilding, it does
light NFS duty. That performance is similarly bad, but has never
really mattered.)
Similar systems running Solaris put out 10x these numbers while
reporting 30% busy instead of ~90% busy.
Does anyone have suggestions for how to troubleshoot this further? At
this point I'm at a loss as to where to go from here. My goal is to
phase out the Solaris machines, but this is a real roadblock.
Thanks for any advice!