Solved persistent stalls in 'zio->io_cv)'
Trent Nelson
trent at snakebite.org
Sun Nov 25 19:56:19 UTC 2012
[ For the archives... ]
Over the course of about a month I noticed progressively worse
performance on my main FreeBSD ZFS box. It's a simple box; 1U,
root-on-zfs, 2TB WD EARS zpool mirror.
Almost everything started stalling towards the end. A simple ls
would take 15-20 seconds to return, with ^T always indicating the
wait state was 'zio->io_cv)'.
smartctl/smartmond had detected two offline uncorrectable sectors on
one of the drives, but subsequent full scans revealed no problems.
I vaguely recall reading an old mailing list post that mentioned
zio->io_cv was essentially the zfs equivalent to biord, i.e. "the
system is waiting for the disk". Coupled with knowledge that a
zpool mirror will equally split disk load between physical devices
regardless of underlying device performance (i.e. it doesn't grok
"disk 1 takes 600% longer to return my reads() than disk 0"), so
I decided to run `diskinfo -tv` on each disk, not being able to
think of anything else to run that would be useful, diagnostically.
The output was pretty telling (pasted in full at the end of this
e-mail):
ada1:
Seek Times:
Full stroke: 250 iter in 38.386551 sec = 153.546 msec
Half stroke: 250 iter in 32.733690 sec = 130.935 msec
...
Transfer Rates:
outside: 102400 kbytes in 10.554800 sec = 9702 kbytes/sec
middle: 102400 kbytes in 2.299292 sec = 44535 kbytes/sec
ada0:
Seek Times:
Full stroke: 250 iter in 13.585292 sec = 54.341 msec
Half stroke: 250 iter in 6.156059 sec = 24.624 msec
...
Transfer Rates:
outside: 102400 kbytes in 2.234584 sec = 45825 kbytes/sec
middle: 102400 kbytes in 4.405934 sec = 23241 kbytes/sec
So, I concluded (or rather, hoped) that the abysmal performance
was due to ada1 responding exponentially slower to requests than
its zpool mirror counterpart ada0.
I didn't have a replacement available, but performance was so bad
I decided to break the zpool mirror and try simply limping along on
a single disk until a replacement arrived (drive is still under WD
warranty, so sent it in for an RMA replacement).
Performance has been fine ever since (been about a week now on the
single disk).
Just posting for the benefit of the archives. There are lots of
hits for 'zio->io_cv)', but nothing related to the particulars in
this case.
Trent.
Full diskinfo output just prior to splitting the mirror:
[root at hydrogen/ttypts/3(~)#] diskinfo -tv ada1
ada1
512 # sectorsize
2000398934016 # mediasize in bytes (1.8T)
3907029168 # mediasize in sectors
4096 # stripesize
0 # stripeoffset
3876021 # Cylinders according to firmware.
16 # Heads according to firmware.
63 # Sectors according to firmware.
WD-WMAZA5463028 # Disk ident.
Seek times:
Full stroke: 250 iter in 38.386551 sec = 153.546 msec
Half stroke: 250 iter in 32.733690 sec = 130.935 msec
Quarter stroke: 500 iter in 30.632886 sec = 61.266 msec
Short forward: 400 iter in 53.067122 sec = 132.668 msec
Short backward: 400 iter in 6.971919 sec = 17.430 msec
Seq outer: 2048 iter in 0.332570 sec = 0.162 msec
Seq inner: 2048 iter in 0.921388 sec = 0.450 msec
Transfer rates:
outside: 102400 kbytes in 10.554800 sec = 9702 kbytes/sec
middle: 102400 kbytes in 2.299292 sec = 44535 kbytes/sec
inside: 102400 kbytes in 7.301463 sec = 14025 kbytes/sec
[root at hydrogen/ttypts/3(~)#] diskinfo -tv ada0
ada0
512 # sectorsize
2000398934016 # mediasize in bytes (1.8T)
3907029168 # mediasize in sectors
4096 # stripesize
0 # stripeoffset
3876021 # Cylinders according to firmware.
16 # Heads according to firmware.
63 # Sectors according to firmware.
WD-WMAZA5467587 # Disk ident.
Seek times:
Full stroke: 250 iter in 13.585292 sec = 54.341 msec
Half stroke: 250 iter in 6.156059 sec = 24.624 msec
Quarter stroke: 500 iter in 8.901942 sec = 17.804 msec
Short forward: 400 iter in 5.421932 sec = 13.555 msec
Short backward: 400 iter in 3.237963 sec = 8.095 msec
Seq outer: 2048 iter in 0.636089 sec = 0.311 msec
Seq inner: 2048 iter in 0.231709 sec = 0.113 msec
Transfer rates:
outside: 102400 kbytes in 2.234584 sec = 45825 kbytes/sec
middle: 102400 kbytes in 4.405934 sec = 23241 kbytes/sec
inside: 102400 kbytes in 3.050623 sec = 33567 kbytes/sec
Interestingly enough, I just re-ran diskinfo again then on the
remaining disk, and results are significantly better than those
above:
[root at hydrogen/ttypts/0(~)#] diskinfo -tv ada0
ada0
512 # sectorsize
2000398934016 # mediasize in bytes (1.8T)
3907029168 # mediasize in sectors
4096 # stripesize
0 # stripeoffset
3876021 # Cylinders according to firmware.
16 # Heads according to firmware.
63 # Sectors according to firmware.
WD-WMAZA5467587 # Disk ident.
Seek times:
Full stroke: 250 iter in 7.037776 sec = 28.151 msec
Half stroke: 250 iter in 5.805128 sec = 23.221 msec
Quarter stroke: 500 iter in 7.452755 sec = 14.906 msec
Short forward: 400 iter in 3.966540 sec = 9.916 msec
Short backward: 400 iter in 1.255183 sec = 3.138 msec
Seq outer: 2048 iter in 0.245077 sec = 0.120 msec
Seq inner: 2048 iter in 0.212632 sec = 0.104 msec
Transfer rates:
outside: 102400 kbytes in 1.286970 sec = 79567 kbytes/sec
middle: 102400 kbytes in 1.062096 sec = 96413 kbytes/sec
inside: 102400 kbytes in 1.996499 sec = 51290 kbytes/sec
--
http://www.snakebite.net | @trentnelson
More information about the freebsd-fs
mailing list