Uneven load on drives in ZFS RAIDZ1
Peter Maloney
peter.maloney at brockmann-consult.de
Mon Dec 19 15:42:23 UTC 2011
On 12/19/2011 03:22 PM, Stefan Esser wrote:
> Hi ZFS users,
>
> for quite some time I have observed an uneven distribution of load
> between drives in a 4 * 2TB RAIDZ1 pool. The following is an excerpt of
> a longer log of 10 second averages logged with gstat:
>
> dT: 10.001s w: 10.000s filter: ^a?da?.$
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     0    130    106   4134    4.5     23   1033    5.2    48.8| ada0
>     0    131    111   3784    4.2     19   1007    4.0    47.6| ada1
>     0     90     66   2219    4.5     24   1031    5.1    31.7| ada2
>     1     81     58   2007    4.6     22   1023    2.3    28.1| ada3
>
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     1    132    104   4036    4.2     27   1129    5.3    45.2| ada0
>     0    129    103   3679    4.5     26   1115    6.8    47.6| ada1
>     1     91     61   2133    4.6     30   1129    1.9    29.6| ada2
>     0     81     56   1985    4.8     24   1102    6.0    29.4| ada3
>
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     1    148    108   4084    5.3     39   2511    7.2    55.5| ada0
>     1    141    104   3693    5.1     36   2505   10.4    54.4| ada1
>     1    102     62   2112    5.6     39   2508    5.5    35.4| ada2
>     0     99     60   2064    6.0     39   2483    3.7    36.1| ada3
>
> ...
> So: Can anybody reproduce this distribution of requests?
I don't have a raidz1 machine, and no time to build a special raidz1
pool out of spare disks for you, but on my raidz2 I only ever see
unevenness when a disk is bad, or between different vdevs, and you have
only one vdev.
Check that your disks are identical (are they? You didn't say, so we can
only assume so).
Show us output from:
smartctl -i /dev/ada0
smartctl -i /dev/ada1
smartctl -i /dev/ada2
smartctl -i /dev/ada3
Since your tests show read ms/r to be pretty even, I guess your disks
are not broken. But the ms/w differs slightly, so I suspect that either
the first 2 disks are slower for writing (someone once said that
refurbished disks are like this, even if identical), or the hard disk
controller ports they use are slower. For example, maybe your
motherboard has 6 ports, and you plugged disks 1, 2 and 3 into ports 1,
2 and 3, and disk 4 into port 5. Disks 3 and 4 would then each have
their own channel, while disks 1 and 2 share one.
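To put rough numbers on that comparison, here is a minimal Python sketch that averages the three quoted gstat samples per disk. The figures are copied from the quoted log; the names r_kBps and w_kBps (used to tell the two kBps columns apart) and the helper names parse and means are my own, not anything gstat provides.

```python
from collections import defaultdict

# Column order as printed by gstat in the quoted log. The two kBps columns
# are renamed r_kBps / w_kBps here so they can be told apart (my naming).
COLUMNS = ["L(q)", "ops/s", "r/s", "r_kBps", "ms/r",
           "w/s", "w_kBps", "ms/w", "%busy"]

# The three 10-second samples from the quoted message.
SAMPLES = """
0 130 106 4134 4.5 23 1033 5.2 48.8| ada0
0 131 111 3784 4.2 19 1007 4.0 47.6| ada1
0 90 66 2219 4.5 24 1031 5.1 31.7| ada2
1 81 58 2007 4.6 22 1023 2.3 28.1| ada3

1 132 104 4036 4.2 27 1129 5.3 45.2| ada0
0 129 103 3679 4.5 26 1115 6.8 47.6| ada1
1 91 61 2133 4.6 30 1129 1.9 29.6| ada2
0 81 56 1985 4.8 24 1102 6.0 29.4| ada3

1 148 108 4084 5.3 39 2511 7.2 55.5| ada0
1 141 104 3693 5.1 36 2505 10.4 54.4| ada1
1 102 62 2112 5.6 39 2508 5.5 35.4| ada2
0 99 60 2064 6.0 39 2483 3.7 36.1| ada3
"""


def parse(text):
    """Return {disk: [{column: value}, ...]} for every data row in text."""
    rows = defaultdict(list)
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip blank separator lines
        values, name = line.split("|")
        fields = [float(v) for v in values.split()]
        rows[name.strip()].append(dict(zip(COLUMNS, fields)))
    return rows


def means(rows, column):
    """Average one column over all samples, per disk."""
    return {disk: sum(s[column] for s in samples) / len(samples)
            for disk, samples in rows.items()}


rows = parse(SAMPLES)
for column in ("r/s", "ms/r", "w/s", "ms/w"):
    averages = means(rows, column)
    print(column, {d: round(v, 1) for d, v in sorted(averages.items())})
```

On these three samples the mean ms/w comes out at about 5.9 and 7.1 for ada0 and ada1, against about 4.2 and 4.0 for ada2 and ada3. Three samples is very little data, so treat this as a way to summarise a longer log rather than as a conclusion.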
So if the disks are identical, I would guess your hard disk controller
is to blame. To test this, first back up your data. Then *fix your setup
by using labels*, i.e. use gpt/somelabel0 or gptid/....... rather than
ada0p2. Check the output of "ls /dev/gpt*" to see which labels you
already have. Then try swapping disks around to see if the load follows
the disks or stays with the ports. Make sure to back up...
Swapping disks without labels (or even removing one when it fails,
depending on the controller, etc.) can be bad.
For example:
You have ada1 ada2 ada3 ada4.
Someone spills coffee on ada2; it fries and cannot be detected anymore,
and you reboot.
Now you have ada1 ada2 ada3.
Then things are usually still fine, even though ada3 is now ada2 and
ada4 is now ada3, because ZFS has some superblock stuff on the disks to
keep track of things. But if you also had an ada5 that was not part of
the pool, or was a spare or a log or something other than another disk
in the same vdev as ada1, etc., bad things happen when it becomes ada4.
Unfortunately, I don't know exactly what people do to cause the "bad
things". When this happened to me, it just said my pool was faulted or
degraded or something, and set a disk or two to UNAVAIL or FAULTED. I
don't remember it automatically resilvering them, but from what I have
read about these problems, it seems some disks were resilvered
afterwards.
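The renumbering in the example above can be mimicked with a toy Python model. It assumes device names are simply handed out in detection order, starting at ada1 as in the example; real numbering depends on the controller and kernel configuration. The disk labels (pool-a ... extra) and the function name enumerate_disks are made up for illustration.

```python
def enumerate_disks(present, first_unit=1):
    """Assign adaN names to the detected disks in probe order (toy model)."""
    return {f"ada{first_unit + i}": disk for i, disk in enumerate(present)}


# Hypothetical labels standing in for the physical disks in the example:
# four pool members plus one extra disk that is not in the same vdev.
disks = ["pool-a", "pool-b", "pool-c", "pool-d", "extra"]

before = enumerate_disks(disks)
# The second disk dies and is no longer detected after the reboot.
after = enumerate_disks([d for d in disks if d != "pool-b"])

for name in sorted(before):
    print(name, before[name], "->", after.get(name, "(missing)"))
```

In this model the name ada4 ends up pointing at the extra disk after the reboot, while a label such as gpt/somelabel0 would still refer to the same physical disk.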
The last thing I can think of is to make sure your partitions are
aligned and identical. Show us output from:
gpart show
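Once you have the start sectors from that output, the check itself is simple arithmetic. A minimal Python sketch, assuming the sectors are 512 bytes and that a 4 KiB boundary is what matters for your drives (both are assumptions on my part, as are the function names and the sample start sectors):

```python
def is_aligned(start_sector, sector_size=512, boundary=4096):
    """True if the partition's starting byte offset is a multiple of boundary."""
    return (start_sector * sector_size) % boundary == 0


def same_layout(layouts):
    """True if every disk has the same list of (start, size) partitions."""
    return all(layout == layouts[0] for layout in layouts[1:])


# Hypothetical start sectors, not taken from any real gpart output.
for start in (34, 40, 2048):
    print(start, "aligned" if is_aligned(start) else "NOT aligned")
```

With 512-byte sectors, a start sector is 4 KiB aligned exactly when it is divisible by 8. same_layout takes one list of (start, size) pairs per disk and is meant for the "identical" part of the check.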
> Any idea, why this is happening and whether something should be changed
> in ZFS to better distribute the load (leading to higher file system
> performance)?
>
> Best regards, STefan
> _______________________________________________
> freebsd-current at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"
--
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney at brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------