zfs, raidz, spare and jbod
Jeremy Chadwick
koitsu at FreeBSD.org
Fri Jul 25 09:45:16 UTC 2008
On Fri, Jul 25, 2008 at 09:46:34AM +0200, Claus Guttesen wrote:
> Hi.
>
> I installed FreeBSD 7 a few days ago and upgraded to the latest stable
> release using the GENERIC kernel. I also added these entries to
> /boot/loader.conf:
>
> vm.kmem_size="1536M"
> vm.kmem_size_max="1536M"
> vfs.zfs.prefetch_disable=1
>
> Initially prefetch was enabled and I would experience hangs, but after
> disabling prefetch, copying large amounts of data went along without
> problems. To see if FreeBSD 8 (current) had better (copy) performance
> I upgraded to current as of yesterday. After upgrading and rebooting
> the server responded fine.
With regards to RELENG_7, I completely agree with disabling prefetch.
The overall performance (of the system and disk I/O) is significantly
"smoother" when prefetch is disabled, e.g. fewer hard lock-ups and
stalls.
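For anyone following along, the tunable is the one Claus set in
/boot/loader.conf above. As far as I know you can confirm it took
effect after boot with sysctl (I believe it is a loader tunable, so
setting it at runtime may not work):

# sysctl vfs.zfs.prefetch_disable
vfs.zfs.prefetch_disable: 1

A value of 1 means prefetch is disabled.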
I have not tried CURRENT. I'm told the ZFS code in CURRENT is the same
as RELENG_7, so I'm not sure what you were trying to test by switching
from RELENG_7 to CURRENT.
> The server is a Supermicro with a quad-core Harpertown E5405, two
> internal SATA drives and 8 GB of RAM. I installed an Areca ARC-1680
> SAS controller and configured it in JBOD mode. I attached an external
> SAS cabinet with 16 SAS disks at 1 TB (931 binary GB) each.
>
> I created a raidz2 pool with 10 disks and added one spare. I copied
> approx. 1 TB of small files (each approx. 1 MB) and during the copy I
> simulated a disk-crash by pulling one of the disks out of the cabinet.
> ZFS did not activate the spare, and the copying stalled; I rebooted
> after 5-10 minutes. When I performed a 'zpool status' the command
> would not complete. I did not see any messages in /var/log/messages.
> State in top showed 'ufs-'.
>
> A similar test on Solaris Express Developer Edition b79 activated the
> spare after ZFS tried to write to the missing disk enough times and
> then marked it as faulted. Has anyone else tried to simulate a
> disk-crash in raidz(2) and succeeded?
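(As an aside: when a spare does not kick in automatically, my
understanding is that it can be attached by hand with 'zpool replace',
assuming the pool still responds to commands. For example, with a
hypothetical pool 'tank' where da5 failed and da10 is the hot spare:

# zpool replace tank da5 da10

In your case the pool apparently hung outright, so this probably would
not have helped, but it is worth knowing about.)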
Is there any way to confirm the behaviour is specific to raidz2, or
would it affect raidz1 as well? I have a raidz1 pool at home which I
could pull a disk from (only 3 disks, so pulling one will probably
result in bad things), though it's on an onboard ICHx controller.
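Before physically yanking anything, I may first try simulating the
failure administratively with 'zpool offline', using my own pool and
device names (shown further down):

# zpool offline storage ad8
# zpool status storage
# zpool online storage ad8

That obviously does not exercise the same code path as a disk vanishing
mid-I/O, but it should show whether the pool degrades and recovers
sanely when a vdev goes away cleanly.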
I have no experience with Areca controllers or their driver, but I do
have experience with standard onboard Intel ICHx chips. WRT those
chips, "pulling disks" without administratively downing the ATA channel
will cause a kernel panic. If the Areca controller/driver handles
things better, great.
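For reference, on my box the way I would administratively down the
channel before pulling a disk is with atacontrol; ad8 sits on ata4
according to dmesg, so something like:

# atacontrol detach ata4
  (physically swap the disk here)
# atacontrol attach ata4

That is from memory, so verify the channel numbering with 'atacontrol
list' before trying it.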
What I'm trying to say is that I can offer to help test raidz1, but not
on Areca controllers. The hardware is similar to yours: Supermicro
PDSMi+, Intel E6600 (C2D), 4GB RAM, running RELENG_7 amd64. The system
contains 4 disks; ad6, ad8, and ad10 are in a ZFS pool, and ad4 is the
OS disk:
ad4: 190782MB <WDC WD2000JD-00HBB0 08.02D08> at ata2-master SATA150
ad6: 476940MB <WDC WD5000AAKS-00YGA0 12.01C02> at ata3-master SATA300
ad8: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata4-master SATA300
ad10: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata5-master SATA300
        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            ad8     ONLINE       0     0     0
            ad10    ONLINE       0     0     0
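I do not have a hot spare configured in this pool, but my understanding
is that one could be included at creation time or added afterwards along
these lines; da0 through da4 here are hypothetical devices, not my
actual disks:

# zpool create tank raidz2 da0 da1 da2 da3 spare da4
or, for an existing pool:
# zpool add tank spare da4

Since there is no spare here, I cannot speak to whether spare activation
itself works on FreeBSD; I can only test the degraded/resilver side.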
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |