ZFS v28 on -STABLE not using hot spare
Johan Hendriks
joh.hendriks at gmail.com
Tue Jan 3 15:17:18 UTC 2012
Matt Burke wrote:
> Over the holidays one of the disks on a server has failed, but despite
> configuring a hot spare, ZFS hasn't used it for some reason. Can anyone
> shed some light on what I might have mis-configured to break the hot-spare
> functionality?
>
>
> [root at x ~]# uname -a
> FreeBSD x 8.2-STABLE FreeBSD 8.2-STABLE #4: Mon Dec 5 12:43:58 GMT 2011
> root at x:/usr/obj/usr/src/sys/x amd64
>
>
> [root at x ~]# more /usr/src/sys/amd64/conf/x
> include GENERIC
> ident x
>
> options GEOM_STRIPE
> options ROUTETABLES=4
>
>
> [root at x ~]# zpool status -v
> pool: data
> state: DEGRADED
> status: One or more devices are faulted in response to persistent errors.
> Sufficient replicas exist for the pool to continue functioning in a
> degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the device
> repaired.
> scan: none requested
> config:
>
> NAME STATE READ WRITE CKSUM
> data DEGRADED 0 0 0
> mirror-0 ONLINE 0 0 0
> mfid0 ONLINE 0 0 0
> mfid14 ONLINE 0 0 0
> mirror-1 ONLINE 0 0 0
> mfid1 ONLINE 0 0 0
> mfid15 ONLINE 0 0 0
> mirror-2 DEGRADED 0 0 0
> mfid2 ONLINE 0 0 0
> mfid16 FAULTED 0 931 0 too many errors
> mirror-3 ONLINE 0 0 0
> mfid3 ONLINE 0 0 0
> mfid17 ONLINE 0 0 0
> mirror-4 ONLINE 0 0 0
> mfid4 ONLINE 0 0 0
> mfid18 ONLINE 0 0 0
> mirror-5 ONLINE 0 0 0
> mfid5 ONLINE 0 0 0
> mfid19 ONLINE 0 0 0
> mirror-6 ONLINE 0 0 0
> mfid6 ONLINE 0 0 0
> mfid20 ONLINE 0 0 0
> mirror-7 ONLINE 0 0 0
> mfid7 ONLINE 0 0 0
> mfid21 ONLINE 0 0 0
> mirror-8 ONLINE 0 0 0
> mfid8 ONLINE 0 0 0
> mfid22 ONLINE 0 0 0
> mirror-9 ONLINE 0 0 0
> mfid9 ONLINE 0 0 0
> mfid23 ONLINE 0 0 0
> mirror-10 ONLINE 0 0 0
> mfid10 ONLINE 0 0 0
> mfid24 ONLINE 0 0 0
> logs
> mirror-11 ONLINE 0 0 0
> mfid13 ONLINE 0 0 0
> mfid26 ONLINE 0 0 0
> cache
> mfid12 ONLINE 0 0 0
> mfid25 ONLINE 0 0 0
> spares
> mfid11 AVAIL
>
> errors: No known data errors
>
> The logs show loads of mfi1 and mfid16 errors for a few minutes, and then
> (presumably when ZFS dropped the disk) nothing relevant after that. ZFS
> hasn't logged anything, not even that it's failed a disk.
>
> I've manually done a 'zpool replace data mfid16 mfid11' which has brought
> the spare in without problems, but I'm eager to learn what I did (or didn't
> do?) to cause the spare to not be used automatically.
>
> Thanks in advance,
>
>
ZFS on FreeBSD does not have hot spares.
The spares are cold: human intervention is needed to replace a failed disk in a pool.
There are several threads about this on the net.
I would like to see a warning, because a lot of users get a false sense of
security when they configure spares.
zpool should not accept the spare without warning the user that it
is a cold spare and not a hot one.
It looks like there is some work planned for a ZFS daemon that should
solve this problem on FreeBSD:
http://svnweb.freebsd.org/base?view=revision&revision=222836
On Solaris there is also a daemon running that does the actual replace.
It should not be too hard to write a script that runs every minute (or
at whatever interval you want) and checks whether a pool is degraded, then
whether autoreplace is set for the pool, then whether a spare is available,
and if so does the actual replace.
Unfortunately I cannot code :(
Maybe someone has a script lying around?
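For illustration, the check outlined above could be sketched roughly as follows. This is a minimal, untested sketch and not an existing tool: the function names are made up for this example, and it assumes that a failed disk shows up as FAULTED and a usable spare as AVAIL in 'zpool status', as in the output quoted earlier. It only uses 'zpool list', 'zpool get', 'zpool status' and the same 'zpool replace' that was run by hand.

```shell
#!/bin/sh
# Sketch of a polling check for cold spares. All function names here are
# hypothetical; adjust the matching rules to your own pool layout.

# Pools whose health is DEGRADED; reads `zpool list -H -o name,health`.
degraded_pools() {
    awk '$2 == "DEGRADED" { print $1 }'
}

# First device reported as FAULTED; reads `zpool status <pool>`.
# Assumption: other failure states (UNAVAIL, REMOVED) are ignored.
first_faulted() {
    awk '$2 == "FAULTED" { print $1; exit }'
}

# First device listed as AVAIL under the "spares" heading.
first_spare() {
    awk '
        $1 == "spares" { in_spares = 1; next }
        in_spares && $2 == "AVAIL" { print $1; exit }
    '
}

# VALUE column of `zpool get autoreplace <pool>`, skipping the header line.
autoreplace_value() {
    awk 'NR > 1 { print $3; exit }'
}

# Walk all degraded pools and swap in a spare where one is available.
check_pools() {
    zpool list -H -o name,health | degraded_pools | while read -r pool; do
        # Follow the outline: only act when autoreplace is on for the pool.
        [ "$(zpool get autoreplace "$pool" | autoreplace_value)" = "on" ] || continue
        status=$(zpool status "$pool")
        bad=$(printf '%s\n' "$status" | first_faulted)
        spare=$(printf '%s\n' "$status" | first_spare)
        if [ -n "$bad" ] && [ -n "$spare" ]; then
            zpool replace "$pool" "$bad" "$spare"
        fi
    done
}

# Only touch real pools when asked to explicitly, e.g. "script.sh run".
if [ "${1:-}" = "run" ]; then
    check_pools
fi
```

Run from cron with the "run" argument at whatever interval suits you. Without that argument the script only defines the helpers, which makes it easy to feed them saved 'zpool status' output and check the matching before letting it act on a live pool.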
regards
Johan Hendriks