drive selection for disk arrays
Daniel Feenberg
feenberg at nber.org
Thu Mar 26 20:56:38 UTC 2020
The disturbing frequency of multiple drives going offline in quick
succession is, in my view, largely a result of defects being discovered in
quick succession, rather than occurring in quick succession. If a defect
occurs in a sector that is rarely visited it can remain hidden for a long
time. During a resilver that defect will be noticed and the drive failed
out. I think that is an overly aggressive action by the resilvering
process: that may be the only bad sector on the drive, it may still be
possible to recover all the data from the remaining drives (if the first
failing drive can read the corresponding sector), and the bad sector may
not even belong to an active file.
This issue makes scrubbing particularly important, especially in this era
of very large filesystems that can take days or weeks to restore.
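
As a rough illustration of the point, here is a minimal sketch in Python.
It assumes, purely for illustration, that hidden defects appear
independently at a constant rate on each drive and that nothing but a full
read (a scrub or a resilver) ever touches the affected sectors. The rate
and the array size are made-up numbers, not measurements, and the function
name is hypothetical.

```python
import math


def p_hidden_defect(n_drives, defects_per_drive_year, years_since_full_read):
    """Probability that at least one of n_drives carries an undiscovered defect.

    Assumes defects arrive as an independent Poisson process on each drive
    (an illustrative model, not measured behaviour) and that only a full
    read (scrub or resilver) reveals them.
    """
    expected_defects = n_drives * defects_per_drive_year * years_since_full_read
    return 1.0 - math.exp(-expected_defects)


if __name__ == "__main__":
    surviving_drives = 5   # hypothetical array: drives left after one failure
    rate = 0.05            # hypothetical defects per drive per year
    for label, interval in [("never scrubbed, 3 years", 3.0),
                            ("scrubbed monthly", 1.0 / 12.0)]:
        p = p_hidden_defect(surviving_drives, rate, interval)
        print(f"{label}: {p:.1%} chance the resilver hits a hidden defect")
```

Under these assumptions the defects occur at the same rate in both cases.
Only the time they are allowed to stay hidden differs, and that is what
determines how likely a resilver is to trip over one.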