Walter Cramer
wfc at mintsol.com
Wed May 8 16:32:01 UTC 2019
On Wed, 8 May 2019, Paul Mather wrote:
> On May 8, 2019, at 9:59 AM, Michelle Sullivan <michelle at sorbs.net> wrote:
>> Paul Mather wrote:
>>>> due to lack of space. Interestingly have had another drive die in the
>>>> array - and it doesn't just have one or two sectors down it has a *lot* -
>>>> which was not noticed by the original machine - I moved the drive to a
>>>> byte copier which is where it's reporting 100's of sectors damaged...
>>>> could this be compounded by zfs/mfi driver/hba not picking up errors like
>>>> it should?
>>> Did you have regular pool scrubs enabled? It would have picked up silent
>>> data corruption like this. It does for me.
>> Yes, every month (once a month because (1) the data doesn't change much
>> (new data is added, old is not touched), and (2) it took 2 weeks to
>> complete.)
> Do you also run sysutils/smartmontools to monitor S.M.A.R.T. attributes?
> Although imperfect, it can sometimes signal trouble brewing with a drive
> (e.g., increasing Reallocated_Sector_Ct and Current_Pending_Sector counts)
> that can lead to proactive remediation before catastrophe strikes.
> Unless you have been gathering periodic drive metrics, you have no way of
> knowing whether these hundreds of bad sectors have happened suddenly or
> slowly over a period of time.
Use `smartctl` from a cron script to do regular (say, weekly) *long*
self-tests of hard drives, and also log (say, daily) all the SMART
information from each drive. Then if a drive fails, you can at least
check the logs for whether SMART noticed symptoms, and (if so) for other
drives with symptoms. Or enhance this with a slightly longer script,
which watches the logs for symptoms, and alerts you.
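
A rough sketch of such a script follows. It is only an illustration under
some assumptions of mine: the script path, the log directory, the function
names and the device-name filter are all made up, and the attribute check
assumes the usual ten-column ATA table printed by `smartctl -A`. Only
`smartctl -a`, `-A`, `-t long` and FreeBSD's `kern.disks` sysctl are real
interfaces here.

```sh
#!/bin/sh
# smart-cron.sh: sketch of a SMART logging/alerting helper for cron.
# Hypothetical names: this script's path, LOGDIR, and every function below.
# Example crontab entries (times are arbitrary):
#   0 3 * * *  /usr/local/sbin/smart-cron.sh daily
#   0 4 * * 6  /usr/local/sbin/smart-cron.sh weekly

SMARTCTL="${SMARTCTL:-smartctl}"      # overridable, e.g. for testing
LOGDIR="${LOGDIR:-/var/log/smart}"    # assumed log location

# List disks, one per line. kern.disks is FreeBSD-specific; the adaN/daN
# filter is an assumption about how the drives are named.
list_disks() {
    sysctl -n kern.disks | tr ' ' '\n' | grep -E '^(ada|da)[0-9]+$'
}

# Append a timestamped full SMART report for one disk to its own log file.
log_smart() {
    disk="$1"
    mkdir -p "$LOGDIR"
    {
        echo "=== $(date -u '+%Y-%m-%dT%H:%M:%SZ') $disk ==="
        "$SMARTCTL" -a "/dev/$disk"
    } >> "$LOGDIR/$disk.log" 2>&1
}

# Read `smartctl -A` output on stdin. Print NAME=RAW for each watched
# attribute with a non-zero raw value; exit 1 if any was found.
check_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector)$/ && $10 + 0 > 0 {
             print $2 "=" $10; bad = 1
         }
         END { exit bad }'
}

case "${1:-}" in
daily)
    for d in $(list_disks); do
        log_smart "$d"
        # Anything printed here ends up in the mail cron sends for the job.
        bad=$("$SMARTCTL" -A "/dev/$d" | check_attrs) || echo "WARNING $d: $bad"
    done
    ;;
weekly)
    for d in $(list_disks); do
        # Starts the long self-test in the background on the drive itself.
        "$SMARTCTL" -t long "/dev/$d" > /dev/null
    done
    ;;
esac
```

Relying on cron's own mail for the alert keeps the script short. A non-zero
raw count is a deliberately crude trigger; comparing against the previous
day's log to catch *increasing* counts would be a natural refinement.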
(In my experience, SMART's *long* self-test checks the entire disk for
read errors without either downside of `zpool scrub`: it does a fast,
sequential read of the HD, including free space. That makes it a nice
test for failing disk hardware, but not a replacement for `zpool scrub`.)
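
The result of a long self-test has to be read back afterwards with
`smartctl -l selftest`. As a small sketch, the filter below picks out
entries that ended in a failure. The function name is made up, and the
status strings it matches are the ones smartmontools prints for ATA drives
as far as I know; SCSI and NVMe drives format this log differently.

```sh
#!/bin/sh
# Read `smartctl -l selftest` output on stdin and print only the log
# entries whose status reports a failure (e.g. "Completed: read failure").
# failed_selftests is a hypothetical helper name.
failed_selftests() {
    grep -E '^# *[0-9]+ ' | grep -E 'Completed: |Fatal or unknown error'
}
```

For example, `smartctl -l selftest /dev/ada0 | failed_selftests` (with
`ada0` standing in for the real device) prints nothing for a healthy log,
so it is safe to run from the same cron job that logs the SMART data.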
> Cheers,
> Paul.