drive selection for disk arrays
John Johnstone
jjohnstone-freebsdquestions at tridentusa.com
Fri Mar 27 16:54:09 UTC 2020
On 3/27/20 5:45 AM, Polytropon wrote:
> When a drive _reports_ bad sectors, at least in the past
> it was an indication that it already _has_ lots of them.
> The drive's firmware will remap bad sectors to spare
> sectors, so "no error" so far. When errors are being
> reported "upwards" ("read error" or "write error"
> visible to the OS), it's a sign that the disk has run
> out of spare sectors, and the firmware cannot silently
> remap _new_ bad sectors...
>
> Is this still the case with modern drives?
Yes. And this ties in with the distinction that was made between when
an error occurs and when it is noticed or reported.
> How transparently can ZFS handle drive errors when the
> drives only report the "top results" (i. e., cannot cope
> with bad sectors internally anymore)? Do SMART tools help
> here, for example, by reading certain firmware-provided
> values that indicate how many sectors _actually_ have
> been marked as "bad sector", remapped internally, and
> _not_ reported to the controller / disk I/O subsystem /
> filesystem yet? This should be a good indicator of "will
> fail soon", so a replacement can be done while no data
> loss or other problems appears.
Smartmontools is definitely a help with this. The periodic task
/usr/local/etc/periodic/daily/smart
exists exactly for staying on top of this.
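For example, something along these lines (untested here; the periodic
variable name is what I recall from the port's script and the device
names are just placeholders, so adjust for your system):

# /etc/periodic.conf -- tell the smartmontools periodic script which
# devices to check each night
daily_status_smart_devices="/dev/ada0 /dev/ada1 /dev/da0"

# Manual spot check of the attributes that track silent remapping on
# ATA/SATA drives; non-zero or growing values here are the "will fail
# soon" indicator mentioned above.
smartctl -A /dev/ada0 | egrep \
    'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'

# SAS drives report error counter logs and a grown defect list instead:
smartctl -a /dev/da0 | grep -i 'grown defect'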
Monitoring drives more effectively makes it less likely that you will
suffer data loss. Perhaps multiple drives that failed over a short
period of time and caused data loss were drives that had encountered
recoverable errors months, possibly years, before the unrecoverable
errors occurred, but those recoverable errors were not handled as well
as they could have been by firmware or software.
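To make that concrete, here is the sort of thing I have in mind for
catching the recoverable errors early (a rough sketch from memory;
check smartd.conf(5) and the port's rc script for the exact names):

# /usr/local/etc/smartd.conf -- watch every drive smartd can find,
# monitor all SMART attributes, run a short self-test daily at 02:00,
# and mail root when anything starts to look wrong
DEVICESCAN -a -m root -s (S/../.././02)

# enable and start the daemon
sysrc smartd_enable="YES"
service smartd start

# And for the ZFS question above: a quick way to see whether errors
# have already made it "upwards" past the drive firmware is
zpool status -x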
The handling of disk errors is an inherently complicated topic and there
is not much time available to discuss it here. One thing to keep in mind
is that the behavior of systems with generic RAID controllers / HBAs and
disks is substantially different from that of systems with proprietary
controllers / HBAs and disks.
The value of proprietary hardware can be debated, but when it comes to
server-class systems, Dell, HP, IBM, Lenovo, etc., and the suppliers they
use go to great lengths in their controller / HBA / disk firmware design
to avoid failure scenarios that can cause data loss. SMART technology
does allow drives to keep track of various types of errors and to notify
hosts before data loss occurs. The proprietary controllers build on top
of this with their own designs; their PF and PFA (predictive failure /
predictive failure analysis) designations are part of this.
When people have seen multiple disk failures over a short time period,
was that with generic hardware or proprietary? This has to be considered
in order to properly understand what it means. In my opinion, the
relative importance and likelihood of a multiple-disk failure is
different for a generic hardware user compared to one using proprietary
hardware. No method is perfect, but there are differences. SATA vs. SAS
is also an aspect.
Diversity matters for SSDs too. See, for example, the bulletin
"HPE SAS Solid State Drives - Critical Firmware Upgrade Required for
Certain HPE SAS Solid State Drive Models to Prevent Drive Failure at
40,000 Hours of Operation",
an issue which doesn't seem to be specific to HPE.
-
John J.