Re: Zpool status -- why does a suboptimal pool show as "ONLINE"?

From: Dave Cottlehuber <dch_at_skunkwerks.at>
Date: Thu, 12 Sep 2024 13:29:28 UTC

On Thu, 12 Sep 2024, at 13:05, Dan Mahoney (Ports) wrote:
> Hey there all,
>
> I have a nagios check that assumes that if I have a suboptimal zfs 
> zpool, that the word “DEGRADED” will appear in the output.  One disk of 
> a two-disk mirror seems to have faulted, but the pool still shows as 
> “ONLINE”.  I know I’ve seen the word “DEGRADED” in the past.  What’s 
> different?
>
>   pool: zroot
>  state: ONLINE
> status: One or more devices are faulted in response to persistent errors.
>         Sufficient replicas exist for the pool to continue functioning in a
>         degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the device
>         repaired.
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         zroot       ONLINE       0     0     0
>           mirror-0  ONLINE       0     0     0
>             ada0p3  FAULTED      4   372     0  too many errors
>             ada1p3  ONLINE       0     0     0
>
> errors: No known data errors
>
> 14.1, if it matters, the disks are two innolite SATADOM’s.

Hi Dan

I agree; I would expect at least mirror-0 to report DEGRADED or
similar. Hopefully one of the ZFS people can clarify the logic here.

Practically, what I do is run:

    zpool status | grep -v 'with 0 errors' | sha256

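and check that this hash remains the same over time. The grep drops
the scan line, which changes after every scrub even when the pool is
healthy. The hash is obviously different for each pool, so each one
needs its own baseline. Could that help for nagios?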
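
A minimal sketch of how that hash comparison might be wrapped for
nagios, assuming a baseline hash was saved while the pool was healthy.
The function names (hash_stdin, status_hash, check_pool_hash) and the
baseline file are hypothetical, and the sha256sum fallback is only
there so the same script runs on systems without FreeBSD's sha256:

```shell
#!/bin/sh
# Sketch only: compare a hash of filtered `zpool status` output against
# a previously saved baseline. All names here are illustrative.

# Hash stdin and print only the hex digest.
# FreeBSD ships sha256; fall back to sha256sum elsewhere.
hash_stdin() {
    if command -v sha256 >/dev/null 2>&1; then
        sha256
    else
        sha256sum | cut -d ' ' -f 1
    fi
}

# Drop the scan summary line (it changes after every scrub),
# then hash whatever is left.
status_hash() {
    grep -v 'with 0 errors' | hash_stdin
}

# check_pool_hash BASELINE_FILE
# Reads `zpool status` text on stdin. Return codes follow the usual
# nagios plugin convention: 0 OK, 2 CRITICAL, 3 UNKNOWN.
check_pool_hash() {
    baseline_file=$1
    if [ ! -r "$baseline_file" ]; then
        echo "UNKNOWN: no baseline at $baseline_file"
        return 3
    fi
    current=$(status_hash)
    expected=$(cat "$baseline_file")
    if [ "$current" = "$expected" ]; then
        echo "OK: zpool status unchanged"
        return 0
    fi
    echo "CRITICAL: zpool status changed"
    return 2
}
```

To use it, save a baseline once with
`zpool status zroot | status_hash > /path/to/zroot.hash` and then have
the check run `zpool status zroot | check_pool_hash /path/to/zroot.hash`.
Any change in the output, including a device going FAULTED while the
pool still says ONLINE, changes the hash. The downside is that benign
changes (a resilver in progress, a replaced disk) also trip it until
the baseline is refreshed.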

A+
Dave