Re: Zpool status -- why does a suboptimal pool show as "ONLINE"?
- In reply to: Dave Cottlehuber: "Re: Zpool status -- why does a suboptimal pool show as "ONLINE"?"
Date: Tue, 17 Sep 2024 11:16:20 UTC
On 2024-09-12 14:29, Dave Cottlehuber wrote:
> On Thu, 12 Sep 2024, at 13:05, Dan Mahoney (Ports) wrote:
>> Hey there all,
>>
>> I have a nagios check that assumes that if I have a suboptimal zfs
>> zpool, that the word “DEGRADED” will appear in the output. One disk of
>> a two-disk mirror seems to have faulted, but the pool still shows as
>> “ONLINE”. I know I’ve seen the word “DEGRADED” in the past. What’s
>> different?
>>
>>   pool: zroot
>>  state: ONLINE
>> status: One or more devices are faulted in response to persistent errors.
>>         Sufficient replicas exist for the pool to continue functioning in a
>>         degraded state.
>> action: Replace the faulted device, or use 'zpool clear' to mark the device
>>         repaired.
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         zroot       ONLINE       0     0     0
>>           mirror-0  ONLINE       0     0     0
>>             ada0p3  FAULTED      4   372     0  too many errors
>>             ada1p3  ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> 14.1, if it matters, the disks are two innolite SATADOM’s.
>
> Hi Dan
>
> I agree that I would expect the mirror-0 at least to report DEGRADED
> or similar. Hopefully one of the zfs people clarifies the logic here.
>
> Practically, what I do is run:
>
> zpool status | grep -v 'with 0 errors' | sha256
>
> and check that this hash remains the same over time. It's obviously
> different for each pool. Could that help for nagios?

I agree. A faulted drive always used to appear as "FAULTED", and the
vdev and pool should both have been tagged "DEGRADED" (cascading
upwards). A faulted drive isn't necessarily taken offline, although
"too many errors" suggests it should be. If this isn't a bug, I'd like
to know the reason why.

Regards, Frank.
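
P.S. Until the logic is clarified, one possible workaround for the nagios
check is to look at the STATE column of every row rather than only for the
word "DEGRADED". The following is a rough sketch, not a tested plugin: the
function name check_zpool_status is made up for illustration, and the list
of "bad" states is my assumption about what should count as critical. It
only parses text on stdin, so it relies on nothing beyond the plain
`zpool status` output format quoted above.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): reads `zpool status` text on
# stdin and flags any row whose second field is a non-ONLINE state.
# Assumption: these five states are the ones worth alerting on.
check_zpool_status() {
    awk '
        # Matches config rows (NAME STATE READ WRITE CKSUM ...) and also
        # the top-level "state:" line, since its second field is the state.
        $2 ~ /^(DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ {
            name = $1
            sub(/:$/, "", name)          # "state:" -> "state"
            bad = bad " " name "=" $2
        }
        END {
            if (bad != "") {
                print "CRITICAL:" bad
                exit 2                   # Nagios plugin code for CRITICAL
            }
            print "OK: no faulted or degraded devices found"
        }
    '
}
```

Usage would be something like `zpool status zroot | check_zpool_status`.
Against the output Dan posted, this should flag ada0p3 even though the pool
and mirror-0 both claim ONLINE.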