[Bug 282702] Single disk ZFS pool hangs if drive goes away

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 12 Nov 2024 00:03:36 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=282702

            Bug ID: 282702
           Summary: Single disk ZFS pool hangs if drive goes away
           Product: Base System
           Version: 14.1-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: darius@dons.net.au

I have a ZFS system where I back up the main pool to a rotating set of disks
using zfs send/recv (via zrepl).
Normally the backup runs OK, my script does a 'zpool offline' on the backup
pool, I swap disks, and everything is fine. However, if I do the swap manually
and forget to offline the pool before I pop the disk, the pool fails:
[cain 10:21] ~ >zpool status cain-backup-2
  pool: cain-backup-2
 state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
  scan: scrub repaired 0B in 04:57:56 with 0 errors on Tue Nov 12 08:36:25 2024
config:

        NAME                 STATE     READ WRITE CKSUM
        cain-backup-2        UNAVAIL      0     0     0  insufficient replicas
          gpt/cain-backup-2  REMOVED      0     0     0

errors: 4 data errors, use '-v' for a list

That is fair enough. However, if I put the disk back in, no combination of
zpool clear, online or export will get the pool back into an operating
condition, or make the system forget about the suspended pool.

For example, "sudo zpool clear cain-backup-2" complains "cannot clear errors
for cain-backup-2: I/O error". Adding -FnX results in no output, but the pool
is still hung.

The disk is available at the same name (/dev/gpt/cain-backup-2) so I am not
sure why it can't just reopen the device and continue.

I have also tried 'sudo zpool online cain-backup-2 /dev/gpt/cain-backup-2' but
that doesn't work either.
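
To avoid tripping over this during a manual swap, a small guard along the
following lines could be run before pulling the disk. This is only a sketch:
the function name pool_status is hypothetical, and it assumes that a pool which
is no longer listed by 'zpool list -H -o name' is safe to remove. It does not
help once the pool is already suspended.

```shell
#!/bin/sh
# pool_status NAME
# Hypothetical helper: reads the output of `zpool list -H -o name` on stdin
# and reports whether the pool NAME is still imported.
pool_status() {
    # -x matches whole lines only, -F treats the name as a literal string
    if grep -qxF -- "$1"; then
        echo "imported"
    else
        echo "not-imported"
    fi
}

# Intended use on the real system (not run here):
#   zpool list -H -o name | pool_status cain-backup-2
# Only pull the disk when this prints "not-imported".
```

Reading the pool list from stdin keeps the check separate from the zpool call,
so it can be tried out without touching a real pool.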
