Re: Unable to replace drive in raidz1

From: Alan Somers <asomers_at_freebsd.org>
Date: Fri, 06 Sep 2024 22:02:31 UTC
On Fri, Sep 6, 2024 at 3:49 PM Chris Ross <cross+freebsd@distal.com> wrote:
>
>
>
> > On Sep 6, 2024, at 17:22, Wes Morgan <morganw@gmail.com> wrote:
> >
> > The labels are helpful for fstab, but zfs doesn't need fstab. In the early days of zfs on freebsd the unpartitioned device was recommended; maybe that's not accurate any longer, but I still follow it for a pool that contains vdevs with multiple devices (raidz).
> >
> > If you use, e.g., da0 in a pool, you cannot later replace it with a labeled device of the same size; it won't have enough sectors.
>
> The problem is shown here.  da3 was in a pool.  Then, when the system rebooted, da3 was the kernel’s name for a different device in a different pool.  Had I known then how to interact with the guid (status -g), I likely would’ve been fine.
>
> >> So, I offline’d the disk-to-be-replaced at 09:40 yesterday, then I shut the system down, removed that physical device, replacing it with a larger disk, and rebooted.  I suspect the “offline”s after that were me experimenting when it was telling me it couldn’t start the replace action I was asking for.
> >
> > This is probably where you made your mistake. Rebooting shifted another device into da3. When you tried to offline it, you were probably either targeting a device in a different raidz or one that wasn't in the pool. The output of those original offline commands would have been informative. You could also check dmesg and map the serial numbers to device assignments to figure out what device moved to da3.
>
> I offline’d “da3” before I rebooted.  After rebooting, I tried the obvious and correct (I thought) “zpool replace da3 da10” only to get the error I’ve been getting since.  Again, had I known how to use the guid for the device that used to be da3 but now isn’t, that might’ve worked.  I can’t know now.
>
> Then, while trying to fix the problem, I likely made it worse by trying to interact with da3, which in the pool’s brain was a missing disk in raidz1-0, but the kernel also knew /dev/da3 to be a working disk (that happened to be one of the drives in raidz1-1).  I feel that zfs did something wrong somewhere if it _ever_ tried to talk to /dev/da3 when I said “da3” after I rebooted and it found that device to be part of raidz1-1, but.
>
>
> > Sounds about right. In another message it seemed like the pool had started an autoreplace. So I assume you have zfsd enabled? That is what issues the replace command. Strange that it is not anywhere in the pool history. There should be syslog entries for any actions it took.
>
> I don’t think so.  That message about some “already in replacing/spare config” came up before anything else.  At that point, I’d never had a spare in this pool, and there was no replace shown in zpool status.
>
> > In your case, it appears that you had two missing devices - the original "da3" that was physically removed, and the new da3 that you forced offline. You added da10 as a spare, when what you needed to do was a replace. Spare devices do not auto-replace without zfsd running and autoreplace set to on.
>
> I did offline “da3” a couple of times, again thinking I was working with what zpool showed as “da3”.  If it did anything with /dev/da3 there, then I think that may be a bug.  Or, at least, something that should be made more clear.  It _didn’t_ offline the diskid/DISK-K1GMBN9D from raidz1-1, which is what the kernel has at da3.  So.
>
> > This should all be reported in zpool status. In your original message, there is no sign of a replacement in progress or a spare device, assuming you didn't omit something. If the pool is only showing that a single device is missing, and that device is to be replaced by da10, zero out the first and last sectors (a zfs label is 256k; there are two at each end of the disk) to wipe out any labels and use the replace command, not spare, e.g. "zpool replace tank da3 da10", or use the missing guid as suggested elsewhere. This should work based on the information provided.
>
> I’ve never seen a replacement going on, and I have had the new disk “da10” as a spare a couple of times while testing.  But it wasn’t left there after I determined that adding it as a spare also didn’t let me get it replaced into the raidz.
>
> And that replace is what I’ve tried many times, with multiple IDs.  I have cleared the label on da10 multiple times.  That replace doesn’t work, giving this error message in all cases.
>
>             - Chris
>
>
> % glabel status
>                                       Name  Status  Components
>             diskid/DISK-BTWL503503TW480QGN     N/A  ada0
>                                  gpt/l2arc     N/A  ada0p1
> gptid/9d00849e-0b82-11ec-a143-84b2612f2c38     N/A  ada0p1
>                       diskid/DISK-K1GMBN9D     N/A  da3
>                       diskid/DISK-3WJDHJ2J     N/A  da6
>                       diskid/DISK-3WK3G1KJ     N/A  da7
>                       diskid/DISK-3WJ7ZMMJ     N/A  da8
>                       diskid/DISK-K1GMEDMD     N/A  da4
>                       diskid/DISK-K1GMAX1D     N/A  da5
>                                ufs/drive12     N/A  da9
>                       diskid/DISK-ZGG0A2PA     N/A  da10
>
> % zpool status tank
>   pool: tank
>  state: DEGRADED
> status: One or more devices are faulted in response to persistent errors.
>         Sufficient replicas exist for the pool to continue functioning in a
>         degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the device
>         repaired.
>   scan: scrub repaired 0B in 17:14:03 with 0 errors on Fri Sep  6 09:08:34 2024
> config:
>
>         NAME                      STATE     READ WRITE CKSUM
>         tank                      DEGRADED     0     0     0
>           raidz1-0                DEGRADED     0     0     0
>             da3                   FAULTED      0     0     0  external device fault
>             da1                   ONLINE       0     0     0
>             da2                   ONLINE       0     0     0
>           raidz1-1                ONLINE       0     0     0
>             diskid/DISK-K1GMBN9D  ONLINE       0     0     0
>             diskid/DISK-K1GMEDMD  ONLINE       0     0     0
>             diskid/DISK-K1GMAX1D  ONLINE       0     0     0
>           raidz1-2                ONLINE       0     0     0
>             diskid/DISK-3WJDHJ2J  ONLINE       0     0     0
>             diskid/DISK-3WK3G1KJ  ONLINE       0     0     0
>             diskid/DISK-3WJ7ZMMJ  ONLINE       0     0     0
>
> errors: No known data errors
>
> % sudo zpool replace tank da3 da10
> Password:
> cannot replace da3 with da10: already in replacing/spare config; wait for completion or use 'zpool detach'
>
> % zpool status -g tank
>   pool: tank
>  state: DEGRADED
> status: One or more devices are faulted in response to persistent errors.
>         Sufficient replicas exist for the pool to continue functioning in a
>         degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the device
>         repaired.
>   scan: scrub repaired 0B in 17:14:03 with 0 errors on Fri Sep  6 09:08:34 2024
> config:
>
>         NAME                      STATE     READ WRITE CKSUM
>         tank                      DEGRADED     0     0     0
>           16506780107187041124    DEGRADED     0     0     0
>             9127016430593660128   FAULTED      0     0     0  external device fault
>             4094297345166589692   ONLINE       0     0     0
>             17850258180603290288  ONLINE       0     0     0
>           5104119975785735782     ONLINE       0     0     0
>             6752552549817423876   ONLINE       0     0     0
>             9072227575611698625   ONLINE       0     0     0
>             13778609510621402511  ONLINE       0     0     0
>           11410204456339324959    ONLINE       0     0     0
>             1083322824660576293   ONLINE       0     0     0
>             12505496659970146740  ONLINE       0     0     0
>             11847701970749615606  ONLINE       0     0     0
>
> errors: No known data errors
>
> % sudo zpool replace tank 9127016430593660128 da10
> cannot replace 9127016430593660128 with da10: already in replacing/spare config; wait for completion or use 'zpool detach'
>
> % sudo zpool replace tank 9127016430593660128 diskid/DISK-ZGG0A2PA
> cannot replace 9127016430593660128 with diskid/DISK-ZGG0A2PA: already in replacing/spare config; wait for completion or use 'zpool detach'

Another user has reported the same error message.  In their case, it's
a misleading error message from /sbin/zpool.  Can you try
"zpool status -v" and "diskinfo -f /dev/da10"?  That will show whether
you have the same problem: if your pool has a 512B block size
(ashift=9) but the new disk is 4Kn, then you cannot use it as a
replacement.
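
Something like this should make a sector-size mismatch obvious (a
rough sketch from memory; pool and device names taken from the output
above):

% diskinfo -v /dev/da10      # check "sectorsize"/"stripesize"; 4096/4096 means 4Kn
% zdb -C tank | grep ashift  # "ashift: 9" means the vdev expects 512-byte sectors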

https://github.com/openzfs/zfs/issues/14730
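
If the sector sizes check out and you still need to figure out which
physical drive is currently sitting at da3, you can pull the serial
numbers without digging through dmesg (again a rough sketch; adjust
the device name):

% geom disk list | grep -E 'Geom name|ident'   # "ident" is the drive's serial number
% diskinfo -s /dev/da3                         # prints just the ident/serial for one device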