Re: ZFS replace a mirrored disk

From: David Christensen <dpchrist_at_holgerdanske.com>
Date: Thu, 12 May 2022 22:20:04 UTC
On 5/12/22 05:26, Christos Chatzaras wrote:
> Yes, this partition layout was created by bsdinstall.


Okay.


> I continue with my tests:
> 
> Instead of zeroing the whole disk with dd, I boot into mfsBSD and clear the metadata on both the swap mirror and the zpool:
> 
> $> gmirror clear nvd0p2
> $> zpool labelclear -f nvd0
> $> gpart destroy -F nvd0
> 
> Then I boot into the main OS and run:
> 
> $> gpart backup nvd1 | gpart restore -F nvd0
> $> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 nvd0
> 
> 
> $> gmirror status
>         Name    Status  Components
> mirror/swap  DEGRADED  nvd1p2 (ACTIVE)
> 
> So the gmirror status is as expected, and I have to forget / insert:
> 
> $> gmirror forget swap
> $> gmirror insert swap /dev/nvd0p2
> 
> And the swap mirror is ready:
> 
> $> gmirror status
>         Name    Status  Components
> mirror/swap  COMPLETE  nvd1p2 (ACTIVE)
>                         nvd0p2 (ACTIVE)


Okay.  It looks like gmirror does not remember devices, only partitions.
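
The forget / insert sequence quoted above could be derived from the 
status text.  A minimal sketch, assuming a hypothetical helper named 
suggest_gmirror_fix that only prints commands and runs nothing; the 
provider name passed in is the caller's choice:

```shell
#!/bin/sh
# Hypothetical helper: read `gmirror status` text on stdin and print the
# forget/insert commands for each DEGRADED mirror. It prints only.
suggest_gmirror_fix() {
    newprov=$1    # provider to insert, e.g. nvd0p2 (supplied by the caller)
    awk -v prov="$newprov" '
        $2 == "DEGRADED" {
            name = $1
            sub(/^mirror\//, "", name)   # "mirror/swap" -> "swap"
            print "gmirror forget " name
            print "gmirror insert " name " /dev/" prov
        }'
}

# Sample input copied from this thread, not live output.
suggest_gmirror_fix nvd0p2 <<'EOF'
        Name    Status  Components
mirror/swap  DEGRADED  nvd1p2 (ACTIVE)
EOF
```

Review the printed commands before running them by hand.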


> Now let's see the zpool:
> 
> $> zpool status
>    pool: zroot
>   state: DEGRADED
> status: One or more devices could not be opened.  Sufficient replicas exist for
>          the pool to continue functioning in a degraded state.
> action: Attach the missing device and online it using 'zpool online'.
>     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
> config:
> 
>          NAME        STATE     READ WRITE CKSUM
>          zroot       DEGRADED     0     0     0
>            mirror-0  DEGRADED     0     0     0
>              nvd0p3  UNAVAIL      0     0     0  cannot open
>              nvd1p3  ONLINE       0     0     0
> 
> errors: No known data errors
> 
> I was expecting to be asked to run "zpool replace" and not "zpool online", but it looks like it thinks it's a temporary error.
> 
> $> zpool online zroot nvd0p3
> 
> $> zpool status
>    pool: zroot
>   state: ONLINE
>    scan: resilvered 8.93M in 00:00:00 with 0 errors on Thu May 12 13:08:12 2022
> config:
> 
>          NAME        STATE     READ WRITE CKSUM
>          zroot       ONLINE       0     0     0
>            mirror-0  ONLINE       0     0     0
>              nvd0p3  ONLINE       0     0     0
>              nvd1p3  ONLINE       0     0     0
> 
> I guess that because I didn't zero the disk with dd, resilvering only fixes the blocks that changed, so not many MB.


Okay.  I believe you simulated recovering from a corrupt ZFS device 
label.


> I also tried "dd if=/dev/zero of=/dev/nvd0 bs=1M", but when I boot into the main OS it still says to use "zpool online" instead of "zpool replace":
> 
> $> zpool status
>    pool: zroot
>   state: DEGRADED
> status: One or more devices could not be opened.  Sufficient replicas exist for
>          the pool to continue functioning in a degraded state.
> action: Attach the missing device and online it using 'zpool online'.
>     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
>    scan: resilvered 9.14M in 00:00:00 with 0 errors on Thu May 12 14:31:40 2022
> config:
> 
>          NAME                      STATE     READ WRITE CKSUM
>          zroot                     DEGRADED     0     0     0
>            mirror-0                DEGRADED     0     0     0
>              11739263350767631641  UNAVAIL      0     0     0  was /dev/nvd0p3
>              nvd1p3                ONLINE       0     0     0
> 
> errors: No known data errors


Okay.  It looks like you simulated recovering a device with corrupt ZFS 
contents.
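
In that state the pool lists the missing member by GUID.  If "zpool 
online" cannot bring it back, one option is "zpool replace" addressed 
by that GUID.  A minimal sketch, assuming a hypothetical helper named 
suggest_replace that parses the status text and only prints a command; 
I have not run it against your pool:

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` text on stdin and print a
# "zpool replace" command for each UNAVAIL vdev. It prints only.
suggest_replace() {
    pool=$1    # pool name, e.g. zroot
    awk -v pool="$pool" '
        $2 == "UNAVAIL" {
            old = $1    # vdev name or GUID as shown by zpool status
            new = ""
            # "was /dev/nvd0p3" names the old path when only a GUID is shown
            for (i = 3; i < NF; i++) if ($i == "was") new = $(i + 1)
            if (new == "") new = old    # no "was" hint: reuse the same name
            sub(/^\/dev\//, "", new)
            printf "zpool replace %s %s %s\n", pool, old, new
        }'
}

# Sample input copied from this thread, not live output.
suggest_replace zroot <<'EOF'
         zroot                     DEGRADED     0     0     0
           mirror-0                DEGRADED     0     0     0
             11739263350767631641  UNAVAIL      0     0     0  was /dev/nvd0p3
             nvd1p3                ONLINE       0     0     0
EOF
```

For the sample above this prints "zpool replace zroot 
11739263350767631641 nvd0p3".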


I think you need a third device to simulate "When a disk fails and want 
to replace it with a NEW disk".
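
If no spare hardware is at hand, a throwaway pool built from files can 
stand in for the three devices.  A minimal sketch; the pool name, the 
directory, and the helper names are hypothetical, and by default it 
only prints the commands (set DRY_RUN=0 to execute them as root):

```shell
#!/bin/sh
# Sketch: rehearse "replace a failed mirror member with a NEW disk" on a
# throwaway pool backed by files. Pool name and paths are hypothetical.
# Commands are only printed unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

rehearse_replace() {
    pool=$1; dir=$2
    run mkdir -p "$dir"
    run truncate -s 128M "$dir/d0" "$dir/d1" "$dir/d2"   # three file-backed "disks"
    run zpool create "$pool" mirror "$dir/d0" "$dir/d1"
    run zpool offline "$pool" "$dir/d0"                  # pretend d0 has failed
    run zpool replace "$pool" "$dir/d0" "$dir/d2"        # d2 plays the new disk
    run zpool status "$pool"
}

rehearse_replace testpool /var/tmp/zfstest
```

When finished with a real run, "zpool destroy testpool" and removing 
the files cleans up.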


David