Re: ZFS replace a mirrored disk
- Reply: David Christensen : "Re: ZFS replace a mirrored disk"
- In reply to: Julien Cigar : "Re: ZFS replace a mirrored disk"
Date: Wed, 11 May 2022 22:18:03 UTC
> First please define "without success", what doesn't work?
>
> please paste output of:
>
> $> gpart show nvd1
>
> also, is it a UEFI system or classical BIOS with GPT? What FreeBSD
> version?
>
> zpool replace zroot nvd0 is invalid, you should use:
>
> $> zpool replace zroot nvd1 nvd0 (but it uses the entire disk, which is
> probably incorrect too)

It's legacy BIOS with GPT. What I want to do is "simulate" a disk failure
and rebuild the RAID-1.

First I run these commands from the main OS:

------------------------
$> gpart show
=>        40  7501476448  nvd0  GPT  (3.5T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    33554432     2  freebsd-swap  (16G)
    33556480  7467919360     3  freebsd-zfs  (3.5T)
  7501475840         648        - free -  (324K)

=>        40  7501476448  nvd1  GPT  (3.5T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    33554432     2  freebsd-swap  (16G)
    33556480  7467919360     3  freebsd-zfs  (3.5T)
  7501475840         648        - free -  (324K)

$> zpool status
  pool: zroot
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     0
            nvd1p3  ONLINE       0     0     0

errors: No known data errors
------------------------------

Then I boot with mfsBSD and run this command to "simulate" a disk failure:

$> gpart destroy -F nvd0

------------------------------

Then I boot again into the main OS and run these commands:

$> zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices could not be used because the label is missing
        or invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            nvd0    UNAVAIL      0     0     0  invalid label
            nvd1p3  ONLINE       0     0     0

errors: No known data errors

$> gmirror status
       Name    Status  Components
mirror/swap  DEGRADED  nvd1p2 (ACTIVE)
------------------------------

Then I backup / restore the partitions:

$> gpart backup nvd1 | gpart restore -F nvd0
$> gpart show
=>        40  7501476448  nvd1  GPT  (3.5T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    33554432     2  freebsd-swap  (16G)
    33556480  7467919360     3  freebsd-zfs  (3.5T)
  7501475840         648        - free -  (324K)

=>        40  7501476448  nvd0  GPT  (3.5T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    33554432     2  freebsd-swap  (16G)
    33556480  7467919360     3  freebsd-zfs  (3.5T)
  7501475840         648        - free -  (324K)
------------------------------

Without doing a "gmirror forget swap" and "gmirror insert swap /dev/nvd0p2"
I see that swap is already mirrored:

$> gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  nvd1p2 (ACTIVE)
                       nvd0p2 (ACTIVE)

So the first question is whether the swap is mirrored automatically because
nvd0 is the same disk (not replaced by a new disk).
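For comparison, my understanding (untested here) is that with a genuinely new
disk the swap mirror would stay DEGRADED and the two gmirror commands
mentioned above would be needed, roughly like this; the mirror name "swap"
and the provider nvd0p2 are just the ones from my layout above:

$> gmirror forget swap              # drop the component that is gone for good
$> gmirror insert swap /dev/nvd0p2  # add the freshly created swap partition
$> gmirror status                   # should rebuild back to COMPLETE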
-------------------------------

Then I write the bootloader:

$> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 nvd0

--------------------------------

Then I want to add this disk back to the zpool, but these commands don't do it:

$> zpool replace zroot nvd0
invalid vdev specification
use '-f' to override the following errors:
/dev/nvd0 is part of active pool 'zroot'

$> zpool replace -f zroot nvd0
invalid vdev specification
the following errors must be manually repaired:
/dev/nvd0 is part of active pool 'zroot'

-----------------------------------

These commands don't work either:

$> zpool replace zroot nvd1 nvd0
invalid vdev specification
use '-f' to override the following errors:
/dev/nvd0 is part of active pool 'zroot'

$> zpool replace -f zroot nvd1 nvd0
invalid vdev specification
the following errors must be manually repaired:
/dev/nvd0 is part of active pool 'zroot'

-----------------------------------

Instead, these commands work:

$> zpool offline zroot nvd0
$> zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            nvd0    OFFLINE      0     0     0
            nvd1p3  ONLINE       0     0     0

errors: No known data errors

$> zpool online zroot nvd0
$> zpool status
  pool: zroot
 state: ONLINE
  scan: resilvered 5.55M in 00:00:00 with 0 errors on Thu May 12 00:22:13 2022
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     0
            nvd1p3  ONLINE       0     0     0

errors: No known data errors

------------------------------------

The second question is whether, instead of "zpool replace zroot nvd0", I had
to use "zpool offline zroot nvd0" and "zpool online zroot nvd0" because nvd0
is the same disk (not replaced by a new disk).

I also notice that if I don't run "zpool offline zroot nvd0" and
"zpool online zroot nvd0", but reboot the server instead, then zpool puts
nvd0 online automatically:

$> zpool status
  pool: zroot
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 3.50M in 00:00:00 with 0 errors on Thu May 12 01:04:09 2022
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     2
            nvd1p3  ONLINE       0     0     0

errors: No known data errors

$> zpool clear zroot
$> zpool status
  pool: zroot
 state: ONLINE
  scan: resilvered 3.50M in 00:00:00 with 0 errors on Thu May 12 01:04:09 2022
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     0
            nvd1p3  ONLINE       0     0     0

errors: No known data errors
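For reference, my assumption (not verified in this test) is that with a
genuinely new disk the stale ZFS label would not exist and the "is part of
active pool 'zroot'" error would not appear, so after restoring the partition
table and bootcode the usual partition-based replace should work; and for a
re-used disk like mine, I guess the old label could be wiped first. Roughly:

$> zpool labelclear -f /dev/nvd0p3   # re-used disk only: wipe the stale ZFS label
$> zpool replace zroot nvd0 nvd0p3   # replace the missing member with the partition, not the whole disk
$> zpool status zroot                # watch the resilver progress

Here "nvd0" is the old vdev name as shown in zpool status and "nvd0p3" is
the new partition; whether this is the correct sequence is part of what I am
asking.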