Re: "spare-X" device remains after resilvering
Date: Mon, 20 Jun 2022 21:40:33 UTC
On Mon, Jun 20, 2022 at 7:42 AM John Doherty <bsdlists@jld3.net> wrote:
>
> Hi, I have a zpool that currently looks like this (some lines elided
> for brevity; all omitted devices are online and apparently fine):
>
>   pool: zp1
>  state: DEGRADED
> status: One or more devices has been taken offline by the administrator.
>         Sufficient replicas exist for the pool to continue functioning
>         in a degraded state.
> action: Online the device using 'zpool online' or replace the device
>         with 'zpool replace'.
>   scan: resilvered 1.76T in 1 days 00:38:14 with 0 errors on
>         Sun Jun 19 22:31:46 2022
> config:
>
>         NAME                       STATE     READ WRITE CKSUM
>         zp1                        DEGRADED     0     0     0
>           raidz2-0                 ONLINE       0     0     0
>             gpt/disk0              ONLINE       0     0     0
>             gpt/disk1              ONLINE       0     0     0
>             ...
>             gpt/disk9              ONLINE       0     0     0
>           raidz2-1                 ONLINE       0     0     0
>             gpt/disk10             ONLINE       0     0     0
>             ...
>             gpt/disk19             ONLINE       0     0     0
>           raidz2-2                 ONLINE       0     0     0
>             gpt/disk20             ONLINE       0     0     0
>             ...
>             gpt/disk29             ONLINE       0     0     0
>           raidz2-3                 DEGRADED     0     0     0
>             gpt/disk30             ONLINE       0     0     0
>             3343132967577870793    OFFLINE      0     0     0  was /dev/gpt/disk31
>             ...
>             spare-9                DEGRADED     0     0     0
>               6960108738988598438  OFFLINE      0     0     0  was /dev/gpt/disk39
>               gpt/disk41           ONLINE       0     0     0
>         spares
>           16713572025248921080     INUSE     was /dev/gpt/disk41
>           gpt/disk42               AVAIL
>           gpt/disk43               AVAIL
>           gpt/disk44               AVAIL
>
> My question is why the "spare-9" device still exists after the
> resilvering completed. Based on past experience, my expectation was
> that it would exist for the duration of the resilvering and that
> afterward, only the "gpt/disk41" device would appear in the output of
> "zpool status."
>
> I also expected that when the resilvering completed, the "was
> /dev/gpt/disk41" device would be removed from the list of spares.
>
> I took the "was /dev/gpt/disk31" device offline deliberately because
> it was causing a lot of "CAM status: SCSI Status Error" errors. The
> next step for this pool is to replace that disk with one of the
> available spares, but I'd like to get things looking a little cleaner
> before doing that.
>
> I don't have much in the way of ideas here. One thought was to export
> the pool and then do "zpool import zp1 -d /dev/gpt" and see if that
> cleaned things up.
>
> This system is running 12.2-RELEASE-p4, which I know is a little out
> of date. I'm going to update it to 13.1-RELEASE soon, but the more
> immediate need is to get this zpool into good shape.
>
> Any insights or advice much appreciated. Happy to provide any further
> info that might be helpful. Thanks.

This is expected behavior. I take it you were expecting
6960108738988598438 to be removed from the configuration, replaced by
gpt/disk41, and gpt/disk41 to disappear from the spare list? That
didn't happen because ZFS considers anything in the spare list to be a
permanent spare: it will never automatically remove a disk from the
spare list. Instead, ZFS expects you to provide a permanent replacement
for the failed disk. Once resilvering to the permanent replacement is
complete, it will automatically detach the spare.

OTOH, if you really want gpt/disk41 to be the permanent replacement, I
think you can accomplish that with some combination of the following
commands:

zpool detach zp1 6960108738988598438
zpool remove zp1 gpt/disk41

-Alan
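
For reference, a minimal sketch of the other path Alan describes
(resilvering onto a permanent replacement so that the spare detaches on
its own), assuming a hypothetical new disk labeled gpt/disk45 standing
in for the failed /dev/gpt/disk39:

# gpt/disk45 is a placeholder name for whatever new disk is added; the
# GUID is the OFFLINE device that was /dev/gpt/disk39
zpool replace zp1 6960108738988598438 gpt/disk45
# watch the resilver; once it completes, the spare-9 vdev should
# collapse and gpt/disk41 should return to AVAIL in the spares list
zpool status zp1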