Re: ZFS: Rescue FAULTED Pool

From: Dennis Clarke <dclarke_at_blastwave.org>
Date: Sat, 01 Feb 2025 14:10:25 UTC
>>
>> The most useful thing to share right now would be the output of `zpool
>> import` (with no pool name) on the rebooted system.
>>
>> That will show where the issues are, and suggest how they might be solved.
>>
> 
> Hello, this is exactly what happens when trying to import the pool. Prior to the loss, device da1p1
> had been faulted, with numbers in the column(s) "corrupted data" and others no longer shown.
> 
> 
>   ~# zpool import
>     pool: BUNKER00
>       id: XXXXXXXXXXXXXXXXXXXX
>    state: FAULTED
> status: The pool metadata is corrupted.
>   action: The pool cannot be imported due to damaged devices or data.
>          The pool may be active on another system, but can be imported using
>          the '-f' flag.
>     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
>   config:
> 
>          BUNKER00    FAULTED  corrupted data
>            raidz1-0  ONLINE
>              da2p1   ONLINE
>              da3p1   ONLINE
>              da4p1   ONLINE
>              da7p1   ONLINE
>              da6p1   ONLINE
>              da1p1   ONLINE
>              da5p1   ONLINE
> 
> 
>   ~# zpool import -f BUNKER00
> cannot import 'BUNKER00': I/O error
>          Destroy and re-create the pool from
>          a backup source.
> 
> 
> ~# zpool import -F BUNKER00
> cannot import 'BUNKER00': one or more devices is currently unavailable
> 

     This is indeed a sad situation. You have a raidz1 pool with one or
MORE devices that seem to have left the stage. I suspect more than one,
since raidz1 can only survive the loss of a single device.

     I can only guess what you see from "camcontrol devlist" as well as
from "gpart show -l", where we would see the partition data along with
any GPT labels, if in fact you used the GPT scheme. Your devices all
say "p1", so I guess you made some sort of partition table. ZFS does
not need that, but it can be nice to have. In any case, it really does
look like you have _more_ than one failure in there somewhere, and only
dmesg and some separate tests on each device would reveal the truth.
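
     As a rough sketch of what I mean, something like the following
would list the checks I would start with. It only prints the commands,
it does not run them. The device names da1 through da7 are taken from
your import output. The "zdb -l" and "smartctl -a" lines are my own
assumptions about what "separate tests" could look like, and smartctl
assumes you have smartmontools installed.

```shell
#!/bin/sh
# Print (do not run) read-only diagnostic commands for each pool member.
# Disk names are assumed from the quoted `zpool import` output.
DISKS="da1 da2 da3 da4 da5 da6 da7"

print_checks() {
    # What the controller sees, and the partition tables with labels.
    echo "camcontrol devlist"
    echo "gpart show -l"
    for d in $DISKS; do
        # Dump the ZFS labels on the partition (assumed useful here).
        echo "zdb -l /dev/${d}p1"
        # Drive health report; assumes smartmontools is installed.
        echo "smartctl -a /dev/${d}"
    done
    # Kernel messages that mention any of the disks.
    echo "dmesg | grep -E 'da[1-7]'"
}

print_checks
```

     Read the list over first. If it looks sane for your system, you
can run the commands one at a time and post what they say.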


--
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken