Re: ZFS: Rescue FAULTED Pool
- Reply: Dennis Clarke : "Re: ZFS: Rescue FAULTED Pool"
- In reply to: Allan Jude : "Re: ZFS: Rescue FAULTED Pool"
Date: Sat, 01 Feb 2025 08:57:15 UTC
On Thu, 30 Jan 2025 16:13:56 -0500, Allan Jude <allanjude@freebsd.org> wrote:

> On 1/30/2025 6:35 AM, A FreeBSD User wrote:
> > On Wed, 29 Jan 2025 03:45:25 -0800, David Wolfskill <david@catwhisker.org> wrote:
> >
> > Hello, thanks for responding.
> >
> >> On Wed, Jan 29, 2025 at 11:27:01AM +0100, FreeBSD User wrote:
> >>> Hello,
> >>>
> >>> a ZFS pool (RAIDZ1) has been faulted. The pool is not importable
> >>> anymore, neither with import -F/-f.
> >>> Although this pool is on an experimental system (no backup available),
> >>> it contains some data; reconstructing it would take a while, so I'd
> >>> like to ask whether there is a way to try to "de-fault" such a pool.
> >>
> >> Well, 'zpool clear ...' "Clears device errors in a pool." (from "man
> >> zpool").
> >>
> >> It is, however, not magic -- it doesn't actually fix anything.
> >
> > For the record: I tried EVERY search method available to a common
> > "administrator", but hoped people are able to manipulate deeper stuff via zdb ...
> >
> >>
> >> (I had an issue with a zpool which had a single SSD device as a ZIL; the
> >> ZIL device failed after it had accepted some data to be written to the
> >> pool, but before the data could be read and transferred to the spinning
> >> disks. ZFS was quite unhappy about that. I was eventually able to copy
> >> the data elsewhere, destroy the old zpool, recreate it *without* that
> >> single point of failure, then copy the data back. And I learned to
> >> never create a zpool with a *single* device as a separate ZIL.)
> >
> > Well, in this case I do not use dedicated ZIL drives. I have also had several
> > experiences with "single" ZIL drive setups, but a dedicated ZIL is mostly useful
> > in cases where you have a graveyard full of inertia-suffering, mass-spinning
> > HDDs -- if I'm right, an SSD-based ZIL would be of no use/effect in this
> > all-SSD case. So I omitted those.
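[Editor's note: the single-ZIL failure mode described above is avoided by mirroring the separate log device. A minimal sketch, assuming a pool named "tank" and placeholder device names ada1/ada2 (not from this thread); do not run against a real pool without substituting your own devices:]

```shell
# Attach a *mirrored* SLOG instead of a single log device, so the loss of
# one log SSD cannot strand synchronous writes still sitting in the ZIL.
# "tank", ada1 and ada2 are hypothetical placeholders.
zpool add tank log mirror /dev/ada1 /dev/ada2

# Verify that the log vdev appears as a mirror:
zpool status tank
```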
> >
> >>
> >>> The pool is comprised of 7 drives as a RAIDZ1; one of the SSDs
> >>> faulted, but I pulled the wrong one, so the pool ran into a suspended
> >>> state.
> >>
> >> Can you put the drive you pulled back in?
> >
> > Every single SSD originally plugged in is now back in place, even the
> > faulted one (which doesn't report any faults at the moment).
> >
> > Although the pool isn't "importable", zdb reports its existence, alongside
> > zroot (which resides on a dedicated drive).
> >
> >>
> >>> The host is running the latest XigmaNAS BETA, which is effectively
> >>> FreeBSD 14.1-p2, just for the record.
> >>>
> >>> I do not want to give up, since I hoped there might be a rude but
> >>> effective way to restore the pool, even with data losses ...
> >>>
> >>> Thanks in advance,
> >>>
> >>> Oliver
> >>> ....
> >>
> >> Good luck!
> >>
> >> Peace,
> >> david
> >
> > Well, this is a hard and painful lesson to learn, if there is no chance
> > to get the pool back.
> >
> > A warning (but this seems to be useless in the realm of professionals): I
> > used a bunch of cheap spot-market SATA SSDs, a brand called "Intenso",
> > common here in good old Germany. Some of those SSDs have a working
> > activity LED when used with a Fujitsu SAS HBA controller -- but those died
> > very quickly, suffering some bus errors. Another bunch of those SSDs do
> > not have a working LED (no blinking on access), but lasted a bit longer.
> > The problem with those SSDs is: I cannot easily locate the failing device
> > by accessing the failed drive, e.g. by writing massive data via dd, if
> > that is possible at all.
> > I also ordered alternative SSDs from a more expensive brand -- but bad
> > Karma ...
> >
> > Oliver
>
> The most useful thing to share right now would be the output of `zpool
> import` (with no pool name) on the rebooted system.
>
> That will show where the issues are, and suggest how they might be solved.

Hello, this is exactly what happens when trying to import the pool.
Prior to the loss, device da1p1 had been FAULTED, with nonzero counts in the
"corrupted data" column (further details are no longer visible).

~# zpool import
   pool: BUNKER00
     id: XXXXXXXXXXXXXXXXXXXX
  state: FAULTED
 status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
         The pool may be active on another system, but can be imported using
         the '-f' flag.
    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
 config:

        BUNKER00      FAULTED  corrupted data
          raidz1-0    ONLINE
            da2p1     ONLINE
            da3p1     ONLINE
            da4p1     ONLINE
            da7p1     ONLINE
            da6p1     ONLINE
            da1p1     ONLINE
            da5p1     ONLINE

~# zpool import -f BUNKER00
cannot import 'BUNKER00': I/O error
        Destroy and re-create the pool from
        a backup source.

~# zpool import -F BUNKER00
cannot import 'BUNKER00': one or more devices is currently unavailable

--
A FreeBSD user
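[Editor's note: since plain `-f` and `-F` both fail above, a few progressively more aggressive import attempts may still be worth trying before giving the pool up. This is a hedged sketch only; `-X` and `-T` are undocumented last-resort options that rewind transaction groups and can discard recent writes:]

```shell
# 1. Try a read-only import first: no writes are issued, so nothing can
#    make the on-disk state worse while salvaging data.
zpool import -o readonly=on -f BUNKER00

# 2. Combine rewind (-F) with the extreme-rewind flag (-X), which searches
#    much further back for an importable transaction group:
zpool import -fFX BUNKER00

# 3. If that also fails, list the labels/uberblocks with zdb to find a
#    candidate txg, then rewind to it explicitly (-T is undocumented and
#    dangerous; <txg> is a value taken from the zdb output):
zdb -e -ul /dev/da2p1
# zpool import -f -T <txg> BUNKER00
```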
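[Editor's note: on the drive-identification problem mentioned earlier in the thread -- *reading* each device in turn with dd is a safe way to make one activity LED blink at a time, since reads never modify on-disk data (unlike writes). A minimal sketch; the device list is a hypothetical placeholder to be adjusted to the system's actual da* names:]

```shell
#!/bin/sh
# Blink one drive's activity LED at a time by generating read-only I/O.
# DEVICES is a hypothetical list -- adjust to the system's da* names.
DEVICES="/dev/da1 /dev/da2 /dev/da3"

for dev in ${DEVICES}; do
    echo "Reading ${dev} -- watch which activity LED blinks"
    # Sequential reads only; errors (e.g. a missing device) are ignored
    # so the loop continues to the next drive.
    dd if="${dev}" of=/dev/null bs=1m count=2048 2>/dev/null || true
done
```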