From nobody Thu Jan 30 21:13:56 2025
Message-ID: <980401eb-f8f6-44c7-8ee1-5ff0c9e1c35c@freebsd.org>
Date: Thu, 30 Jan 2025 16:13:56 -0500
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Subject: Re: ZFS: Rescue FAULTED Pool
To: freebsd-current@freebsd.org
References: <20250129112701.0c4a3236@freyja> <20250130123354.2d767c7c@thor.sb211.local>
In-Reply-To: <20250130123354.2d767c7c@thor.sb211.local>
From: Allan Jude

On 1/30/2025 6:35 AM, A FreeBSD User wrote:
> On Wed, 29 Jan 2025 03:45:25 -0800
> David Wolfskill wrote:
>
> Hello, thanks for responding.
>
>> On Wed, Jan 29, 2025 at 11:27:01AM +0100, FreeBSD User wrote:
>>> Hello,
>>>
>>> a ZFS pool (RAIDZ(1)) has been faulted. The pool is not importable
>>> anymore, neither with import -F nor -f.
>>> Although this pool is on an experimental system (no backup available)
>>> it contains some data that would take a while to reconstruct, so I'd
>>> like to ask whether there is a way to try to "de-fault" such a pool.
>>
>> Well, 'zpool clear ...' "Clears device errors in a pool." (from "man
>> zpool").
>>
>> It is, however, not magic -- it doesn't actually fix anything.
>
> For the record: I tried EVERY method I could find by searching the net that is
> useful for common "administrators", but hoped people are able to manipulate
> deeper stuff via zdb ...
>
>>
>> (I had an issue with a zpool which had a single SSD device as a ZIL; the
>> ZIL device failed after it had accepted some data to be written to the
>> pool, but before the data could be read and transferred to the spinning
>> disks. ZFS was quite unhappy about that. I was eventually able to copy
>> the data elsewhere, destroy the old zpool, recreate it *without* that
>> single point of failure, then copy the data back. And I learned to
>> never create a zpool with a *single* device as a separate ZIL.)
>
> Well, in this case I do not use dedicated ZIL drives. I have also had some experience with
> "single" ZIL drive setups, but a dedicated ZIL is mostly useful in cases where you have a
> graveyard full of inertia-suffering, mass-spinning HDDs - if I'm right, the concept of an
> SSD-based ZIL would be of no use/effect in that case. So I omitted those.
>
>>
>>> The pool is composed of 7 drives as a RAIDZ1, one of the SSDs
>>> faulted but I pulled the wrong one, so the pool ran into suspended
>>> state.
>>
>> Can you put the drive you pulled back in?
>
> Every single SSD originally plugged in is now back in place, even the faulted one (which
> doesn't report any faults at the moment).
>
> Although the pool isn't "importable", zdb reports its existence, alongside zroot (which
> resides on a dedicated drive).
>
>>
>>> The host is running the latest XigmaNAS BETA, which is effectively
>>> FreeBSD 14.1-p2, just for the record.
>>>
>>> I do not want to give up, since I hoped there might be a rude but
>>> effective way to restore the pool even with some data loss ...
>>>
>>> Thanks in advance,
>>>
>>> Oliver
>>> ....
>>
>> Good luck!
>>
>> Peace,
>> david
>
>
> Well, this is a hard and painful lesson to learn, if there is no chance to get back the pool.
>
> A warning (but this seems to be useless in the realm of professionals): I used a bunch of
> cheap spot-market SATA SSDs, a brand called "Intenso", common also here in good old Germany.
> Some of those SSDs do have a working LED when used with a Fujitsu SAS HBA controller - but
> those died very quickly after suffering some bus errors. Another bunch of those SSDs do not
> have a working LED (not blinking on access), but lasted a bit longer. The problem with those
> SSDs is: I cannot easily identify the failing device by writing massive amounts of data to it
> via dd (if that is possible at all) and watching for the activity LED.
> I also ordered alternative SSDs from a more expensive brand - but bad Karma ...
>
> Oliver
>

The most useful thing to share right now would be the output of `zpool
import` (with no pool name) on the rebooted system.

That will show where the issues are, and suggest how they might be solved.

-- 
Allan Jude
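
One possible way to capture that output for pasting into a reply is sketched below. It is only an illustration: the output file name is a hypothetical choice, and the command will most likely need to be run as root so that it can read the disks.

```sh
#!/bin/sh
# Sketch only: save the pool scan to a file so it can be pasted into a reply.
# The output path is a hypothetical choice, not something from the thread.
out="${TMPDIR:-/tmp}/zpool-import.txt"

if command -v zpool >/dev/null 2>&1; then
    # With no pool name, `zpool import` only lists the pools it can find;
    # it does not import anything.
    zpool import >"$out" 2>&1
    status=$?
else
    echo "zpool: command not found on this host" >"$out"
    status=127
fi

# Record the exit status alongside the output.
echo "exit status: $status" >>"$out"
cat "$out"
```

Redirecting stderr into the same file keeps any error messages together with the pool listing.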