Re: CURRENT: Panic VERIFY(!zil_replaying(zilog, tx)) failed (and crashing)

From: Cy Schubert <Cy.Schubert_at_cschubert.com>
Date: Tue, 11 Apr 2023 14:28:31 UTC
In message <434B83DB-F6BB-436F-8AA5-385730D20BB1@dawidek.net>,
Paweł Jakub Dawidek writes:
> 
>
> > On Apr 11, 2023, at 11:31, Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> >
> > In message <20230409161436.5412fa6e@thor.intern.walstatt.dynvpn.de>,
> > FreeBSD User writes:
> >> Am Sun, 9 Apr 2023 14:37:03 +0200
> >> Mateusz Guzik <mjguzik@gmail.com> schrieb:
> >>
> >>>> On 4/9/23, FreeBSD User <freebsd@walstatt-de.de> wrote:
> >>>>> Today, after upgrading to FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b:
> >>>>> Sun Apr  9
> >>>>> 12:01:02 CEST 2023  amd64, AND upgrading ZPOOLs via
> >>>>>
> >>>>> zpool upgrade POOLNAME
> >>>>>
> >>>>> some boxes keep crashing when starting compiler runs (the trigger is
> >>>>> different on boxes).
> >>>>>
> >>>>> ZFS module is statically compiled into the kernel (if this is of
> >>>>> importance)
> >>>>>
> >>>>> Last known good was:
> >>>>>
> >>>>> [...]
> >>>>> Apr  9 07:10:04 <0.2> thor kernel: FreeBSD 14.0-CURRENT #7 main-n262051-75379ea2e461: Sun Apr  9 00:12:57 CEST 2023
> >>>>> Apr  9 07:10:04 <0.2> thor kernel: root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR amd64
> >>>>> Apr  9 07:10:04 <0.2> thor kernel: FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
> >>>>> Apr  9 07:10:04 <0.2> thor kernel: VT(efifb): resolution 2560x1440
> >>>>> Apr  9 07:10:04 <0.2> thor kernel: module zfsctrl already present!
> >>>>> [...]
> >>>>>
> >>>>> The file /var/crash/info.X
> >>>>>
> >>>>> contains:
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>> root@thor:/var/crash # more info.2
> >>>>> Dump header from device: /dev/gpt/swap
> >>>>>  Architecture: amd64
> >>>>>  Architecture Version: 2
> >>>>>  Dump Length: 1095192576
> >>>>>  Blocksize: 512
> >>>>>  Compression: none
> >>>>>  Dumptime: 2023-04-09 11:43:41 +0000
> >>>>>  Hostname: thor.local
> >>>>>  Magic: FreeBSD Kernel Dump
> >>>>>  Version String: FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b: Sun Apr  9 12:01:02 CEST 2023
> >>>>>    root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR
> >>>>>  Panic String: VERIFY(!zil_replaying(zilog, tx)) failed
> >>>>>
> >>>>>  Dump Parity: 2961465682
> >>>>>  Bounds: 2
> >>>>>  Dump Status: good
> >>>>>
> >>>>> Until reconfigured for more debug stuff I do not have more to present.
> >>>>>
> >>>>> I remember now, really scared, that there was a HEADSUP on the list
> >>>>> regarding some serious ZFS problems - I didn't find it right now.
> >>>>>
> >>>>> Thanks in advance,
> >>>>>
> >>>
> >>> That's fallout from the new block cloning feature, adding the author
> >>>
> >>
> >> Thanks.
> >>
> >> As of this moment, all systems with the newest kernel and the new ZFS
> >> option enabled crash; the trigger is mostly in different ZFS datasets.
> >> I guess there is no way back once this faulty option is enabled?
> >
> > I've run a test on a scratch pool here, first without block_cloning
> > enabled, then with. There was no corruption when block_cloning was
> > disabled. There was corruption when block_cloning was enabled.
> >
> > I don't know of any way to revert back nor is there any way to fix or
> > recover the corrupted blocks.
>
> Is the corruption still present after EXDEV fixes?

Yes and no.

Yes, there is corruption when block_cloning is enabled.

There is no corruption when block_cloning is disabled.


-- 
Cheers,
Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
NTP:           <cy@nwtime.org>    Web:  https://nwtime.org

			e^(i*pi)+1=0