From: Cy Schubert
To: Mateusz Guzik
Cc: FreeBSD User, Charlie Li, Cy Schubert, Rick Macklem, Martin Matuska,
    src-committers@freebsd.org, dev-commits-src-all@freebsd.org,
    dev-commits-src-main@freebsd.org
Subject: Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Date: Sun, 09 Apr 2023 15:35:19 -0700
Message-Id: <20230409223519.68D741F3@slippy.cwsent.com>
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all

In a message dated Mon, 10 Apr 2023 00:15:57 +0200, Mateusz Guzik writes:
> On 4/9/23, Mateusz Guzik wrote:
> > On 4/9/23, FreeBSD User wrote:
> >> On Sun, 9 Apr 2023 13:23:05 -0400
> >> Charlie Li wrote:
> >>
> >>> Mateusz Guzik wrote:
> >>> > On 4/9/23, Charlie Li wrote:
> >>> >> I've also started noticing random artefacts and malformed files whilst
> >>> >> building packages with poudriere, causing all sorts of "exec format
> >>> >> error"s, missing .so files due to corruption, data file corruption
> >>> >> causing unintended failure modes, etc. All without block_cloning;
> >>> >> enabling such causes a panic of its own when starting multiple builder
> >>> >> jails at once.
> >>> >>
> >>> >
> >>> > what's the panic?
> >>> >
> >>> manually typed out:
> >>>
> >>> panic: VERIFY(!zil_replaying(zilog, tx)) failed
> >>>
> >>> cpuid = 7
> >>> time = 1681060472
> >>> KDB: stack backtrace:
> >>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe02a05b28a0
> >>> vpanic() at vpanic+0x152/frame 0xfffffe02a05b28f0
> >>> spl_panic() at spl_panic+0x3a/frame 0xfffffe02a05b2950
> >>> zfs_log_clone_range() at zfs_log_clone_range+0x1db/frame 0xfffffe02a05b29e0
> >>> zfs_clone_range() at zfs_clone_range+0xae2/frame 0xfffffe02a05b2bc0
> >>> zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0xff/frame 0xfffffe02a05b2c40
> >>> vn_copy_file_range() at vn_copy_file_range+0x115/frame 0xfffffe02a05b2ce0
> >>> kern_copy_file_range() at kern_copy_file_range+0x34e/frame 0xfffffe02a05b2db0
> >>> sys_copy_file_range() at sys_copy_file_range+0x78/frame 0xfffffe02a05b2e00
> >>> amd64_syscall() at amd64_syscall+0x148/frame 0xfffffe02a05b2f30
> >>> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe02a05b2f30
> >>> --- syscall (569, FreeBSD ELF64, copy_file_range), rip = 0x908d2a, rsp = 0x820c28e68, rbp = 0x820c292b0 ---
> >>> KDB: enter: panic
> >>> [ thread pid 1856 tid 102129 ]
> >>> Stopped at kdb_enter+0x32: movq $0,0x12760f3(%rip)
> >>> db>
> >>>
> >>
> >> I have the same issue (crash on access of several, but random datasets).
> >>
> >> It started with /usr/ports build failures when performing updates or
> >> rebuilding ports. The poudriere host doesn't work anymore: as soon as
> >> ports start building, the hosts (several of them, same OS revision, new
> >> ZFS option enabled) crash. The same happens when building binaries for
> >> a pkg OS distribution.
> >>
> >> That host also reports a ZFS RAIDZ pool as corrupted, out of the blue!
> >> Some files from a poudriere build and a /usr/ports build seem to have
> >> issues with some temporarily created files in the work directory.
> >>
> >> On another host /usr/ports resides on ZFS, and it also crashes when
> >> building/updating ports. On the same host /home also resides on ZFS,
> >> but even when downloading large amounts of email the host seems to be
> >> stable. I have not found out yet what kind of file access triggers the
> >> crash.
> >>
> >
> > I reproduced the VERIFY(!zil_replaying(zilog, tx)) panic. As the
> > backtrace shows it triggers when using copy_file_range, I temporarily
> > patched the kernel to never do block cloning. So far the only package
> > which failed to build was sqlite and it was for a legitimate reason
> > (compiler errored out due to a problem in the code).
> >
>
> ... and got an illegitimate failure:
> strip: file format not recognized
>
> the port builds after retrying
>
> iow there is more breakage.
>
> i don't know if the merge can be easily reverted now, will have to see
> about that

A git revert is the easy part. What about people who have already run
zpool upgrade and, following the revert, are left with read-only zpools?

Personally, I typically avoid enabling new zpool features for the first
few weeks, even months, just in case. But not everyone does this.

People who have already run zpool upgrade will need to back up their
zpools and restore them after upgrading to a FreeBSD with the zfs commits
reverted. Considering that, we may be long past the point of no return.
For me, personally, it won't matter either way. For others? I don't know.

Simply disabling block_cloning, regardless of the zpool setting, might be
a less disruptive solution.

What is the ZFS project issue number?

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
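
For readers unfamiliar with the system call named in the backtrace, here is
a minimal sketch of how a userland program might ask the kernel to copy a
file with copy_file_range(2). It is an illustration only, not code from
this thread: the helper name copy_with_cfr and the chunk size are
hypothetical, and it assumes a Python build that exposes
os.copy_file_range (otherwise it falls back to plain read/write). Whether
such a call ends up in the ZFS clone path seen in the trace depends on the
kernel and the pool's feature settings.

```python
import os


def copy_with_cfr(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path, preferring copy_file_range(2).

    Hypothetical helper for illustration. Returns the number of bytes copied.
    """
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        sfd, dfd = src.fileno(), dst.fileno()
        # Only use the system call if this Python build exposes it.
        use_cfr = hasattr(os, "copy_file_range")
        while True:
            if use_cfr:
                try:
                    # The kernel copies up to `chunk` bytes between the two
                    # descriptors, advancing both file offsets itself.
                    n = os.copy_file_range(sfd, dfd, chunk)
                except OSError:
                    # Call rejected (e.g. unsupported here): fall back and
                    # continue from the current file offsets.
                    use_cfr = False
                    continue
            else:
                buf = os.read(sfd, chunk)
                view = memoryview(buf)
                while view:
                    # os.write may write fewer bytes than requested.
                    written = os.write(dfd, view)
                    view = view[written:]
                n = len(buf)
            if n == 0:  # end of file
                break
            total += n
    return total
```

A caller would use it as copy_with_cfr("in.bin", "out.bin"). The fallback
mirrors what a copy looks like when the kernel never clones blocks, which
is roughly the behaviour Mateusz's temporary kernel patch forces.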