From nobody Tue Apr 04 16:18:44 2023
Date: Tue, 4 Apr 2023 09:18:44 -0700
From: Cy Schubert
To: Martin Matuska
Cc: Rick Macklem, Mateusz Guzik, src-committers@freebsd.org,
 dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject: Re: git: 8ee579abe09e - main - zfs: fall back if block_cloning
 feature is disabled
Message-ID: <20230404091844.639cb1c1@slippy>
In-Reply-To: <98c71e6f-5b48-79f3-e7b0-95d674949624@FreeBSD.org>
References: <202304041145.334Bjx6l035872@gitrepo.freebsd.org>
 <20230404141717.B976D31C@slippy.cwsent.com>
 <98c71e6f-5b48-79f3-e7b0-95d674949624@FreeBSD.org>
Organization: KOMQUATS
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="MP_/_7hN69E4dFfh=Qg2zb=GRdp"

--MP_/_7hN69E4dFfh=Qg2zb=GRdp
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline

On Tue, 4 Apr 2023 17:30:25 +0200
Martin Matuska wrote:

> So I am now a little bit confused - what is the consensus? :-)

My exmh email client made a mess of that. Let's try this again.

Rick has posted a patch.
Your patch should also be incorporated to work
around other EXDEV errors, but a few lines earlier so it is protected by
the lock.

There were a couple of typos in Rick's patch (a missing keystroke;
s/ojset/objset/).

The patch (Rick's null pointer dereference fix, Rick's copy file range
patch plus your copy file range patch) builds fine on amd64 and i386.
Installing and testing it now.

A combination of all three patches is attached. It's compile tested but is
currently being installed and will be tested when install is completed.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

			e^(i*pi)+1=0

>
> On 4. 4. 2023 17:26, Rick Macklem wrote:
> > On Tue, Apr 4, 2023 at 7:38 AM Mateusz Guzik wrote:
> >>
> >> On 4/4/23, Cy Schubert wrote:
> >>> In message <202304041145.334Bjx6l035872@gitrepo.freebsd.org>, Martin
> >>> Matuska writes:
> >>>> The branch main has been updated by mm:
> >>>>
> >>>> URL:
> >>>> https://cgit.FreeBSD.org/src/commit/?id=8ee579abe09ec1fe15c588fc9a08370b83b81cd6
> >>>>
> >>>> commit 8ee579abe09ec1fe15c588fc9a08370b83b81cd6
> >>>> Author:     Martin Matuska
> >>>> AuthorDate: 2023-04-04 11:40:41 +0000
> >>>> Commit:     Martin Matuska
> >>>> CommitDate: 2023-04-04 11:43:34 +0000
> >>>>
> >>>>     zfs: fall back if block_cloning feature is disabled
> >>>>
> >>>>     If block_cloning is disabled, or other errors from zfs_clone_range()
> >>>>     return an EXDEV we should fall back to vn_generic_copy_file_range().
> >>>>
> >>>>     This fixes issues when copying files on the same dataset with
> >>>>     block_cloning disabled.
> >>>>
> >>>>     Upstreamed as pull request to OpenZFS.
> >>>>
> >>>>     Reviewed by:    Mateusz Guzik
> >>>>     OpenZFS pull request:   14713
> >>>> ---
> >>>>  .../openzfs/module/os/freebsd/zfs/zfs_vnops_os.c | 17 ++++++++++-------
> >>>>  1 file changed, 10 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> index 97429b360a36..2cd1d27e37bc 100644
> >>>> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> @@ -6243,13 +6243,6 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
> >>>>  	int error;
> >>>>  	uint64_t len = *ap->a_lenp;
> >>>>
> >>>> -	/*
> >>>> -	 * TODO: If offset/length is not aligned to recordsize, use
> >>>> -	 * vn_generic_copy_file_range() on this fragment.
> >>>> -	 * It would be better to do this after we lock the vnodes, but then we
> >>>> -	 * need something else than vn_generic_copy_file_range().
> >>>> -	 */
> >>>> -
> >>>>  	/* Lock both vnodes, avoiding risk of deadlock. */
> >>>>  	do {
> >>>>  		mp = NULL;
> >>>> @@ -6300,6 +6293,16 @@ unlock:
> >>>>  	if (mp != NULL)
> >>>>  		vn_finished_write(mp);
> >>>>
> >>>> +	/*
> >>>> +	 * Fall back if block_cloning feature is disabled
> >>>> +	 * or other EXDEV failures from zfs_vnops.c
> >>>> +	 */
> >>>> +	if (error == EXDEV) {
> >>>> +		error = vn_generic_copy_file_range(ap->a_invp, ap->a_inoffp,
> >>>> +		    ap->a_outvp, ap->a_outoffp, ap->a_lenp, ap->a_flags,
> >>>> +		    ap->a_incred, ap->a_outcred, ap->a_fsizetd);
> >>>> +	}
> >>>> +
> >>>>  	return (error);
> >>>>  }
> >>>>
> >>>>
> >>> This is too late to fall back. On Rick's suggestion the following makes the
> >>> determination at zfs_freebsd_copy_file_range() entry much earlier.
> >>>
> >> It's not too late, but I agree it is faster to bail out early.
> >>
> >> The proposed patch adds a condition which *differs* from the one in
> >> zfs_clone_range:
> >>         if (dmu_objset_spa(inos) != dmu_objset_spa(outos)) {
> >>                 zfs_exit_two(inzfsvfs, outzfsvfs, FTAG);
> >>                 return (SET_ERROR(EXDEV));
> >>         }
> >>
> >> ... meaning with the proposed patch the routine can still fail with
> >> EXDEV, making zfs_freebsd_copy_file_range also do it, which must not
> >> happen.
> > Since VOP_COPY_FILE_RANGE() is only called when invp and outvp
> > are on the same mount point, I don't think this can happen now.
> > However, there is a TO DO comment that suggests a call with invp and
> > outvp on different mount points may be in the future.
> >
> > As such, leaving Martin's patch in so that it calls vn_generic_copy_file_range()
> > when zfs_clone_range() returns EXDEV seems like a good idea to me.
> >
> >> That aside the code looks rather suspicious for the case where target
> >> and source vnode are the same. iow more work is needed here.
> > Definitely needs to be tested. I'll do that later to-day.
> >
> > rick
> >
> >> As the vnode is unlocked, you *can't* safely access zfsvfs_t
> >> *outzfsvfs = ZTOZSB(outzp); in that spot in this manner -- a forced
> >> unmount at the same time can free it.
> >>
> >> iow this patch does *NOT* work.
> >>
> >> With the committed variant the situation is damage controlled enough
> >> that there is time to sort it out correctly.
> >>
> >>> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> index d41821ff67f1..e18dcca58192 100644
> >>> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> @@ -6243,6 +6243,18 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
> >>>  	int error;
> >>>  	uint64_t len = *ap->a_lenp;
> >>>
> >>> +	znode_t *outzp = VTOZ(ap->a_outvp);
> >>> +	zfsvfs_t *outzfsvfs = ZTOZSB(outzp);
> >>> +	objset_t *outos = outzfsvfs->z_os;
> >>> +
> >>> +	if (!spa_feature_is_enabled(dmu_objset_spa(outos),
> >>> +	    SPA_FEATURE_BLOCK_CLONING)) {
> >>> +		error = vn_generic_copy_file_range(ap->a_invp, ap->a_inoffp,
> >>> +		    ap->a_outvp, ap->a_outoffp, ap->a_lenp, ap->a_flags,
> >>> +		    ap->a_incred, ap->a_outcred, ap->a_fsizetd);
> >>> +		return (error);
> >>> +	}
> >>> +
> >>>  	/*
> >>>  	 * TODO: If offset/length is not aligned to recordsize, use
> >>>  	 * vn_generic_copy_file_range() on this fragment.
> >>>
> >>>
> >>> Can you revert your commit and commit this, please.
> >>>
> >>>
> >>> --
> >>> Cheers,
> >>> Cy Schubert
> >>> FreeBSD UNIX: Web: https://FreeBSD.org
> >>> NTP: Web: https://nwtime.org
> >>>
> >>> e^(i*pi)+1=0
> >>>
> >>
> >> --
> >> Mateusz Guzik

--MP_/_7hN69E4dFfh=Qg2zb=GRdp
Content-Type: text/x-patch
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=zfs-jumbo.patch

diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
index 97429b360a36..16e0176be2ff 100644
--- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
+++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
@@ -6242,6 +6242,30 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
 	struct uio io;
 	int error;
 	uint64_t len = *ap->a_lenp;
+	zfsvfs_t *outzfsvfs;
+	objset_t *outos;
+	bool done_outvp;
+
+	mp = NULL;
+	error = vn_start_write(outvp, &mp, V_WAIT);
+	if (error == 0)
+		error = vn_lock(outvp, LK_EXCLUSIVE);
+	done_outvp = true;
+	if (error == 0) {
+		outzfsvfs = ZTOZSB(VTOZ(outvp));
+		outos = outzfsvfs->z_os;
+		if (!spa_feature_is_enabled(dmu_objset_spa(outos),
+		    SPA_FEATURE_BLOCK_CLONING)) {
+			VOP_UNLOCK(outvp);
+			if (mp != NULL)
+				vn_finished_write(mp);
+			error = vn_generic_copy_file_range(ap->a_invp,
+			    ap->a_inoffp, ap->a_outvp, ap->a_outoffp,
+			    ap->a_lenp, ap->a_flags, ap->a_incred,
+			    ap->a_outcred, ap->a_fsizetd);
+			return (error);
+		}
+	}
 
 	/*
 	 * TODO: If offset/length is not aligned to recordsize, use
@@ -6252,27 +6276,29 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
 
 	/* Lock both vnodes, avoiding risk of deadlock. */
 	do {
-		mp = NULL;
-		error = vn_start_write(outvp, &mp, V_WAIT);
+		if (!done_outvp) {
+			mp = NULL;
+			error = vn_start_write(outvp, &mp, V_WAIT);
+			if (error == 0)
+				error = vn_lock(outvp, LK_EXCLUSIVE);
+		}
 		if (error == 0) {
-			error = vn_lock(outvp, LK_EXCLUSIVE);
-			if (error == 0) {
-				if (invp == outvp)
-					break;
-				error = vn_lock(invp, LK_SHARED | LK_NOWAIT);
-				if (error == 0)
-					break;
-				VOP_UNLOCK(outvp);
-				if (mp != NULL)
-					vn_finished_write(mp);
-				mp = NULL;
-				error = vn_lock(invp, LK_SHARED);
-				if (error == 0)
-					VOP_UNLOCK(invp);
-			}
+			if (invp == outvp)
+				break;
+			error = vn_lock(invp, LK_SHARED | LK_NOWAIT);
+			if (error == 0)
+				break;
+			VOP_UNLOCK(outvp);
+			if (mp != NULL)
+				vn_finished_write(mp);
+			mp = NULL;
+			error = vn_lock(invp, LK_SHARED);
+			if (error == 0)
+				VOP_UNLOCK(invp);
 		}
 		if (mp != NULL)
 			vn_finished_write(mp);
+		done_outvp = false;
 	} while (error == 0);
 	if (error != 0)
 		return (error);
@@ -6290,7 +6316,12 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
 		goto unlock;
 
 	error = zfs_clone_range(VTOZ(invp), ap->a_inoffp, VTOZ(outvp),
-	    ap->a_outoffp, &len, ap->a_fsizetd->td_ucred);
+	    ap->a_outoffp, &len, ap->a_outcred);
+	if (error == EXDEV)
+		error = vn_generic_copy_file_range(ap->a_invp,
+		    ap->a_inoffp, ap->a_outvp, ap->a_outoffp,
+		    ap->a_lenp, ap->a_flags, ap->a_incred,
+		    ap->a_outcred, ap->a_fsizetd);
 	*ap->a_lenp = (size_t)len;
 
 unlock:

--MP_/_7hN69E4dFfh=Qg2zb=GRdp--
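
For anyone skimming the thread, the fallback idea being discussed (try the cloning path, and use the generic copy only when cloning reports EXDEV) can be sketched in plain userspace C. This is only an illustration, not the kernel code: copy_fn, clone_copy, generic_copy, copy_with_fallback, cloning_enabled and generic_calls are hypothetical names made up for this sketch, and memcpy() stands in for the real copy work.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical copy backend signature; not a FreeBSD or OpenZFS type. */
typedef int (*copy_fn)(const char *src, char *dst, size_t *lenp);

/* Hypothetical switch standing in for the block_cloning pool feature. */
static int cloning_enabled = 0;

/* Counts how often the generic path ran, for illustration only. */
static int generic_calls = 0;

/* Stand-in for the cloning path: refuses with EXDEV when unavailable. */
static int
clone_copy(const char *src, char *dst, size_t *lenp)
{
	if (!cloning_enabled)
		return (EXDEV);
	memcpy(dst, src, *lenp);
	return (0);
}

/* Stand-in for the generic, always-available copy path. */
static int
generic_copy(const char *src, char *dst, size_t *lenp)
{
	generic_calls++;
	memcpy(dst, src, *lenp);
	return (0);
}

/*
 * Try the fast path first. Only EXDEV triggers the fallback; any other
 * error from the fast path is returned to the caller unchanged.
 */
static int
copy_with_fallback(copy_fn fast, copy_fn slow, const char *src, char *dst,
    size_t *lenp)
{
	int error;

	error = fast(src, dst, lenp);
	if (error == EXDEV)
		error = slow(src, dst, lenp);
	return (error);
}
```

Calling copy_with_fallback(clone_copy, generic_copy, ...) with cloning_enabled set to 0 ends up in generic_copy(); with it set to 1 the generic path is never entered. The sketch leaves out everything the thread is actually arguing about (vnode locking, forced unmount), so it shows only the error-code handling.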