Re: git: 8ee579abe09e - main - zfs: fall back if block_cloning feature is disabled
- In reply to: Martin Matuska : "Re: git: 8ee579abe09e - main - zfs: fall back if block_cloning feature is disabled"
Date: Tue, 04 Apr 2023 16:14:53 UTC
Rick has posted a patch. Your patch should also be incorporated to work
around other EXDEV errors, but a few lines earlier so it is protected by
the lock.

There were a couple of typos in Rick's patch (a missing keystroke;
s/ojset/objset/). The patch (Rick's null pointer dereference fix, Rick's
copy file range patch plus your copy file range patch) builds fine on
amd64 and i386. Installing and testing it now.

A combination of all three patches is attached. It's compile tested but is
currently being installed and will be tested when install is completed.

In message <98c71e6f-5b48-79f3-e7b0-95d674949624@FreeBSD.org>, Martin
Matuska writes:
> So I am now a little bit confused - what is the consensus? :-)
>
> On 4. 4. 2023 17:26, Rick Macklem wrote:
> > On Tue, Apr 4, 2023 at 7:38 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
> >> CAUTION: This email originated from outside of the University of Guelph.
> >> Do not click links or open attachments unless you recognize the sender
> >> and know the content is safe. If in doubt, forward suspicious emails to
> >> IThelp@uoguelph.ca
> >>
> >>
> >> On 4/4/23, Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> >>> In message <202304041145.334Bjx6l035872@gitrepo.freebsd.org>, Martin
> >>> Matuska writes:
> >>>> The branch main has been updated by mm:
> >>>>
> >>>> URL:
> >>>> https://cgit.FreeBSD.org/src/commit/?id=8ee579abe09ec1fe15c588fc9a08370b83b81cd6
> >>>>
> >>>> commit 8ee579abe09ec1fe15c588fc9a08370b83b81cd6
> >>>> Author: Martin Matuska <mm@FreeBSD.org>
> >>>> AuthorDate: 2023-04-04 11:40:41 +0000
> >>>> Commit: Martin Matuska <mm@FreeBSD.org>
> >>>> CommitDate: 2023-04-04 11:43:34 +0000
> >>>>
> >>>> zfs: fall back if block_cloning feature is disabled
> >>>>
> >>>> If block_cloning is disabled, or other errors from zfs_clone_range()
> >>>> return an EXDEV we should fall back to vn_generic_copy_file_range().
> >>>>
> >>>> This fixes issues when copying files on the same dataset with
> >>>> block_cloning disabled.
> >>>>
> >>>> Upstreamed as pull request to OpenZFS.
> >>>>
> >>>> Reviewed by: Mateusz Guzik <mjguzik@gmail.com>
> >>>> OpenZFS pull request: 14713
> >>>> ---
> >>>> .../openzfs/module/os/freebsd/zfs/zfs_vnops_os.c | 17 ++++++++++-------
> >>>> 1 file changed, 10 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> index 97429b360a36..2cd1d27e37bc 100644
> >>>> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> @@ -6243,13 +6243,6 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
> >>>>      int error;
> >>>>      uint64_t len = *ap->a_lenp;
> >>>>
> >>>> -    /*
> >>>> -     * TODO: If offset/length is not aligned to recordsize, use
> >>>> -     * vn_generic_copy_file_range() on this fragment.
> >>>> -     * It would be better to do this after we lock the vnodes, but then we
> >>>> -     * need something else than vn_generic_copy_file_range().
> >>>> -     */
> >>>> -
> >>>>      /* Lock both vnodes, avoiding risk of deadlock. */
> >>>>      do {
> >>>>          mp = NULL;
> >>>> @@ -6300,6 +6293,16 @@ unlock:
> >>>>      if (mp != NULL)
> >>>>          vn_finished_write(mp);
> >>>>
> >>>> +    /*
> >>>> +     * Fall back if block_cloning feature is disabled
> >>>> +     * or other EXDEV failures from zfs_vnops.c
> >>>> +     */
> >>>> +    if (error == EXDEV) {
> >>>> +        error = vn_generic_copy_file_range(ap->a_invp, ap->a_inoffp,
> >>>> +            ap->a_outvp, ap->a_outoffp, ap->a_lenp, ap->a_flags,
> >>>> +            ap->a_incred, ap->a_outcred, ap->a_fsizetd);
> >>>> +    }
> >>>> +
> >>>>      return (error);
> >>>>  }
> >>>>
> >>>>
> >>> This is too late to fall back. On Rick's suggestion the following makes
> >>> the determination at zfs_freebsd_copy_file_range() entry much earlier.
> >>>
> >> It's not too late, but I agree it is faster to bail out early.
> >>
> >> The proposed patch adds a condition which *differs* from the one in
> >> zfs_clone_range:
> >> if (dmu_objset_spa(inos) != dmu_objset_spa(outos)) {
> >>     zfs_exit_two(inzfsvfs, outzfsvfs, FTAG);
> >>     return (SET_ERROR(EXDEV));
> >> }
> >>
> >> ... meaning with the proposed patch the routine can still fail with
> >> EXDEV, making zfs_freebsd_copy_file_range also do it, which must not
> >> happen.
> > Since VOP_COPY_FILE_RANGE() is only called when invp and outvp
> > are on the same mount point, I don't think this can happen now.
> > However, there is a TO DO comment that suggests a call with invp and
> > outvp on different mount points may be in the future.
> >
> > As such, leaving Martin's patch in so that it calls
> > vn_generic_copy_file_range() when zfs_clone_range() returns EXDEV
> > seems like a good idea to me.
> >
> >> That aside the code looks rather suspicious for the case where target
> >> and source vnode are the same. iow more work is needed here.
> > Definitely needs to be tested. I'll do that later to-day.
> >
> > rick
> >
> >> As the vnode is unlocked, you *can't* safely access zfsvfs_t
> >> *outzfsvfs = ZTOZSB(outzp); in that spot in this manner -- a forced
> >> unmount at the same time can free it.
> >>
> >> iow this patch does *NOT* work.
> >>
> >> With the committed variant the situation is damage controlled enough
> >> that there is time to sort it out correctly.
> >>
> >>> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> index d41821ff67f1..e18dcca58192 100644
> >>> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> @@ -6243,6 +6243,18 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
> >>>      int error;
> >>>      uint64_t len = *ap->a_lenp;
> >>>
> >>> +    znode_t *outzp = VTOZ(ap->a_outvp);
> >>> +    zfsvfs_t *outzfsvfs = ZTOZSB(outzp);
> >>> +    objset_t *outos = outzfsvfs->z_os;
> >>> +
> >>> +    if (!spa_feature_is_enabled(dmu_objset_spa(outos),
> >>> +        SPA_FEATURE_BLOCK_CLONING)) {
> >>> +        error = vn_generic_copy_file_range(ap->a_invp, ap->a_inoffp,
> >>> +            ap->a_outvp, ap->a_outoffp, ap->a_lenp, ap->a_flags,
> >>> +            ap->a_incred, ap->a_outcred, ap->a_fsizetd);
> >>> +        return (error);
> >>> +    }
> >>> +
> >>>      /*
> >>>       * TODO: If offset/length is not aligned to recordsize, use
> >>>       * vn_gene
> >>>
> >>>
> >>> Can you revert your commit and commit this, please.
> >>>
> >>>
> >>> --
> >>> Cheers,
> >>> Cy Schubert <Cy.Schubert@cschubert.com>
> >>> FreeBSD UNIX: <cy@FreeBSD.org> Web: https://FreeBSD.org
> >>> NTP: <cy@nwtime.org> Web: https://nwtime.org
> >>>
> >>> e^(i*pi)+1=0
> >>>
> >>>
> >>>
> >>>
> >>
> >> --
> >> Mateusz Guzik <mjguzik gmail.com>
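
For anyone following the thread, here is a rough user-space sketch of the
fall-back shape being discussed: try the fast path, and if it reports EXDEV,
retry with the generic copy. It is not the kernel code. The names
clone_range(), generic_copy(), copy_range() and fallback_calls are all
hypothetical stand-ins invented for this illustration, and the sketch
deliberately ignores the vnode locking and forced-unmount concerns raised
above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Counts how often the slow path was taken (illustration only). */
static int fallback_calls;

/*
 * Hypothetical stand-in for the cloning fast path. It refuses with
 * EXDEV when the (assumed) feature flag is off, loosely mirroring the
 * behaviour described for zfs_clone_range() in the thread.
 */
static int
clone_range(int cloning_enabled, const char *src, char *dst, size_t *lenp)
{
    if (!cloning_enabled)
        return (EXDEV);
    memcpy(dst, src, *lenp);
    return (0);
}

/* Hypothetical stand-in for the generic copy: a plain byte copy. */
static int
generic_copy(const char *src, char *dst, size_t *lenp)
{
    memcpy(dst, src, *lenp);
    return (0);
}

/*
 * Try the fast path first; fall back to the generic copy only when the
 * fast path reports EXDEV. Any other error is returned unchanged.
 */
static int
copy_range(int cloning_enabled, const char *src, char *dst, size_t *lenp)
{
    int error;

    assert(src != NULL && dst != NULL && lenp != NULL);

    error = clone_range(cloning_enabled, src, dst, lenp);
    if (error == EXDEV) {
        fallback_calls++;
        error = generic_copy(src, dst, lenp);
    }
    return (error);
}
```

Calling copy_range() with cloning_enabled set to 0 should take the generic
path and still return 0, so in this toy version the caller never sees EXDEV.
Whether that property holds in the real code depends on the locking
questions discussed above.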