Re: git: 8ee579abe09e - main - zfs: fall back if block_cloning feature is disabled

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Tue, 04 Apr 2023 14:37:56 UTC
On 4/4/23, Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> In message <202304041145.334Bjx6l035872@gitrepo.freebsd.org>, Martin
> Matuska wr
> ites:
>> The branch main has been updated by mm:
>>
>> URL:
>> https://cgit.FreeBSD.org/src/commit/?id=8ee579abe09ec1fe15c588fc9a08370b
>> 83b81cd6
>>
>> commit 8ee579abe09ec1fe15c588fc9a08370b83b81cd6
>> Author:     Martin Matuska <mm@FreeBSD.org>
>> AuthorDate: 2023-04-04 11:40:41 +0000
>> Commit:     Martin Matuska <mm@FreeBSD.org>
>> CommitDate: 2023-04-04 11:43:34 +0000
>>
>>     zfs: fall back if block_cloning feature is disabled
>>
>>     If block_cloning is disabled, or other errors from zfs_clone_range()
>>     return an EXDEV we should fall back to vn_generic_copy_file_range().
>>
>>     This fixes issues when copying files on the same dataset with
>>     block_cloning disabled.
>>
>>     Upstreamed as pull request to OpenZFS.
>>
>>     Reviewed by:    Mateusz Guzik <mjguzik@gmail.com>
>>     OpenZFS pull request:   14713
>> ---
>>  .../openzfs/module/os/freebsd/zfs/zfs_vnops_os.c        | 17
>> ++++++++++-----
>> --
>>  1 file changed, 10 insertions(+), 7 deletions(-)
>>
>> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
>> b/sys/c
>> ontrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
>> index 97429b360a36..2cd1d27e37bc 100644
>> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
>> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
>> @@ -6243,13 +6243,6 @@ zfs_freebsd_copy_file_range(struct
>> vop_copy_file_range
>> _args *ap)
>>  	int error;
>>  	uint64_t len = *ap->a_lenp;
>>
>> -	/*
>> -	 * TODO: If offset/length is not aligned to recordsize, use
>> -	 * vn_generic_copy_file_range() on this fragment.
>> -	 * It would be better to do this after we lock the vnodes, but then we
>> -	 * need something else than vn_generic_copy_file_range().
>> -	 */
>> -
>>  	/* Lock both vnodes, avoiding risk of deadlock. */
>>  	do {
>>  		mp = NULL;
>> @@ -6300,6 +6293,16 @@ unlock:
>>  	if (mp != NULL)
>>  		vn_finished_write(mp);
>>
>> +	/*
>> +	 * Fall back if block_cloning feature is disabled
>> +	 * or other EXDEV failures from zfs_vnops.c
>> +	 */
>> +	if (error == EXDEV) {
>> +		error = vn_generic_copy_file_range(ap->a_invp, ap->a_inoffp,
>> +			    ap->a_outvp, ap->a_outoffp, ap->a_lenp, ap->a_flags
>> ,
>> +			    ap->a_incred, ap->a_outcred, ap->a_fsizetd);
>> +	}
>> +
>>  	return (error);
>>  }
>>
>>
>
> This is too late to fall back. On Rick's suggestion the following makes the
>
> determination at
> zfs_freebsd_copy_file_range() entry much earlier.
>

It's not too late, but I agree it is faster to bail out early.

The proposed patch adds a condition which *differs* from the one in
zfs_clone_range:
        if (dmu_objset_spa(inos) != dmu_objset_spa(outos)) {
                zfs_exit_two(inzfsvfs, outzfsvfs, FTAG);
                return (SET_ERROR(EXDEV));
        }

... meaning with the proposed patch the routine can still fail with
EXDEV, making zfs_freebsd_copy_file_range also do it, which must not
happen.

That aside the code looks rather suspicious for the case where target
and source vnode are the same. iow more work is needed here.

As the vnode is unlocked, you *can't* safely access zfsvfs_t
*outzfsvfs = ZTOZSB(outzp); in that spot in this manner -- a forced
unmount at the same time can free it.

iow this patch does *NOT* work.

With the committed variant the situation is damage controlled enough
that there is time to sort it out correctly.

> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> index d41821ff67f1..e18dcca58192 100644
> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> @@ -6243,6 +6243,18 @@ zfs_freebsd_copy_file_range(struct
> vop_copy_file_range_args *ap)
>  	int error;
>  	uint64_t len = *ap->a_lenp;
>
> +	znode_t *outzp = VTOZ(ap->a_outvp);
> +	zfsvfs_t *outzfsvfs = ZTOZSB(outzp);
> +	objset_t *outos = outzfsvfs->z_os;
> +
> +        if (!spa_feature_is_enabled(dmu_objset_spa(outos),
> +            SPA_FEATURE_BLOCK_CLONING)) {
> +		error = vn_generic_copy_file_range(ap->a_invp, ap->a_inoffp,
> +			ap->a_outvp, ap->a_outoffp, ap->a_lenp, ap->a_flags,
> +			ap->a_incred, ap->a_outcred, ap->a_fsizetd);
> +                return (error);
> +        }
> +
>  	/*
>  	 * TODO: If offset/length is not aligned to recordsize, use
>  	 * vn_generic_copy_file_range() on this fragment.
>
>
> Can you revert your commit and commit this, please.
>
>
> --
> Cheers,
> Cy Schubert <Cy.Schubert@cschubert.com>
> FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
> NTP:           <cy@nwtime.org>    Web:  https://nwtime.org
>
> 			e^(i*pi)+1=0
>
>
>
>


-- 
Mateusz Guzik <mjguzik gmail.com>