Re: git: 8ee579abe09e - main - zfs: fall back if block_cloning feature is disabled

From: Cy Schubert <Cy.Schubert_at_cschubert.com>
Date: Tue, 04 Apr 2023 20:48:24 UTC
As requested by rpokala, D39418 and D39419 have been submitted.

Again, I am not the author of these. Only the curator.

We can continue our discussion there.


-- 
Cheers,
Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
NTP:           <cy@nwtime.org>    Web:  https://nwtime.org

			e^(i*pi)+1=0


In message <88b2e6c2-94d7-81f8-f85d-9d7a6b7b8d11@FreeBSD.org>, Martin 
Matuska w
rites:
> I agree that these are three individual fixes.
>
> 1.) pass ap->a_outcred instead of ap->a_fsizetd->td_ucred to 
> zfs_clone_range()
> I am ok with this, the way the argument is subsequently used it should 
> be ap->a_outcred which is intended for the write.
>
> 2.) do a vn_generic_copy_file_range() in case of EXDEV
>
> The comment vn_generic_copy_file_range() says:
> /*
>   * Copy a byte range of one file to another.  This function can handle the
>   * case where invp and outvp are on different file systems.
>   * It can also be called by a VOP_COPY_FILE_RANGE() to do the work, if 
> there
>   * is no better file system specific way to do it.
>   */
>
> That is actually our case. zfs_clone_range() exits with EXDEV if:
> - source and destination are not on the same pool
> - the block_cloning feature is not enabled
> - input and output files have a different block size
> - offset and len are not at block boundaries
> - length is not a multiple of block size, with except for the end of file
> - we are trying to clone a block created in the current transaction group
> - we are cloning encrypted data not in the same dataset
>
> IMO we can fallback to vn_generic_copy_file_range() in all of these cases.
>
> As of the locks, we need to run vn_generic_copy_file_range() on unlocked 
> vnodes,
> just look into the function.
> In both fuse_vnops.c and nfs_clvnops.c it does not run on locked vnodes.
> Even the comment from pjd in zfs_vnops_os.c says:
>          /*
>           * TODO: If offset/length is not aligned to recordsize, use
>           * vn_generic_copy_file_range() on this fragment.
>           * It would be better to do this after we lock the vnodes, b
> ut 
> then we
>           * need something else than vn_generic_copy_file_range().
>           */
>
> So IMO it should be at the end after unlock.
>
> 3.) By doing the feature check early, we save locking the input vnode 
> and calling mac_vnode_check_write() and vn_rlimit_fsize() at the cost of 
> checking for the disabled feature twice. Maybe documented skipping of 
> the check in zfs_clone_range()? The code of the early check looks ok to me.
>
> On 4. 4. 2023 20:18, Cy Schubert wrote:
> > In message <CAGudoHHe3FJmW_cEddozQScJcwfbdbbfEn=y+m6wwmzmvEMb-w@mail.gmail.
> c
> > om>
> > , Mateusz Guzik writes:
> >> can you please post a review
> > I could but I didn't write any of it. Rick Macklem and Martin Matuska wrote
> > it. My patch was for discussion only.
> >
> > Martin and Rick, do you mind if I post this as a review. It should probably
> > be two, maybe three separate commits, fixing two different problems.
> >
> >