Re: 13-stable NFS server hang
- In reply to: Garrett Wollman : "Re: 13-stable NFS server hang"
Date: Sun, 03 Mar 2024 21:14:44 UTC
On Sat, Mar 2, 2024 at 9:25 PM Garrett Wollman <wollman@bimajority.org> wrote:
>
> <<On Sat, 2 Mar 2024 23:28:20 -0500, I wrote:
>
> > I believe this explains why vn_copy_file_range sometimes takes much
> > longer than a second: our servers often have lots of data waiting to
> > be written to disk, and if the file being copied was recently modified
> > (and so is dirty), this might take several seconds. I've set
> > vfs.zfs.dmu_offset_next_sync=0 on the server that was hurting the most
> > and am watching to see if we have more freezes.
>
> In case anyone is wondering why this is an issue, it's the combination
> of two factors:
>
> 1) vn_generic_copy_file_range() attempts to preserve holes in the
> source file.

Just fyi, when I was first doing the copy_file_range(2) syscall, the
discussion seemed to conclude that this was a reasonable thing to do.
It is now not so obvious for file systems doing compression, such as ZFS.
It happens that ZFS will no longer use vn_generic_copy_file_range()
when block cloning is enabled and I have no idea what block cloning
does w.r.t. preserving holes.

For non-compression file systems, comparing va_size with va_bytes
should serve as a reasonable hint w.r.t. the file being sparse.
If the file is not sparse, vn_generic_copy_file_range() should not
bother doing SEEK_DATA/SEEK_HOLE. (I had intended to do such a patch,
but I cannot now remember if I did do so. I'll take a look.)
Note that this patch would not affect ZFS, but could improve UFS
performance where vn_generic_copy_file_range() is used to do the copying.

rick

>
> 2) ZFS does automatic hole-punching on write for filesystems where
> compression is enabled. It happens in the same code path as
> compression, checksum generation, and redundant-write suppression, and
> thus does not happen until the dirty blocks are about to be committed
> to disk. So if the file is dirty, ZFS doesn't "know" where the
> then-extant holes are until a sync has completed.
>
> While vn_generic_copy_file_range() has a flag to stop and return
> partial success after a second of copying, this flag does not affect
> sleeps internal to the filesystem, so zfs_holey() can sleep
> indefinitely and vn_generic_copy_file_range() can't do anything about
> it until the sync has already happened.
>
> -GAWollman
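
For illustration only, the va_size versus va_bytes hint discussed above has a
rough userland analogue in st_size versus st_blocks from stat(2). The sketch
below is not the kernel patch and not taken from vn_generic_copy_file_range();
the helper names maybe_sparse() and file_maybe_sparse() are hypothetical, and it
assumes st_blocks is counted in 512-byte units, as it is on FreeBSD.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: treat a file as possibly sparse when the space
 * allocated to it is smaller than its logical size.  Assumes "blocks" is
 * in 512-byte units (true for st_blocks on FreeBSD).  This is only a
 * hint; on a compressing file system it also fires for files that have
 * no holes at all.
 */
static bool
maybe_sparse(off_t size, blkcnt_t blocks)
{
	assert(size >= 0 && blocks >= 0);
	return ((off_t)blocks * 512 < size);
}

/*
 * Hypothetical wrapper around stat(2).  If the file cannot be examined,
 * answer "maybe sparse" so that a caller would still fall back to a
 * SEEK_DATA/SEEK_HOLE scan rather than skip it.
 */
static bool
file_maybe_sparse(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return (true);
	return (maybe_sparse(st.st_size, st.st_blocks));
}
```

A copy loop could call file_maybe_sparse() once up front and skip the
SEEK_DATA/SEEK_HOLE probing when it returns false. As noted above, this only
makes sense on non-compressing file systems such as UFS.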