Re: RFC: Should copy_file_range(2) work for shared memory objects?
Date: Wed, 20 Sep 2023 23:52:20 UTC
On Wed, Sep 20, 2023 at 4:47 PM Rick Macklem <rick.macklem@gmail.com> wrote: > > On Wed, Sep 20, 2023 at 4:21 PM Rick Macklem <rick.macklem@gmail.com> wrote: > > > > On Wed, Sep 20, 2023 at 4:09 PM Rick Macklem <rick.macklem@gmail.com> wrote: > > > > > > On Wed, Sep 20, 2023 at 3:07 PM Alan Somers <asomers@freebsd.org> wrote: > > > > > > > > On Wed, Sep 20, 2023 at 3:05 PM Rick Macklem <rick.macklem@gmail.com> wrote: > > > > > > > > > > Right now (as noted by PR#273962) copy_file_range(2) > > > > > fails for shared memory objects because there is no > > > > > vnode (f_vnode == NULL) for them and the code uses > > > > > vnodes (including a file system specific VOP_COPY_FILE_RANGE(9)). > > > > > > > > > > Do you think copy_file_range(2) should work for shared memory objects? > > > > > > > > > > This would require specific handling in kern_copy_file_range() > > > > > to work. I do not think the patch would be a lot of work, but > > > > > I am not familiar with the f_ops and shared memory code. > > > > > > > > > > rick > > > > > > > > This sounds annoying to fix. But I think we ought to. Right now > > > > programmers can assume that copy_file_range will work for every type > > > > of file. We don't document an EOPNOTSUP error code or anything like > > > > that. Does it work on sockets, too? > > > No. I guess I have a different definition of "file" (unless you meant > > > "filedesc"?). I cannot see how a "range is defined for sockets > > > or named pipes or...". It currently checks for a f_vnode, which > > > probably is not enough. (I haven't figured out what path_fileops > > > are, so I do not know if they work?) > > > > > > I can see how it can be implemented for shared memory objects. > > > However, this is going to take a fair amount of work, since they > > > do not use vnodes. > > > I think it goes something like this: > > > - Create a new fileops (f_copy_file_range), since it needs to use > > > the correct range lock variables (in shmfd instead of vnode ones). > > > - Move most of kern_copy_file_range() into vnodeop_copy_file_range() > > > and call f_copy_file_range() from kern_copy_file_range(). > > > - Create a shm_copy_file_range() that does the correct range locking > > > and then copies via uiomove(). > > > This would be a KABI change, so I do not think it could be MFC'd. > > > > > > I think there is a need for copy_file_range(2) to return EOPNOTSUP > > > for cases it will never handle. (I need to test AF_LOCAL sockets, > > > since I think they have vnodes?) > > copy_file_range(2) does currently return EOPNOTSUPP for unix > > domain (AF_LOCAL) sockets. The man page needs to be fixed, > > whether or not support for shared memory objects is added. > > > Oops, my mistake. It was the open(2) that failed with EOPNOTSUPP, > not copy_file_range(2). (I have a simple test program that open(2)s > file names and then uses copy_file_range(2) on the descriptors. > Btw, an open(2) with O_PATH works, but no data is copied. > Not sure if that should be considered correct behaviour? Do you mean that copy_file_range returns 0 for AF_LOCAL sockets? That sounds suspicious. 0 could be interpreted as EoF. Could you please share your test program?