List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs
From: Rick Macklem
Date: Wed, 20 Sep 2023 17:30:57 -0700
Subject: Re: RFC: Should copy_file_range(2) work for shared memory objects?
To: Alan Somers
Cc: Freebsd fs

On Wed, Sep 20, 2023 at 4:52 PM Alan Somers wrote:
>
> On Wed, Sep 20, 2023 at 4:47 PM Rick Macklem wrote:
> >
> > On Wed, Sep 20, 2023 at 4:21 PM Rick Macklem wrote:
> > >
> > > On Wed, Sep 20, 2023 at 4:09 PM Rick Macklem wrote:
> > > >
> > > > On Wed, Sep 20, 2023 at 3:07 PM Alan Somers wrote:
> > > > >
> > > > > On Wed, Sep 20, 2023 at 3:05 PM Rick Macklem wrote:
> > > > > >
> > > > > > Right now (as noted by PR#273962) copy_file_range(2)
> > > > > > fails for shared memory objects because there is no
> > > > > > vnode (f_vnode == NULL) for them and the code uses
> > > > > > vnodes (including a file system specific VOP_COPY_FILE_RANGE(9)).
> > > > > >
> > > > > > Do you think copy_file_range(2) should work for shared memory objects?
> > > > > >
> > > > > > This would require specific handling in kern_copy_file_range()
> > > > > > to work. I do not think the patch would be a lot of work, but
> > > > > > I am not familiar with the f_ops and shared memory code.
> > > > > >
> > > > > > rick
> > > > >
> > > > > This sounds annoying to fix.  But I think we ought to.  Right now
> > > > > programmers can assume that copy_file_range will work for every type
> > > > > of file.  We don't document an EOPNOTSUPP error code or anything like
> > > > > that.  Does it work on sockets, too?
> > > > No. I guess I have a different definition of "file" (unless you meant
> > > > "filedesc"?). I cannot see how a "range" is defined for sockets
> > > > or named pipes or...
> > > > It currently checks for an f_vnode, which
> > > > probably is not enough. (I haven't figured out what path_fileops
> > > > are, so I do not know if they work?)
> > > >
> > > > I can see how it can be implemented for shared memory objects.
> > > > However, this is going to take a fair amount of work, since they
> > > > do not use vnodes.
> > > > I think it goes something like this:
> > > > - Create a new fileops (f_copy_file_range), since it needs to use
> > > >   the correct range lock variables (in shmfd instead of vnode ones).
> > > > - Move most of kern_copy_file_range() into vnodeop_copy_file_range()
> > > >   and call f_copy_file_range() from kern_copy_file_range().
> > > > - Create a shm_copy_file_range() that does the correct range locking
> > > >   and then copies via uiomove().
> > > > This would be a KABI change, so I do not think it could be MFC'd.
> > > >
> > > > I think there is a need for copy_file_range(2) to return EOPNOTSUPP
> > > > for cases it will never handle. (I need to test AF_LOCAL sockets,
> > > > since I think they have vnodes?)
> > > copy_file_range(2) does currently return EOPNOTSUPP for unix
> > > domain (AF_LOCAL) sockets. The man page needs to be fixed,
> > > whether or not support for shared memory objects is added.
> > >
> > Oops, my mistake. It was the open(2) that failed with EOPNOTSUPP,
> > not copy_file_range(2). (I have a simple test program that open(2)s
> > file names and then uses copy_file_range(2) on the descriptors.)
> > Btw, an open(2) with O_PATH works, but no data is copied.
> > Not sure if that should be considered correct behaviour?
>
> Do you mean that copy_file_range returns 0 for AF_LOCAL sockets?  That
> sounds suspicious.  0 could be interpreted as EoF.  Could you please
> share your test program?
No, it meant that copy_file_range(2) never got called.

Btw, if copy_file_range(2) starts returning anything other than EINVAL
for inappropriate file descriptors, cp(1) and maybe cat(1) will need to
be fixed.
(cp checks for EINVAL and cat checks for EINVAL and EBADF, to decide if
a non-copy_file_range(2) copy should be done when copy_file_range(2)
fails.)

rick
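
For reference, the fallback that cp(1) and cat(1) are described as doing
above can be sketched roughly as follows. This is only an illustration,
not the actual cp(1) or cat(1) source: copy_fd(), copy_fallback() and
should_fall_back() are hypothetical names, and treating ENOSYS, EXDEV and
EOPNOTSUPP as fallback cases is an assumption added for this sketch, not
something either utility is stated to do.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* Linux/glibc needs this for copy_file_range(); harmless on FreeBSD */
#endif
#include <sys/types.h>
#include <errno.h>
#include <limits.h>
#include <unistd.h>

/* Plain read(2)/write(2) loop, used when copy_file_range(2) is not usable. */
static int
copy_fallback(int infd, int outfd)
{
	char buf[65536];
	ssize_t nr, nw, off;

	while ((nr = read(infd, buf, sizeof(buf))) > 0) {
		/* write(2) may be short, so loop until the buffer is drained. */
		for (off = 0; off < nr; off += nw) {
			nw = write(outfd, buf + off, (size_t)(nr - off));
			if (nw <= 0)
				return (-1);
		}
	}
	return (nr < 0 ? -1 : 0);
}

/*
 * Decide whether an errno from copy_file_range(2) means "use a plain copy".
 * EINVAL and EBADF are the values the message says cp(1)/cat(1) check.
 * The remaining ones are ASSUMPTIONS added for this sketch.
 */
static int
should_fall_back(int error)
{
	return (error == EINVAL || error == EBADF ||
	    error == ENOSYS || error == EXDEV || error == EOPNOTSUPP);
}

/*
 * Hypothetical helper: copy from the current offset of infd to the current
 * offset of outfd until EOF.  Returns 0 on success, -1 on error.
 */
static int
copy_fd(int infd, int outfd)
{
	ssize_t n;

	/* NULL offsets make the kernel use and advance the file offsets. */
	do {
		n = copy_file_range(infd, NULL, outfd, NULL, SSIZE_MAX, 0);
	} while (n > 0);
	if (n == 0)
		return (0);
	if (should_fall_back(errno))
		return (copy_fallback(infd, outfd));
	return (-1);
}
```

A caller that tests only for EINVAL would report an error rather than
falling back if copy_file_range(2) began returning, say, EOPNOTSUPP for
descriptors without a vnode, which is the concern raised above.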