Date: Fri, 4 Feb 2022 23:31:43 GMT
Message-Id: <202202042331.214NVhg4082710@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Alan Somers
Subject: git: 00134a07898f - main - fusefs: require FUSE_NO_OPENDIR_SUPPORT for NFS exporting
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
X-Git-Committer: asomers
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 00134a07898fa807b8a1fcb2596f0e3644143f69

The branch main has been updated by asomers:

URL: https://cgit.FreeBSD.org/src/commit/?id=00134a07898fa807b8a1fcb2596f0e3644143f69

commit 00134a07898fa807b8a1fcb2596f0e3644143f69
Author:     Alan Somers
AuthorDate: 2022-01-03 00:16:09 +0000
Commit:     Alan Somers
CommitDate: 2022-02-04 23:31:05 +0000

    fusefs: require FUSE_NO_OPENDIR_SUPPORT for NFS exporting

    FUSE file systems that do not set FUSE_NO_OPENDIR_SUPPORT do not
    guarantee that d_off will be valid after closing and reopening a
    directory. That conflicts with NFS's statelessness, which results in
    unresolvable bugs when NFS reads large directories, if:

    * The file system _does_ change the d_off field for the last directory
      entry previously returned by VOP_READDIR, or
    * The file system deletes the last directory entry previously seen by
      NFS.

    Rather than doing a poor job of exporting such file systems, it's better
    just to refuse. Even though this is technically a breaking change,
    13.0-RELEASE's NFS-FUSE support was bad enough that an MFC should be
    allowed.

    MFC after:      3 weeks.
    Reviewed by:    rmacklem
    Differential Revision: https://reviews.freebsd.org/D33726
---
 sys/fs/fuse/fuse_internal.c | 84 +++++++++++++++------------------------------
 sys/fs/fuse/fuse_internal.h |  8 ++---
 sys/fs/fuse/fuse_vnops.c    | 50 +++++++++++++++++----------
 3 files changed, 64 insertions(+), 78 deletions(-)

diff --git a/sys/fs/fuse/fuse_internal.c b/sys/fs/fuse/fuse_internal.c
index eb8f1f87d90f..8f5cbb2d86ae 100644
--- a/sys/fs/fuse/fuse_internal.c
+++ b/sys/fs/fuse/fuse_internal.c
@@ -553,7 +553,6 @@ fuse_internal_mknod(struct vnode *dvp, struct vnode **vpp,
 
 int
 fuse_internal_readdir(struct vnode *vp,
     struct uio *uio,
-    off_t startoff,
     struct fuse_filehandle *fufh,
     struct fuse_iov *cookediov,
     int *ncookies,
@@ -562,7 +561,6 @@ fuse_internal_readdir(struct vnode *vp,
 	int err = 0;
 	struct fuse_dispatcher fdi;
 	struct fuse_read_in *fri = NULL;
-	int fnd_start;
 
 	if (uio_resid(uio) == 0)
 		return 0;
@@ -572,18 +570,6 @@ fuse_internal_readdir(struct vnode *vp,
 	 * Note that we DO NOT have a UIO_SYSSPACE here (so no need for p2p
 	 * I/O).
 	 */
-
-	/*
-	 * fnd_start is set non-zero once the offset in the directory gets
-	 * to the startoff. This is done because directories must be read
-	 * from the beginning (offset == 0) when fuse_vnop_readdir() needs
-	 * to do an open of the directory.
-	 * If it is not set non-zero here, it will be set non-zero in
-	 * fuse_internal_readdir_processdata() when uio_offset == startoff.
-	 */
-	fnd_start = 0;
-	if (uio->uio_offset == startoff)
-		fnd_start = 1;
 	while (uio_resid(uio) > 0) {
 		fdi.iosize = sizeof(*fri);
 		fdisp_make_vp(&fdi, FUSE_READDIR, vp, NULL, NULL);
@@ -595,9 +581,8 @@ fuse_internal_readdir(struct vnode *vp,
 		if ((err = fdisp_wait_answ(&fdi)))
 			break;
 
-		if ((err = fuse_internal_readdir_processdata(uio, startoff,
-		    &fnd_start, fri->size, fdi.answ, fdi.iosize, cookediov,
-		    ncookies, &cookies)))
+		if ((err = fuse_internal_readdir_processdata(uio, fri->size,
+		    fdi.answ, fdi.iosize, cookediov, ncookies, &cookies)))
 			break;
 	}
 
@@ -612,8 +597,6 @@ fuse_internal_readdir(struct vnode *vp,
  */
 int
 fuse_internal_readdir_processdata(struct uio *uio,
-    off_t startoff,
-    int *fnd_start,
     size_t reqsize,
     void *buf,
     size_t bufsize,
@@ -624,8 +607,6 @@ fuse_internal_readdir_processdata(struct uio *uio,
 	int err = 0;
 	int oreclen;
 	size_t freclen;
-	int ents_copied = 0;
-	int ents_seen = 0;
 
 	struct dirent *de;
 	struct fuse_dirent *fudge;
@@ -636,7 +617,7 @@ fuse_internal_readdir_processdata(struct uio *uio,
 		return -1;
 	for (;;) {
 		if (bufsize < FUSE_NAME_OFFSET) {
-			err = (ents_seen == 0 || ents_copied > 0) ? -1 : 0;
+			err = -1;
 			break;
 		}
 		fudge = (struct fuse_dirent *)buf;
@@ -647,7 +628,7 @@ fuse_internal_readdir_processdata(struct uio *uio,
 			 * This indicates a partial directory entry at the
 			 * end of the directory data.
 			 */
-			err = (ents_seen == 0 || ents_copied > 0) ? -1 : 0;
+			err = -1;
 			break;
 		}
 #ifdef ZERO_PAD_INCOMPLETE_BUFS
@@ -669,41 +650,32 @@ fuse_internal_readdir_processdata(struct uio *uio,
 			err = -1;
 			break;
 		}
-		ents_seen++;
-		/*
-		 * Don't start to copy the directory entries out until
-		 * the requested offset in the directory is found.
-		 */
-		if (*fnd_start != 0) {
-			fiov_adjust(cookediov, oreclen);
-			bzero(cookediov->base, oreclen);
-
-			de = (struct dirent *)cookediov->base;
-			de->d_fileno = fudge->ino;
-			de->d_off = fudge->off;
-			de->d_reclen = oreclen;
-			de->d_type = fudge->type;
-			de->d_namlen = fudge->namelen;
-			memcpy((char *)cookediov->base + sizeof(struct dirent) -
-			    MAXNAMLEN - 1,
-			    (char *)buf + FUSE_NAME_OFFSET, fudge->namelen);
-			dirent_terminate(de);
-
-			err = uiomove(cookediov->base, cookediov->len, uio);
-			if (err)
+		fiov_adjust(cookediov, oreclen);
+		bzero(cookediov->base, oreclen);
+
+		de = (struct dirent *)cookediov->base;
+		de->d_fileno = fudge->ino;
+		de->d_off = fudge->off;
+		de->d_reclen = oreclen;
+		de->d_type = fudge->type;
+		de->d_namlen = fudge->namelen;
+		memcpy((char *)cookediov->base + sizeof(struct dirent) -
+		    MAXNAMLEN - 1,
+		    (char *)buf + FUSE_NAME_OFFSET, fudge->namelen);
+		dirent_terminate(de);
+
+		err = uiomove(cookediov->base, cookediov->len, uio);
+		if (err)
+			break;
+		if (cookies != NULL) {
+			if (*ncookies == 0) {
+				err = -1;
 				break;
-			if (cookies != NULL) {
-				if (*ncookies == 0) {
-					err = -1;
-					break;
-				}
-				*cookies = fudge->off;
-				cookies++;
-				(*ncookies)--;
 			}
-			ents_copied++;
-		} else if (startoff == fudge->off)
-			*fnd_start = 1;
+			*cookies = fudge->off;
+			cookies++;
+			(*ncookies)--;
+		}
 		buf = (char *)buf + freclen;
 		bufsize -= freclen;
 		uio_setoffset(uio, fudge->off);
diff --git a/sys/fs/fuse/fuse_internal.h b/sys/fs/fuse/fuse_internal.h
index c17eff2acac3..5d852b420366 100644
--- a/sys/fs/fuse/fuse_internal.h
+++ b/sys/fs/fuse/fuse_internal.h
@@ -250,12 +250,12 @@ int fuse_internal_mknod(struct vnode *dvp, struct vnode **vpp,
 struct pseudo_dirent {
 	uint32_t d_namlen;
 };
-int fuse_internal_readdir(struct vnode *vp, struct uio *uio, off_t startoff,
+int fuse_internal_readdir(struct vnode *vp, struct uio *uio,
     struct fuse_filehandle *fufh, struct fuse_iov *cookediov, int *ncookies,
     uint64_t *cookies);
-int fuse_internal_readdir_processdata(struct uio *uio, off_t startoff,
-    int *fnd_start, size_t reqsize, void *buf, size_t bufsize,
-    struct fuse_iov *cookediov, int *ncookies, uint64_t **cookiesp);
+int fuse_internal_readdir_processdata(struct uio *uio, size_t reqsize,
+    void *buf, size_t bufsize, struct fuse_iov *cookediov, int *ncookies,
+    uint64_t **cookiesp);
 
 /* remove */
 
diff --git a/sys/fs/fuse/fuse_vnops.c b/sys/fs/fuse/fuse_vnops.c
index da925b5dcbb5..10d64390123d 100644
--- a/sys/fs/fuse/fuse_vnops.c
+++ b/sys/fs/fuse/fuse_vnops.c
@@ -1846,11 +1846,10 @@ fuse_vnop_readdir(struct vop_readdir_args *ap)
 	struct uio *uio = ap->a_uio;
 	struct ucred *cred = ap->a_cred;
 	struct fuse_filehandle *fufh = NULL;
-	struct fuse_data *mpdata = fuse_get_mpdata(vnode_mount(vp));
+	struct mount *mp = vnode_mount(vp);
 	struct fuse_iov cookediov;
 	int err = 0;
 	uint64_t *cookies;
-	off_t startoff;
 	ssize_t tresid;
 	int ncookies;
 	bool closefufh = false;
@@ -1867,25 +1866,18 @@ fuse_vnop_readdir(struct vop_readdir_args *ap)
 	}
 
 	tresid = uio->uio_resid;
-	startoff = uio->uio_offset;
 	err = fuse_filehandle_get_dir(vp, &fufh, cred, pid);
-	if (err == EBADF && vnode_mount(vp)->mnt_flag & MNT_EXPORTED) {
+	if (err == EBADF && mp->mnt_flag & MNT_EXPORTED) {
+		struct fuse_data *data = fuse_get_mpdata(mp);
+
+		KASSERT(data->dataflags & FSESS_NO_OPENDIR_SUPPORT,
+			("FUSE file systems that don't set "
+			 "FUSE_NO_OPENDIR_SUPPORT should not be exported"));
 		/*
 		 * nfsd will do VOP_READDIR without first doing VOP_OPEN. We
-		 * must implicitly open the directory here
+		 * must implicitly open the directory here.
 		 */
 		err = fuse_filehandle_open(vp, FREAD, &fufh, curthread, cred);
-		if (err == 0 && !(mpdata->dataflags & FSESS_NO_OPEN_SUPPORT)) {
-			/*
-			 * FUSE does not require a directory entry's d_off
-			 * field to be valid outside of the lifetime of the
-			 * directory's FUSE file handle. So we must read the
-			 * directory from the beginning. However, if the file
-			 * system sets FUSE_NO_OPENDIR_SUPPORT, then the d_off
-			 * field will be valid for the lifetime of the dirent.
-			 */
-			uio->uio_offset = 0;
-		}
 		closefufh = true;
 	}
 	if (err)
@@ -1903,7 +1895,7 @@ fuse_vnop_readdir(struct vop_readdir_args *ap)
 #define DIRCOOKEDSIZE FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET + MAXNAMLEN + 1)
 	fiov_init(&cookediov, DIRCOOKEDSIZE);
 
-	err = fuse_internal_readdir(vp, uio, startoff, fufh, &cookediov,
+	err = fuse_internal_readdir(vp, uio, fufh, &cookediov,
 	    &ncookies, cookies);
 
 	fiov_teardown(&cookediov);
@@ -3108,8 +3100,30 @@ fuse_vnop_vptofh(struct vop_vptofh_args *ap)
 	struct vattr va;
 	int err;
 
-	if (!(data->dataflags & FSESS_EXPORT_SUPPORT))
+	if (!(data->dataflags & FSESS_EXPORT_SUPPORT)) {
+		/* NFS requires lookups for "." and ".." */
+		SDT_PROBE2(fusefs, , vnops, trace, 1,
+			"VOP_VPTOFH without FUSE_EXPORT_SUPPORT");
 		return EOPNOTSUPP;
+	}
+	if ((mp->mnt_flag & MNT_EXPORTED) &&
+		!(data->dataflags & FSESS_NO_OPENDIR_SUPPORT))
+	{
+		/*
+		 * NFS is stateless, so nfsd must reopen a directory on every
+		 * call to VOP_READDIR, passing in the d_off field from the
+		 * final dirent of the previous invocation. But without
+		 * FUSE_NO_OPENDIR_SUPPORT, the FUSE protocol does not
+		 * guarantee that d_off will be valid after a directory is
+		 * closed and reopened. So prohibit exporting FUSE file
+		 * systems that don't set that flag.
+		 *
+		 * But userspace NFS servers don't have this problem.
+		 */
+		SDT_PROBE2(fusefs, , vnops, trace, 1,
+			"VOP_VPTOFH without FUSE_NO_OPENDIR_SUPPORT");
+		return EOPNOTSUPP;
+	}
 
 	err = fuse_internal_getattr(vp, &va, curthread->td_ucred, curthread);
 	if (err)
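
For readers who want to see the new export gate in isolation, the two refusals added to fuse_vnop_vptofh can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: the SK_* bit values and the function name sketch_vptofh_check are hypothetical stand-ins for illustration, not the kernel's definitions, and the real code also emits SDT probes and goes on to call fuse_internal_getattr.

```c
#include <assert.h>	/* assert() is handy for callers exercising the sketch */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real FSESS_* constants live in the fusefs sources. */
#define SK_EXPORT_SUPPORT	0x1u	/* stands in for FSESS_EXPORT_SUPPORT */
#define SK_NO_OPENDIR_SUPPORT	0x2u	/* stands in for FSESS_NO_OPENDIR_SUPPORT */

/*
 * Model of the checks at the top of fuse_vnop_vptofh after this commit.
 * dataflags: session flags negotiated with the FUSE daemon.
 * exported:  whether the mount has MNT_EXPORTED set.
 * Returns 0 if a file handle may be generated, EOPNOTSUPP otherwise.
 */
static int
sketch_vptofh_check(uint32_t dataflags, bool exported)
{
	/* NFS needs lookups of "." and "..", so export support is mandatory. */
	if (!(dataflags & SK_EXPORT_SUPPORT))
		return (EOPNOTSUPP);

	/*
	 * A kernel NFS export reopens the directory on every VOP_READDIR,
	 * so d_off must stay valid across close and reopen. Only daemons
	 * that set FUSE_NO_OPENDIR_SUPPORT promise that.
	 */
	if (exported && !(dataflags & SK_NO_OPENDIR_SUPPORT))
		return (EOPNOTSUPP);

	return (0);
}
```

In this model, a daemon that sets only the export flag is still usable through a non-exported mount (for example by a userspace NFS server), but is refused once the mount is exported by the kernel.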