[Bug 269330] fusefs: data corruption with mmap and either o_direct or fspacectl

From: <bugzilla-noreply_at_freebsd.org>
Date: Sat, 04 Feb 2023 22:30:49 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269330

            Bug ID: 269330
           Summary: fusefs: data corruption with mmap and either o_direct
                    or fspacectl
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: asomers@FreeBSD.org

Similar to bug 269261, it's possible to trigger data corruption in fusefs by
writing via mmap and then doing I/O that bypasses the cache.  The general
pattern is this:

1) A write via mmap creates some dirty pages but doesn't flush them
2) Either a write with O_DIRECT or an fspacectl operation overlaps with the
region from step 1.  fspacectl operates similarly to writes with O_DIRECT in
this regard, because both bypass the cache and go directly to the fuse daemon.
2a) fusefs invalidates the cache for all pages in that region, including pages
that are only partially written or deallocated.  It checks for dirty pages
first, but the B_CACHE bit apparently isn't set for pages dirtied via mmap, so
the check apparently misses them.
3) A subsequent read operation reads zeros from the partially deallocated page,
including regions of that page that shouldn't have been deallocated.
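
The pattern above, for the write case, can be sketched in userland roughly as
follows.  This is only an illustration, not the test case attached to the bug:
the function name mmap_then_write, the WOFF/WLEN values and the 'A'/'B' fill
bytes are all hypothetical, and the fspacectl variant is not shown.  The caller
supplies the extra open flags for the second descriptor, so the same code can
be pointed at an ordinary file system as a sanity check.

```c
#include <sys/types.h>
#include <sys/mman.h>

#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define WOFF 1024	/* hypothetical: write starts inside the first page */
#define WLEN 512	/* hypothetical: write covers only part of that page */

/*
 * Returns 0 if the file reads back as expected, 1 if bytes outside the
 * overwritten range were lost, and -1 if a system call failed.
 */
static int
mmap_then_write(const char *path, int extra_flags)
{
	size_t filesz = 2 * (size_t)sysconf(_SC_PAGESIZE);
	char wbuf[WLEN];
	char *map, *rbuf;
	size_t i;
	int fd, dfd, rc = -1;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return (-1);
	rbuf = malloc(filesz);
	if (rbuf == NULL || ftruncate(fd, (off_t)filesz) != 0) {
		free(rbuf);
		close(fd);
		return (-1);
	}
	map = mmap(NULL, filesz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		free(rbuf);
		close(fd);
		return (-1);
	}

	/* Step 1: dirty both pages through the mapping, without msync(). */
	memset(map, 'A', filesz);

	/* Step 2: overwrite part of the first page via a second descriptor. */
	memset(wbuf, 'B', sizeof(wbuf));
	dfd = open(path, O_WRONLY | extra_flags);
	if (dfd < 0 ||
	    pwrite(dfd, wbuf, sizeof(wbuf), WOFF) != (ssize_t)sizeof(wbuf))
		goto out;

	/* Step 3: read back; only [WOFF, WOFF + WLEN) should have changed. */
	if (pread(fd, rbuf, filesz, 0) != (ssize_t)filesz)
		goto out;
	rc = 0;
	for (i = 0; i < filesz; i++) {
		char want = (i >= WOFF && i < WOFF + WLEN) ? 'B' : 'A';

		if (rbuf[i] != want) {
			rc = 1;
			break;
		}
	}
out:
	if (dfd >= 0)
		close(dfd);
	munmap(map, filesz);
	free(rbuf);
	close(fd);
	unlink(path);
	return (rc);
}
```

Calling mmap_then_write(path, 0) on a local file system should return 0.
Passing O_DIRECT as extra_flags with a path on a fusefs mount corresponds to
the O_DIRECT case described above, where a return value of 1 would indicate
the corruption.  Whether this exact offset and length are enough to trigger it
is an assumption.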

Questions:
1) Why isn't B_CACHE set in pages that were dirtied by mmap?
2) Is there a better way to find dirty pages?
3) If I replace fuse_inval_buf_range with vnode_pager_purge_range, similar to
what's done in zfs_free_range, it fixes the interaction of mmap with fspacectl.
However, vnode_pager_purge_range apparently fails to invalidate entire pages
that were cached by normal (non-mmap) reads.  Why is that?
