[Bug 272678] VFS: Incorrect data in read from concurrent write

From: <bugzilla-noreply_at_freebsd.org>
Date: Sun, 23 Jul 2023 13:32:58 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272678

Andrew "RhodiumToad" Gierth <andrew@tao11.riddles.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrew@tao11.riddles.org.uk

--- Comment #1 from Andrew "RhodiumToad" Gierth <andrew@tao11.riddles.org.uk> ---
I analyzed this one based on discussions on IRC with the original poster (and
did the testing on 13.2-stable and 14-current).

The short version is that vn_read thinks it can do the VOP_READ_PGCACHE path
without doing any locking at all, which violates the consistency expected here
(and the OP's test case is taken from an actual application, mariadb). The
problem does not normally manifest with non-tmpfs filesystems because those are
using range locks and the vn_io_fault code path (which tmpfs does not use).

In more detail:

vn_write calls tmpfs_write with the vnode exclusively locked, and on an
appending write, tmpfs_write updates the file length before copying in the new
data.

vn_read calls tmpfs_read_pgcache without any locking, and that reads the file
length (getting the new length). There is a comment here "size cannot become
shorter due to rangelock.", which I think is bogus because there is no
rangelock in effect at this point; further tests would be needed to verify
whether a concurrent truncate could crash it. After getting the length (and in
the race case, that'll be the new length), it copies the data (which is not yet
written in the race case).

-- 
You are receiving this mail because:
You are the assignee for the bug.