[Bug 233245] [UFS] Softupdates fails to track dependency between appended data and i_size

Fri Nov 16 02:58:14 UTC 2018

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233245

--- Comment #2 from Conrad Meyer <cem at freebsd.org> ---
Hi Kirk,

Thanks for your prompt reply.

(In reply to Kirk McKusick from comment #1)
> As you note, soft updates does not currently consider it necessary to ensure
> that the new contents of the block be written before increasing the file size
> if it knows that the exposed contents will be zero (as opposed to the random
> data that was in a previously used block where it does ensure that the block is
> written before it can be accessed).

Yep, understood.
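
As a rough illustration of the exposure described in the quote above, here is a toy user-space model. It is not UFS or soft updates code: the one-block "disk", the 16-byte block size, and every ex_ name are assumptions made up for this sketch. It issues the two writes of an append (data and size) in a chosen order and "crashes" after the first one reaches the disk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EX_BLKSIZE 16

/* Toy "disk": one zero-filled data block plus the on-disk file size. */
struct ex_disk {
	char	data[EX_BLKSIZE];
	size_t	size;
};

enum ex_order { EX_SIZE_FIRST, EX_DATA_FIRST };

/*
 * Append len bytes at the current size, issuing the two writes in the
 * given order, and "crash" after the first write reaches the disk.
 * Returns false (and does nothing) if the append would not fit.
 */
static bool
ex_append_then_crash(struct ex_disk *d, const char *buf, size_t len,
    enum ex_order order)
{
	if (d->size > EX_BLKSIZE || len > EX_BLKSIZE - d->size)
		return (false);

	if (order == EX_SIZE_FIRST)
		d->size += len;			/* size update lands first */
	else
		memcpy(d->data + d->size, buf, len);	/* data lands first */

	/* Crash here: the second write never happens. */
	return (true);
}

/* Count bytes inside the on-disk size that are still zero. */
static size_t
ex_visible_zero_bytes(const struct ex_disk *d)
{
	size_t i, n = 0;

	for (i = 0; i < d->size; i++)
		if (d->data[i] == '\0')
			n++;
	return (n);
}
```

With a 4-byte file and a 4-byte append, the size-first order leaves a file of size 8 whose last 4 bytes read back as zeros. The data-first order leaves the file at its old size with no zeros visible.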

> It would be possible to add a requirement that the new data be written before
> the size could be updated,

That is the proposal. :-)

> but it is not clear to me that adding this extra overhead is worthwhile.

Overhead in which sense?  I can imagine a few objections (it makes the code
more complicated; there is some minor additional memory burden; SU dependency
graphs get a little deeper) but maybe you have others in mind I hadn't thought
of.  I don't think there is any additional IO cost (perhaps longer fsync time
on appended files due to the additional ordering barrier?), but I might well be
missing something.

As far as whether any runtime overhead is worthwhile, I think there are two
relevant dimensions.  One, what is the measurable overhead, if any?  (We can't
really answer this one until we have a proof of concept implementation.)  And
two, what is the user willing to accept, both in terms of overhead and append
data fidelity?  I think it's likely a trade-off we might not want to make
unilaterally, but instead leave to the end user.

It should be pretty easy to add it as a mount option or something like that, in
the same vein as "noatime", which lets users disable a feature with too much
overhead.  I don't see any significant barriers to enabling or disabling it
on-the-fly for existing mounts at runtime, either.
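
A minimal sketch of what such a toggle could look like, modelled in user-space C. The flag name EX_MNT_APPENDORDER, the struct, and both helpers are hypothetical; no such UFS mount option exists, and a real implementation would live in the kernel's mount flag handling rather than in code like this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; no such UFS mount option exists today. */
#define EX_MNT_APPENDORDER	0x00000001u

/* Stand-in for a per-mount flags word. */
struct ex_mount {
	uint32_t	flags;
};

/* Turn the hypothetical append-ordering behaviour on or off at runtime. */
static void
ex_set_append_ordering(struct ex_mount *mp, bool enable)
{
	if (enable)
		mp->flags |= EX_MNT_APPENDORDER;
	else
		mp->flags &= ~EX_MNT_APPENDORDER;
}

/* The write path would consult this before adding the extra dependency. */
static bool
ex_append_ordering_enabled(const struct ex_mount *mp)
{
	return ((mp->flags & EX_MNT_APPENDORDER) != 0);
}
```

Because the check is a single flag test on each write, flipping it for an existing mount would only affect appends issued after the change.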

> Also, this case includes overwriting data in the middle of an earlier block in
> the file for which there is nothing that we can do.

Understood — append is /the/ special case of write where this is possible.  But
it is also a fairly common case, and given that we can do it safely, I think we
should aim to do it safely.

(For middle-of-file partial overwrites, we could order data block flushing with
inode mtime update, if we do not already.  That would be a pretty minimal
protection and would not save us from torn writes.  I think we might also be
able to allocate data blocks out-of-place and order full block writes with
their corresponding block pointer or indirect block pointer updates, but that
has significant downsides, such as file fragmentation, and isn't something I'm
interested in right now.)

> Note that you cannot put the test into ffs_write() in the way that you have
> done (which makes the call for any growth in size). It can only be done when
> the growth is such that it does not go out of an existing block allocation or
> is overwriting an earlier part of the file.

The elided portion of ffs_write() between lines 756 and 781 invokes
UFS_BALLOC() to allocate any blocks needed -- so I believe the suggested
location in ffs_write() would be within an existing block allocation.  But
maybe I am mistaken.  I am not attached to the location, it just seemed like a
plausible spot to me.

(There is a smaller separate concern, which is that any full block append is
already fully tracked with allocdirect or allocindirect — we only want the
proposed partial append dependency for the last iteration of the loop in
ffs_write.)
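
The condition in the parenthetical above can be sketched in user-space C. The helper name and its simplifications (no fragments, no holes, one fixed block size) are assumptions for illustration and are not taken from ffs_write(). It walks a write one block at a time and reports whether the final chunk both grows the file and covers less than a full block.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of ffs_write(): split a write into
 * per-block chunks and report whether the last chunk is a partial-block
 * append.  bsize must be nonzero.  Fragments and holes are ignored.
 */
static bool
ex_last_chunk_is_partial_append(uint64_t old_size, uint64_t offset,
    uint64_t resid, uint64_t bsize)
{
	bool partial_append = false;

	while (resid > 0) {
		uint64_t blkoff = offset % bsize;	/* offset within block */
		uint64_t xfersize = bsize - blkoff;	/* room left in block */

		if (xfersize > resid)
			xfersize = resid;

		/*
		 * Recomputed on every pass, so only the outcome of the
		 * last iteration survives the loop.
		 */
		partial_append = (offset + xfersize > old_size) &&
		    (xfersize < bsize);

		offset += xfersize;
		resid -= xfersize;
	}
	return (partial_append);
}
```

Under these assumptions, a 100-byte append to a 5000-byte file with 4096-byte blocks is flagged. An overwrite that stays inside the existing size is not, and neither is a block-aligned append of whole blocks.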
