posix_fallocate on ZFS
John Baldwin
jhb at freebsd.org
Mon Feb 12 18:47:21 UTC 2018
On Saturday, February 10, 2018 01:46:33 PM Garrett Wollman wrote:
> In article
> <CAOtMX2jZr_kvJgOZWeiB-AZ3-7-uUu+UQ3P0nKhGZ0eNRzwMOQ at mail.gmail.com>,
> asomers at freebsd.org writes:
>
> >On Sat, Feb 10, 2018 at 10:28 AM, Willem Jan Withagen <wjw at digiware.nl>
> >wrote:
>
> >> Is there any expectation that this is going to be fixed in the near future?
>
> >No. It's fundamentally impossible to support posix_fallocate on a COW
> >filesystem like ZFS. Ceph should be taught to ignore an EINVAL result,
> >since the system call is merely advisory.
>
> I don't think it's true that this is _fundamentally_ impossible. What
> the standard requires would in essence be a per-object refreservation.
> ZFS supports refreservation, obviously, but not on a per-object basis.
> Furthermore, there are mechanisms to preallocate blocks for things
> like dumps. So it *could* be done (as in, the concept is there), but
> it may not be practical. (And ultimately, there are ways in which the
> administrator might manage the system that would defeat the desired
> effect, but that's out of the standard's scope.) Given the semantic
> mismatch, though, I suspect it's unreasonable to expect anyone to
> prioritize implementation of such a feature.
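For what it's worth, the caller-side change suggested above (treat EINVAL
as "preallocation not available" and carry on) could look roughly like the
sketch below. The helper names are made up for illustration, and treating
EOPNOTSUPP the same way is an assumption on my part. Whether ignoring the
error is acceptable depends on whether the application actually relies on
the space guarantee.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper: decide whether an error number returned by
 * posix_fallocate() should abort the caller.  EINVAL is what this
 * thread reports for ZFS; EOPNOTSUPP is included on the assumption
 * that other filesystems may report lack of support that way.
 */
static int
prealloc_error_is_fatal(int error)
{
	return (error != 0 && error != EINVAL && error != EOPNOTSUPP);
}

/*
 * Hypothetical wrapper: try to preallocate, but carry on if the
 * filesystem cannot do it.  posix_fallocate() returns the error
 * number directly and does not set errno.  EINVAL is also returned
 * for bad offset/len arguments, so those are checked up front to
 * keep a caller bug from being silently ignored.
 */
static int
try_preallocate(int fd, off_t offset, off_t len)
{
	int error;

	assert(offset >= 0 && len > 0);
	error = posix_fallocate(fd, offset, len);
	return (prealloc_error_is_fatal(error) ? error : 0);
}
```

A caller would use try_preallocate(fd, 0, size) and only bail out on a
non-zero return, which would still cover cases such as ENOSPC or EBADF.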
I don't think posix_fallocate() can be compatible with COW. Suppose you
do reserve a fixed set of blocks. That guarantees the first write to each
block has a place to land, but not a later overwrite, since COW has to put
the new data in a fresh block. You'd either have to reserve another block
each time you wrote to one to keep the reservation intact, or have a way
to mark a file as not COW. The first case isn't really any better than
not using posix_fallocate() at all, as writes are still required to
allocate blocks. The second seems fraught with peril as well if the
application expects the non-COW'd file to stay in sync with other files
in the system, since presumably non-COW'd files couldn't be snapshotted,
etc.
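
To make the overwrite problem concrete, here is a toy model of a COW pool
with a per-file reservation. It is only a sketch: the struct and function
names are hypothetical, and it has nothing to do with how ZFS's allocator
really works. The one property it models is that a COW write needs a fresh
block before the old copy can be released.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy pool: just counts blocks, no real storage behind it. */
struct toy_pool {
	int free_blocks;	/* free blocks not reserved for anyone */
	int reserved;		/* blocks still held back for our file */
};

/* Set aside n blocks up front, as a per-file preallocation would. */
static bool
toy_reserve(struct toy_pool *p, int n)
{
	assert(n >= 0);
	if (p->free_blocks < n)
		return (false);
	p->free_blocks -= n;
	p->reserved += n;
	return (true);
}

/*
 * COW write of one logical block.  The new data always goes to a fresh
 * physical block, taken from the reservation first and from the pool
 * otherwise.  The old copy is released only after the new one has a
 * home, and only if nothing else (e.g. a snapshot) still pins it.
 */
static bool
toy_cow_write(struct toy_pool *p, bool overwrite, bool old_pinned)
{
	if (p->reserved > 0)
		p->reserved--;
	else if (p->free_blocks > 0)
		p->free_blocks--;
	else
		return (false);		/* would be ENOSPC */
	if (overwrite && !old_pinned)
		p->free_blocks++;
	return (true);
}
```

With a pool of four free blocks, reserving four and then writing each
block once succeeds and uses up the reservation. The next overwrite has
nowhere to go, whether or not a snapshot pins the old copy, which is the
failure the preallocation was supposed to rule out.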
--
John Baldwin
More information about the freebsd-current
mailing list