Re: aio_read2() and aio_write2()

From: Vinícius_dos_Santos_Oliveira <vini.ipsmaker_at_gmail.com>
Date: Thu, 01 Feb 2024 17:46:10 UTC

On Wed, Jan 31, 2024 at 15:19, Alan Somers
<asomers@freebsd.org> wrote:
> Oh, are you not actually concerned about real files?  aio_read and
> aio_write already have special handling for sockets.

There are at least two projects that depend on this patch (the ones
that I'm directly involved with):

* Add a FreeBSD port for libboost's ASIO. This is just a library so I
cannot speak (much) for application developers making use of libboost.
However libboost has a design that is very clear (basically it exposes
proactors such as what you'd find in Linux's io_uring or Windows'
IOCP).

* A runtime that I created for Lua developers. It makes use of
libboost's ASIO. The primary use is files. The runtime needs special
handling for sandboxes (another feature that it offers), and currently
FreeBSD has no solutions for the problems that are addressed by
Konstantin's patch. I mentioned the specifics during our conversation
already. You may look at the project's documentation if you need an
introduction to the project:
https://docs.emilua.org/api/0.6/tutorial/sandboxes.html

> The only sense in which FreeBSD is "special" is that we're better at
> finding the best solutions, rather than the quickest and hackiest.
> That's why we have kqueue instead of epoll, and ifconfig instead of
> ifconfig/iwconfig/wpa_supplicant/ip .

I'm not comparing against epoll or Linux's ifconfig. Windows IOCP is
old. We've had a lot of time to understand which flaws are Windows'
faults and which are faults of IOCP's design. IOCP hasn't been the only
proactor for async IO. We have accumulated experience with proactors.
POSIX AIO combined with kqueue implements a proactor, so there's
experience even within FreeBSD. Linux's io_uring is just yet another
instance of proactors in the wild.
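
A minimal sketch of the proactor shape in plain POSIX AIO: submit the
operation, then collect its completion. This is portable C rather than
FreeBSD-specific code. It waits with aio_suspend() where a FreeBSD
program would normally request completion through kqueue, and
aio_read_at is a hypothetical helper name. Note that the control block
always carries an explicit aio_offset.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: submit one asynchronous read at an explicit
 * offset and block until it completes.
 * Returns the number of bytes read, or -1 with errno set. */
ssize_t aio_read_at(int fd, void *buf, size_t len, off_t offset)
{
    struct aiocb cb;
    const struct aiocb *list[1] = { &cb };
    int err;
    ssize_t n;

    assert(buf != NULL);

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = offset;                     /* always explicit in POSIX AIO */
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;  /* no signal; we wait below */

    /* Submission: returns immediately, the read proceeds in the background. */
    if (aio_read(&cb) != 0)
        return -1;

    /* Completion: sleep until the request is no longer in progress. */
    while (aio_error(&cb) == EINPROGRESS)
        (void)aio_suspend(list, 1, NULL);

    err = aio_error(&cb);
    n = aio_return(&cb);    /* must be called exactly once per request */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}
```

Calling aio_read_at(fd, buf, 5, 6) on a file containing "hello world"
fills buf with "world" and leaves the descriptor's own file offset
untouched.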

Solutions shouldn't be rushed, but even if we only review at most a
line per day of the mentioned patch, we've already gone past the
minimum wait time. I don't care if we even spend 10 times more
reviewing it, but for a patch this simple, I'd like to see something
more than vague requirements that cannot be met. The patch isn't
controversial at all (POSIX AIO is useless by itself, and has been
extended before... many times... with no one complaining). What
specifically do you have against a patch that would solve my problem?
The patch can be changed, but a vague review is not helping anyone.
Meanwhile I cannot resume my experiments on FreeBSD sandboxing to
address real-world problems. Orchestrating fixes across a range of OSS
projects that interact with each other takes years. Blocking a patch
this small and with clear semantics will just add more years to
coordinating the remaining OSS projects. If I had received the same
treatment for every patch that I ever contributed, I wouldn't be able
to see the result of my contributions in my own lifetime. This is not
fruitful collaboration.

I'm not an amateur at concurrency nor async IO.

* I wrote the initial patch that fixes a bug in LLVM libc++'s
condition_variable: https://reviews.llvm.org/D105758
* I fixed an event race that was present in GLib for years:
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1960
* I wrote the first successful integration between GLib and Boost.Asio
(it couldn't be written before I fixed the bug mentioned just above)
* I fixed a security bug in Linux namespace tooling that would allow
any user to overwrite any root-owned file:
https://github.com/shadow-maint/shadow/issues/635
* I identified and fixed many bugs in Boost.Asio (a few related to
running Boost.Asio on FreeBSD). A few of them are still pending for
inclusion, but they'll hit the upstream repo eventually.
* I successfully developed a runtime that exposes fiber concurrency
for orchestrating async IO within an actor while deploying actor
concurrency for exploiting scalable parallelism, and also uses the
shared-nothing paradigm of the actor model for practical
capability-based sandboxing. I'm not aware of any other system doing
this.
* I wrote patches fixing real-world issues for many other OSS projects
while doing my research.

I have a problem that is addressed by kib's patch. I'd like to see it
solved. For a patch that is not intrusive at all (it doesn't even add
a syscall), is based on previous practice (POSIX AIO has been
extended before with no one complaining, and we're doing just that...
extending it once again), is this small (very, very few lines and
pretty much safe to apply), and has well-understood semantics (most of
the behaviour was there already), I'd like to see feedback that has
concrete points on where the patch should change.

The initial interaction was requirement-gauging, which is unavoidable.
There have been useful exchanges as well, but at some point the review
derailed. I'm not seeing any concrete points on what should change for
acceptance. And out of nowhere a competitor to POSIX AIO that I do not
want to design has been suggested. Feel free to design it, but don't
block POSIX AIO patches while developing your new subsystem at your
own pace. In the meantime there's an existing problem that could be
fixed (and no viable concrete alternative that would fix the problem
that I'm facing has been proposed).

> I would like to see a design that:
> * Is extensible to most file system and networking syscalls, even if
> it doesn't include them right now.  At a minimum, it should be able to
> include fspacectl, copy_file_range, truncate, and posix_fallocate.
> Probably open too.

That's not POSIX AIO. Any POSIX AIO extension will be rejected by your
criteria. The current POSIX AIO is already rejected by your
criteria. Should we remove POSIX AIO then?

POSIX AIO is practically useless without extensions. It's only useful
in the BSD world where it has an extension for kqueue integration.
FreeBSD even has other extensions besides kqueue integration (e.g.
aio_readv()).

LIO_FOFFSET won't prevent you from developing and proposing a new
FreeBSD async IO API that competes with POSIX AIO. You can design your
POSIX AIO competitor at your own pace with no rush. In the meantime, I
have a problem to be fixed.

> * Is reviewed by kib and Thomas Munro.

We can get to Thomas Munro once we solve your own requirements. Can
you elaborate requirements that are actually possible to meet? A patch
for POSIX AIO must meet the POSIX AIO mindset. You're asking for a new
subsystem that has nothing to do with POSIX AIO.

> * Has completion notification delivered by kqueue.

Okay.

> * Is race-resistant.

Race-resistant? The filesystem is a global resource. You're
specifically asking for a solution that cannot be developed (and a
requirement that all existing APIs already violate). Can you be more
specific?

> [...] That's what a good asynchronous API looks like

io_uring is a good async API. Among other things, it offers read()
using current file offset. What part of io_uring became "bad" because
it allows skipping an explicit offset?
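
To make the distinction concrete, here is a minimal sketch using the
ordinary synchronous calls (it is not code from kib's patch, and the
helper names are hypothetical). read() consumes and advances the
descriptor's current file offset, while pread() takes an explicit
offset and leaves the current one alone. Today's POSIX AIO only offers
the second behaviour.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: report the descriptor's current file offset. */
off_t current_offset(int fd)
{
    return lseek(fd, 0, SEEK_CUR);
}

/* Read at the current file offset; the offset advances by the
 * number of bytes read. */
ssize_t read_sequential(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    return read(fd, buf, len);
}

/* Read at an explicit offset; the current file offset is not changed. */
ssize_t read_explicit(int fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);
    return pread(fd, buf, len, offset);
}
```

With a file containing "abcdef" and the offset at 0, a 3-byte
read_sequential() returns "abc" and moves the offset to 3; a following
3-byte read_explicit() at offset 0 returns "abc" again and the offset
stays at 3.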

And you're too vague when you talk about "races". Again: the
filesystem is a global resource. Even if your process is not creating
races, the interaction between different processes might create races,
and there's nothing to be done about that.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/