From: Vinícius dos Santos Oliveira
Date: Thu, 1 Feb 2024 14:46:10 -0300
Subject: Re: aio_read2() and aio_write2()
To: Alan Somers
Cc: Konstantin Belousov, freebsd-threads@freebsd.org
List-Archive: https://lists.freebsd.org/archives/freebsd-threads

On Wed, Jan 31, 2024 at 15:19, Alan Somers wrote:
> Oh, are you not actually concerned about real files? aio_read and
> aio_write already have special handling for sockets.

There are at least two projects that depend on this patch (the ones
that I'm directly involved with):

* Adding a FreeBSD port for libboost's ASIO. This is just a library, so
  I cannot speak (much) for application developers making use of
  libboost. However, libboost has a very clear design (basically it
  exposes proactors such as what you'd find in Linux's io_uring or
  Windows' IOCP).
* A runtime that I created for Lua developers. It makes use of
  libboost's ASIO. The primary use is files. The runtime needs special
  handling for sandboxes (another feature that it offers), and
  currently FreeBSD has no solutions for the problems that are
  addressed by Konstantin's patch. I mentioned the specifics during our
  conversation already. You may look at the project's documentation if
  you need an introduction to the project:
  https://docs.emilua.org/api/0.6/tutorial/sandboxes.html

> The only sense in which FreeBSD is "special" is that we're better at
> finding the best solutions, rather than the quickest and hackiest.
> That's why we have kqueue instead of epoll, and ifconfig instead of
> ifconfig/iwconfig/wpa_supplicant/ip .

I'm not comparing against epoll nor Linux's ifconfig. Windows IOCP is
old. We've had a lot of time to understand which flaws are Windows'
faults and which are faults of the IOCP design. Windows IOCP hasn't
been the only proactor for async IO. We've accumulated experience with
proactors. POSIX AIO combined with kqueue implements a proactor, so
there's experience even within FreeBSD. Linux's io_uring is just yet
another instance of proactors in the wild.

Solutions shouldn't be rushed, but even if we only reviewed at most a
line per day of the mentioned patch, we've already gone past the
minimum wait time. I don't care if we spend even 10 times longer
reviewing it, but for a patch this simple, I'd like to see something
more than vague requirements that cannot be met. The patch isn't
controversial at all (POSIX AIO is useless by itself, and has been
extended before... many times... with no one complaining). What
specifically do you have against a patch that would solve my problem?
The patch can be changed, but a vague review is not helping anyone.

Meanwhile I cannot resume my experiments on FreeBSD sandboxing to
address real-world problems. Orchestrating fixes across a range of OSS
projects that interact with each other takes years. Blocking a patch
this small and with clear semantics will just add more years to
coordinating the remaining OSS projects' adaptation. If I had received
the same treatment for every patch that I ever contributed, I wouldn't
be able to see the result of my contributions in my own lifetime. This
is not fruitful collaboration.

I'm not an amateur at concurrency nor async IO.

* I wrote the initial patch that fixes a bug in LLVM libc++'s
  condition_variable: https://reviews.llvm.org/D105758
* I fixed an event race that was present in GLib for years:
  https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1960
* I wrote the first successful integration between GLib and Boost.Asio
  (it couldn't be written before I fixed the bug mentioned just above).
* I fixed a security bug in Linux namespace tooling that would allow
  any user to overwrite any root-owned file:
  https://github.com/shadow-maint/shadow/issues/635
* I identified and fixed many bugs in Boost.Asio (a few related to
  running Boost.Asio on FreeBSD). A few of the fixes are still pending
  inclusion, but they'll hit the upstream repo eventually.
* I successfully developed a runtime that exposes fiber concurrency for
  orchestrating async IO within an actor while deploying actor
  concurrency for exploiting scalable parallelism, and that also uses
  the shared-nothing paradigm of the actor model for practical
  capability-based sandboxing. I'm not aware of any other system doing
  this.
* I wrote patches fixing real-world issues for many other OSS projects
  while doing my research.

I have a problem that is addressed by kib's patch. I'd like to see it
solved.

For a patch that is not intrusive at all (it doesn't even add a
syscall), is based on previous practice (POSIX AIO has been extended
before with no one complaining, and we're doing just that... extending
it once again), is this small (very, very few lines and pretty much
safe to apply), and has well-understood semantics (most of the
behaviour was there already), I'd like to see feedback with concrete
points on where the patch should change.

The initial interaction was about gauging requirements, which is
unavoidable. There have been useful exchanges as well, but at some
point the review derailed. I'm not seeing any concrete points on what
should change for acceptance. And out of nowhere a competitor to POSIX
AIO that I do not want to design has been suggested. Feel free to
design it, but don't block POSIX AIO patches while developing your new
subsystem at your own pace. In the meantime there's an existing problem
that could be fixed (and no viable concrete alternative that would fix
the problem that I'm facing has been proposed).

> I would like to see a design that:
> * Is extensible to most file system and networking syscalls, even if
> it doesn't include them right now. At a minimum, it should be able to
> include fspacectl, copy_file_range, truncate, and posix_fallocate.
> Probably open too.

That's not POSIX AIO. Any POSIX AIO extension will be rejected by your
criteria. The current POSIX AIO is already rejected by your criteria.
Should we remove POSIX AIO then? POSIX AIO is practically useless
without extensions. It's only useful in the BSD world, where it has an
extension for kqueue integration. FreeBSD even has other extensions
besides kqueue integration (e.g. aio_readv()).

LIO_FOFFSET won't prevent you from developing and proposing a new
FreeBSD async IO API that competes with POSIX AIO. You can design your
POSIX AIO competitor at your own pace with no rush. In the meantime, I
have a problem to be fixed.

> * Is reviewed by kib and Thomas Munro.

We can get to Thomas Munro once we solve your own requirements. Can you
state requirements that are actually possible to meet? A patch for
POSIX AIO must fit the POSIX AIO mindset. You're asking for a new
subsystem that has nothing to do with POSIX AIO.

> * Has completion notification delivered by kqueue.

Okay.

> * Is race-resistant.

Race-resistant? The filesystem is a global resource. You're
specifically asking for a solution that cannot be developed (and a
requirement that all existing APIs already violate). Can you be more
specific?

> [...] That's what a good asynchronous API looks like

io_uring is a good async API. Among other things, it offers read()
using the current file offset. What part of io_uring became "bad"
because it allows skipping an explicit offset?

And you're too vague when you talk about "races". Again: the filesystem
is a global resource. Even if your process is not creating races, the
interaction between different processes might create races, and there's
nothing to be done about it.

-- 
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
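
For readers following the thread, the difference between an explicit
offset and the current file offset can be sketched as below. This is
only an illustration under stated assumptions: it does not use POSIX
AIO, kqueue, or kib's patch. Python's os.pread() and os.read() stand in
for the explicit-offset and current-offset variants, and demo() is a
hypothetical helper name.

```python
import os
import tempfile


def demo():
    """Contrast explicit-offset reads with reads at the current file offset.

    Illustration only: synchronous os.pread()/os.read() stand in for the
    asynchronous variants discussed in the thread.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)  # rewind after writing

        # Explicit offset: pread() reads at byte 6 and leaves the
        # descriptor's file offset untouched.
        explicit = os.pread(fd, 4, 6)
        offset_after_pread = os.lseek(fd, 0, os.SEEK_CUR)

        # Current offset: read() starts wherever the offset currently is
        # and advances it by the number of bytes read.
        first = os.read(fd, 4)

        # A duplicated descriptor shares the same open file description,
        # so it continues from where the previous read() stopped.
        dup_fd = os.dup(fd)
        try:
            second = os.read(dup_fd, 4)
        finally:
            os.close(dup_fd)

        return explicit, offset_after_pread, first, second
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

The last step shows that the current offset is shared state: any
holder of the same open file description moves it for everyone else,
which is the kind of interaction the caller cannot rule out on its own.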