Re: [PATCH] Solaris Doors IPC

From: David Chisnall <theraven_at_FreeBSD.org>
Date: Tue, 01 Feb 2022 17:15:06 UTC

Hi Bojan,

I am *really* excited to see someone working on Doors.  It's a fantastic 
IPC mechanism and I have some code that would love to be able to use 
Doors.  I had an intern explore the minimum viable kernel support in 
Linux for this kind of mechanism a couple of years ago.

This looks as if it's a straight adaptation of the Solaris Doors kernel 
API, which was a localisation of Spring Doors to fit with the abstractions 
in the Solaris kernel.  I wonder if it's worth taking a step back to see 
what shape a FreeBSD localisation of Spring Doors would look like.  In 
particular, the Solaris Doors APIs were designed at a time when Solaris 
was deep in the N:M threading model and a lot of the abstractions (for 
example, the thread pool logic) were designed on this assumption and on 
the capability of hardware with very different tradeoffs to today.

There are a few somewhat separable bits of Doors that it might be worth 
trying to land patches for independently.  Doors provide:

  - A low-latency mechanism for directly waking up another thread.  This 
allows RPCs without scheduler latency and is the critical part of Doors; 
everything else can be implemented with existing primitives.
  - A mechanism for moving arguments to an RPC from one process to 
another.  This involves (at least) one of four mechanisms:
    * Copying the argument registers from the source to the target.
    * Copying small amounts of data through an in-kernel buffer.
    * Page flipping large objects.
    * Duplicating file descriptors from the caller to the callee (I'd 
love to see this extended to support Capsicum and with a revocation 
mechanism by disallowing dup / dup2 on the fd and having the kernel 
close it on return from the Door invocation.  This would allow temporary 
delegation of access to files for the duration of a single request.  For 
example, asking another process to append something to a file and 
guaranteeing that they can't write to the file after they returned).
  - A 'shuttle' mechanism that allows returns through failure, including 
in the case of reentrancy where one process invokes another via a door, 
the second calls back into the first in the same thread, and then 
crashes.  The first thread must be able to unwind to the start of the 
first door call.
  - An IDL for defining Door interfaces and generating the call gates.
  - A mechanism for managing a thread pool.
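
As a point of comparison for the file-descriptor mechanism in the list 
above, here is a rough userspace sketch of what existing primitives 
already give you: passing a descriptor over a UNIX-domain socket with 
SCM_RIGHTS.  This is not code from the patch; send_fd, recv_fd and 
delegate_demo are names made up for illustration, and the demo runs in 
a single process only to keep it short.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control-message buffer with room for exactly one descriptor. */
union fd_ctl {
    struct cmsghdr hdr;                 /* forces correct alignment */
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one descriptor over a connected UNIX-domain socket (hypothetical helper). */
int send_fd(int sock, int fd)
{
    char byte = 0;                      /* one byte of real data carries the fd */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns the new fd or -1 (hypothetical helper). */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    int fd;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* "Caller" delegates the write end of a pipe; "callee" writes through it. */
int delegate_demo(void)
{
    int sv[2], p[2], got;
    char buf[6] = { 0 };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0 || pipe(p) != 0)
        return -1;
    if (send_fd(sv[0], p[1]) != 0 || (got = recv_fd(sv[1])) < 0)
        return -1;
    if (write(got, "hello", 5) != 5 || read(p[0], buf, 5) != 5)
        return -1;
    close(got); close(p[0]); close(p[1]); close(sv[0]); close(sv[1]);
    return strcmp(buf, "hello") == 0 ? 0 : -1;
}
```

Note that with SCM_RIGHTS the receiver keeps the descriptor for as long 
as it likes and can dup it freely.  That is exactly the gap the 
revocation idea above would close.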

Solaris also provided a mechanism for attaching a Door into the 
filesystem namespace.  I believe this interface was intended to become a 
generic mechanism for attaching any anonymous IPC mechanism to the 
filesystem namespace but it was never evolved like this in Solaris.

In a 1:1 threading model and a FreeBSD-like kernel, I'd expect that 
you'd want to move all of the thread-pool management into userspace and 
provide two kernel primitives for doing this:

  - A mechanism for waiting on more than one door (including one from 
another thread in the same process).
  - A mechanism for a thread to be notified that there are pending door 
invocations and that it should create threads to handle them.

The second of these could easily be done via kqueue (and so integrated 
into the main run loop for an event-driven program).  I don't think 
normal fast-path door invocation can go via kqueue and have the 
scheduler behaviour that's necessary for doors to be useful.

For the data transmission, it's not clear what the right separation of 
concerns is.  We now have solid support for anonymous shared memory 
objects.  There isn't a way of moving ownership of pages from anonymous 
memory to a named file descriptor or anything like Linux's sealing 
mechanism on memfd.

For the KDBUS work in Linux, some of this was punted to userspace.  A 
sysctl to tell userspace the size below which copying via the kernel is 
expected to be faster than moving ownership of pages, combined with 
memfd-like sealing would provide one possible implementation without 
needing the page-flipping policy to be in the kernel.  The underlying 
primitive that we wanted when imagining a less error-prone[1] version of 
memfd sealing was a mechanism for transitioning memory between 
MAP_SHARED and MAP_PRIVATE states, such that any write to the pages 
would be local until you discarded your local copy and went back to 
seeing things in the original.  This seems like the same mechanism that 
you'd want for the page-flipping path in Doors and so it would be nice 
to have a general mechanism for it, even if Doors uses it automatically.
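
To make the MAP_SHARED / MAP_PRIVATE semantics concrete, here is a small 
sketch of the closest thing you can do today with plain mmap: take a 
private view of a shared object, scribble on it locally, then throw the 
view away and map again to see the original.  The proposed primitive 
would do that transition in place on an existing mapping; this sketch 
only approximates it with munmap and a fresh mmap.  The function name 
and the temporary file path are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if private writes stayed local and a fresh private view
 * shows the original contents again (hypothetical demo function). */
int private_view_demo(void)
{
    char path[] = "/tmp/door-sketch-XXXXXX";    /* illustrative path only */
    long pagesz = sysconf(_SC_PAGESIZE);
    assert(pagesz > 0);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                               /* keep only the open fd */

    /* Populate the object before mapping it; the rest of the page is zeros. */
    if (ftruncate(fd, pagesz) != 0 || write(fd, "original", 8) != 8)
        return -1;

    char *shared = mmap(NULL, pagesz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *priv = mmap(NULL, pagesz, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (shared == MAP_FAILED || priv == MAP_FAILED)
        return -1;

    /* Copy-on-write: this store changes only our private copy of the page. */
    strcpy(priv, "scribble");
    if (strcmp(shared, "original") != 0)
        return -1;

    /* Discard the local copy and go back to seeing the original. */
    munmap(priv, pagesz);
    priv = mmap(NULL, pagesz, PROT_READ, MAP_PRIVATE, fd, 0);
    if (priv == MAP_FAILED || strcmp(priv, "original") != 0)
        return -1;

    munmap(priv, pagesz);
    munmap(shared, pagesz);
    close(fd);
    return 0;
}
```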

Looking at your patch, I couldn't quite follow how the Door activation 
worked.  It seemed as if this was just using condition variables to wake 
up the receiving end and so required the scheduler to decide to invoke 
the other thread.  Sorry if I missed something (a high-level design doc 
would help here).  This is, in my mind, the most important part of Doors 
to get right: the guarantee from both Spring and Solaris is that the 
invoked thread runs with the caller's scheduler priority (which avoids 
priority inversion) and in the caller's quantum (which avoids scheduler 
latency).  Without this, the RPC can have milliseconds of latency in the 
common case.  I'd suggest starting from that point before trying to build 
a fully featured Doors implementation: what is the minimum thing that 
you need in the FreeBSD kernel to allow a thread to wake up another and 
have it run in the same quantum, with the lowest possible latency?
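
For contrast, here is a userspace sketch of the kind of 
condition-variable handoff described above.  It is not taken from the 
patch; struct cv_door, cv_door_call and cv_door_demo are hypothetical 
names, it supports a single caller only, and the "service" just adds one 
to its argument.  The point is that both wakeups (request and reply) go 
through the scheduler, which is free to run something else first.

```c
#include <assert.h>
#include <pthread.h>

enum { IDLE, REQUEST, REPLY, SHUTDOWN };

/* One-slot "door" built only from a mutex and a condition variable. */
struct cv_door {
    pthread_mutex_t mtx;
    pthread_cond_t cv;
    int state;              /* IDLE, REQUEST, REPLY or SHUTDOWN */
    int arg, result;
};

/* Server thread: sleep until a request arrives, handle it, post the reply. */
static void *cv_door_server(void *p)
{
    struct cv_door *d = p;

    pthread_mutex_lock(&d->mtx);
    for (;;) {
        while (d->state != REQUEST && d->state != SHUTDOWN)
            pthread_cond_wait(&d->cv, &d->mtx);
        if (d->state == SHUTDOWN)
            break;
        d->result = d->arg + 1;         /* the "RPC" body */
        d->state = REPLY;
        pthread_cond_broadcast(&d->cv); /* caller becomes runnable, not running */
    }
    pthread_mutex_unlock(&d->mtx);
    return NULL;
}

/* Caller side: post a request, then sleep until the reply is ready. */
int cv_door_call(struct cv_door *d, int arg)
{
    pthread_mutex_lock(&d->mtx);
    assert(d->state == IDLE);           /* single caller at a time */
    d->arg = arg;
    d->state = REQUEST;
    pthread_cond_broadcast(&d->cv);     /* server becomes runnable, not running */
    while (d->state != REPLY)
        pthread_cond_wait(&d->cv, &d->mtx);
    int r = d->result;
    d->state = IDLE;
    pthread_mutex_unlock(&d->mtx);
    return r;
}

/* Start a server, make one call, shut the server down; returns the reply. */
int cv_door_demo(void)
{
    struct cv_door d = {
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, IDLE, 0, 0
    };
    pthread_t t;

    if (pthread_create(&t, NULL, cv_door_server, &d) != 0)
        return -1;
    int r = cv_door_call(&d, 41);

    pthread_mutex_lock(&d.mtx);
    d.state = SHUTDOWN;
    pthread_cond_broadcast(&d.cv);
    pthread_mutex_unlock(&d.mtx);
    pthread_join(t, NULL);
    return r;
}
```

Nothing here lets the caller donate its priority or the rest of its 
quantum to the server thread, and that is the piece that needs kernel 
support.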

David

[1] The problem with sealing in memfd is that the security depends on 
getting the code in error-handling paths right.  This code is only ever 
tested when you're under attack, so it is probably buggy.

On 30/01/2022 12:14, Bojan Novković wrote:
> Hello everyone!
> 
> I have completed a patch which implements the Solaris Doors IPC 
> mechanism for FreeBSD.
> 
> Since the patch is huge and requires new library code to be usable, I 
> uploaded the diff and the backing code to a git repo [1] on top of 
> opening a differential on Phabricator [2].
> 
> Kind regards,
> Bojan Novković
> 
> [1] https://github.com/bnovkov/freebsd-doors
> 
> [2] https://reviews.freebsd.org/D34097
> 
>