per-FIB socket binding

From: Mark Johnston <markj_at_freebsd.org>
Date: Tue, 17 Dec 2024 18:15:07 UTC

Lately I've been working on adding FIB awareness to bind(2) and inpcb lookup.
Below I'll describe the project a bit.  Any feedback/comments/suggestions would
be appreciated.

Today, a TCP or UDP socket can receive connections or datagrams from any FIB.
Suppose a SYN arrives on an interface in FIB 1.  A TCP listening socket attached
to FIB 0 may receive the SYN and create a new connection; the FIB of the new
socket is inherited from the listening socket, so the new connection will also
belong to FIB 0 even though the SYN was associated with FIB 1.  As long as FIB 0
has a route to the SYN's source address, the connection will work.
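For reference, here is a minimal sketch of how a socket gets attached to a FIB
today, using the existing SO_SETFIB socket option on FreeBSD.  The helper name
set_socket_fib() is made up for illustration, and the #else branch is only there
so the sketch compiles on systems without SO_SETFIB.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Attach socket s to routing table (FIB) `fib`.  Returns 0 on success,
 * -1 with errno set on failure.  SO_SETFIB is FreeBSD-specific; on
 * systems that lack it this sketch fails with ENOPROTOOPT.
 * set_socket_fib() is a hypothetical helper, not an existing API.
 */
int
set_socket_fib(int s, int fib)
{
	assert(s >= 0);	/* caller must pass a valid descriptor */
#ifdef SO_SETFIB
	return (setsockopt(s, SOL_SOCKET, SO_SETFIB, &fib, sizeof(fib)));
#else
	(void)fib;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```

A listener would call this before bind(2) and listen(2); as described above,
sockets created from that listener then inherit its FIB.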

For some applications, one may prefer to ensure that the connection is
associated with the FIB of the incoming SYN; if no socket is listening in that
FIB, the connection would be dropped.  We could have a mode where accept() puts
the new socket in the FIB of the incoming SYN, rather than that of the listening
socket, but that doesn't help for connectionless sockets.

Restricting sockets to their own FIB is useful if one has a service with per-FIB
configurations and wants to run multiple instances without having to specify
non-overlapping addresses for them to listen on.  It is also useful if one
simply wants to run a service only in a specific FIB.

To implement this, I propose having per-VNET tunables for TCP, UDP and raw
sockets, with the following effects:
- Multiple sockets can bind to the same addr/port (INADDR_ANY in particular), so
  long as they belong to different FIBs and all are owned by the same user.
- SO_REUSEPORT and SO_REUSEPORT_LB can still be used to share a port among
  sockets in the same FIB.
- When in_pcblookup() goes off to find an inpcb to handle a received packet,
  only inpcbs belonging to the same FIB as the packet will be returned.  If no
  such inpcb exists, the packet is dropped, even if an inpcb in a different FIB
  could handle the packet.
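
The rules above can be modelled with a small userspace toy.  This is only a
sketch of the proposed semantics, not the kernel implementation: struct toy_pcb,
toy_bind(), toy_lookup() and the fib_strict flag are all invented for
illustration, and the SO_REUSEPORT handling is heavily simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an inpcb; not the kernel's struct inpcb. */
struct toy_pcb {
	uint32_t laddr;		/* local address, 0 means INADDR_ANY */
	uint16_t lport;		/* local port */
	int	 fib;		/* FIB the socket belongs to */
	uint32_t uid;		/* owning user */
	bool	 reuseport;	/* simplified SO_REUSEPORT flag */
};

#define	TOY_MAX	16

struct toy_table {
	struct toy_pcb pcbs[TOY_MAX];
	size_t	n;
	bool	fib_strict;	/* hypothetical stand-in for the tunable */
};

static bool
addr_overlap(uint32_t a, uint32_t b)
{
	return (a == 0 || b == 0 || a == b);
}

/* Returns 0 if the binding is accepted, -1 on conflict or full table. */
int
toy_bind(struct toy_table *t, struct toy_pcb cand)
{
	for (size_t i = 0; i < t->n; i++) {
		const struct toy_pcb *p = &t->pcbs[i];

		if (p->lport != cand.lport ||
		    !addr_overlap(p->laddr, cand.laddr))
			continue;
		if (t->fib_strict && p->fib != cand.fib) {
			/* Different FIBs may share addr/port, same owner only. */
			if (p->uid == cand.uid)
				continue;
			return (-1);
		}
		/* Same FIB (or legacy mode): both sides must opt in. */
		if (p->reuseport && cand.reuseport)
			continue;
		return (-1);
	}
	if (t->n == TOY_MAX)
		return (-1);
	t->pcbs[t->n++] = cand;
	return (0);
}

/* Find a pcb for a packet to daddr:dport that arrived in pkt_fib. */
const struct toy_pcb *
toy_lookup(const struct toy_table *t, uint32_t daddr, uint16_t dport,
    int pkt_fib)
{
	const struct toy_pcb *wild = NULL;

	for (size_t i = 0; i < t->n; i++) {
		const struct toy_pcb *p = &t->pcbs[i];

		if (p->lport != dport)
			continue;
		if (t->fib_strict && p->fib != pkt_fib)
			continue;	/* never cross FIBs in strict mode */
		if (p->laddr == daddr)
			return (p);	/* exact address match wins */
		if (p->laddr == 0 && wild == NULL)
			wild = p;	/* remember first wildcard match */
	}
	return (wild);	/* NULL means the packet is dropped */
}
```

With fib_strict set, two INADDR_ANY listeners on the same port in FIBs 0 and 1
can coexist when owned by the same user, and a packet arriving in a FIB with no
listener gets NULL back (dropped).  With it clear, the lookup ignores the FIB,
which matches today's behaviour.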

This would be opt-in behaviour since it can easily break existing applications.
In particular, it'd be easy to lock oneself out of a system if, say, one relies
on being able to ssh in from a non-default FIB.  That said, I do think these
semantics are a bit more intuitive than the default ones.

I've implemented most of this locally; I'm still working on documentation and
test cases, so haven't posted patches for review yet, aside from some
preparatory cleanup and bind(2) test cases.  I aim to have things in review
sometime in January.

Any thoughts/comments?