Re: It's not Rust, it's FreeBSD (and LLVM)

From: David Chisnall <theraven_at_FreeBSD.org>
Date: Tue, 03 Sep 2024 16:37:54 UTC
On 3 Sep 2024, at 16:32, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> 
> And when it does, LLVM, source code we import verbatim from an
> entirely different project, and which no sane person would call
> "Related to FreeBSD", takes up more than three quarters of the
> compile time.

It’s worse than that because, although we import the *source code* verbatim (mostly, occasionally with a few back-ported or not-yet-upstreamed fixes), we don’t import the *build system*.  We replicate a subset of what the upstream build system can do.

>  The only reason we do that, is because we still have that outdated
> "FreeBSD is src" emotional hangup.

I don’t think this is quite the emotional hangup I have.  It doesn’t matter at all to me where the code lives.  If we ripped LLVM out of the src tree and built a package using the code that’s currently in ports with CMake + Ninja + pkg, and made it part of the core distribution, it would still be ‘FreeBSD’, because it’s the POLA boundary.

If I learn how something works in the bit of the system that is FreeBSD, I don’t have to relearn it unless there’s a compelling reason why the old abstractions can no longer reflect the new world (around 9ish, there was a change in how WiFi was configured, for example, but wired Ethernet for IPv4 is still configured the way I learned in 4.x, because it got faster but not qualitatively different).

Even though cc and c++ were gcc in 4.x and are clang now, I still invoke them the same way.  They now support newer things (C++23 and C2x, not just C++98 and C99, hurray!), but the older things still mostly work the same way.

Similarly, anything that’s a library in the bit that is FreeBSD is either marked as private, or has a stable ABI throughout (and, ideally, beyond) a release series (LLVM does not have this property, which is why we ship Clang, lld, lldb, and so on as tools that use LLVM; we don’t ship *LLVM* itself in the base system).  If we ripped contrib out of src and kept the same guarantee from the git clones of the upstream things, that would be fine.

I expect that FreeBSD 15.5’s C/C++ compiler will compile any C/C++ compilation unit that FreeBSD 15.0’s does (with a get-out-of-jail-free card for things that rely on undefined behaviour).  If it doesn’t, I expect that this is something that the project will treat as a bug and work with upstream to fix it.  
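
As a minimal sketch of what the undefined-behaviour exemption covers (an illustration only, not code from the FreeBSD tree; all function names here are hypothetical): the first check relies on signed overflow, which the C standard leaves undefined, so an optimising compiler is entitled to assume it never happens and a compiler update may change what the function returns.  The second check expresses the same intent in well-defined terms and is the kind of code the compatibility expectation applies to.

```c
#include <assert.h>
#include <limits.h>

/*
 * Relies on undefined behaviour: if x == INT_MAX, x + 1 overflows.
 * A compiler may assume that never happens and fold this to "return 0".
 */
int will_overflow_ub(int x)
{
	return x + 1 < x;
}

/* Well defined: compares against the limit without ever overflowing. */
int will_overflow_defined(int x)
{
	return x == INT_MAX;
}

/* Exercise only the well-defined version at the boundary. */
void check_overflow_helpers(void)
{
	assert(will_overflow_defined(INT_MAX));
	assert(!will_overflow_defined(0));
}
```

Code in the style of `will_overflow_defined` should behave identically across the compilers of a release series; code in the style of `will_overflow_ub` is the case the get-out-of-jail-free card is for.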

In contrast, if I install some compiler from the ports tree, I expect different (not necessarily weaker) guarantees.  I expect that the version from packages-built-from-ports has the same guarantees as the upstream project.  This may be a 6-monthly release cycle with support for code that has fixed any deprecation warnings from the last two releases.  It may be a never-breaks-backwards-compatibility-ever guarantee, depending on the program / library.

For Rust in FreeBSD, I would expect one of two things:

Option 1: FreeBSD rustc is not a binary that we support for building anything outside of the base system.  Rust code in the base system may need updates to the compiler to be MFC’d at the same time (again, I have no opinions about where the code *lives*: an MFC may be a submodule update, an update to a ports-like recipe, or a full code import).  We have a blessed set of packages.

Option 2: FreeBSD rustc is supported for third-party things.  Minor versions may bring in new ones, but within a release series we expect full backwards compatibility with the exception of things that fix soundness issues in the type checker (if cve-rs stops working within a major release series, that’s a feature not a bug).

I believe Rust now has strong enough guarantees for Option 2 to be feasible (I’m not 100% sure).  But neither option, to me, requires bringing rustc into contrib.  It requires providing a way of building an atomic snapshot of the things that we define as FreeBSD N, and guarantees of compatibility from FreeBSD N.0 to N.M.

Note that some of the stability guarantees are already quite complicated and I think merit longer discussions.  For example, I’m told that some kernel modules broke across the 13.x series (in particular, VirtualBox ones) and needed recompiling.  My understanding was that we went and added padding to data structures before a .0 release to prevent this.  If this isn’t the case, then we need to be more explicit about *which* KBIs are stable / unstable across a major release.  The existence of LinuxKPI also complicates these discussions, because LinuxKPI is a compat layer that tracks an unstable target (whatever Linux is doing this week) and so, by definition, *cannot* have the same stability guarantees as the base system, but it is necessary for supporting most modern GPUs.
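
To make the padding idea concrete, here is a minimal sketch of the general technique (not actual FreeBSD kernel code; `struct widget_v0`, `struct widget_v1`, their fields, and `old_module_get_refs` are hypothetical names invented for illustration).  Spare slots are reserved in the layout shipped with the .0 release, and a later minor release gives one of them a meaning without changing the structure's size or the offsets of existing fields, so a module compiled against the old layout still reads the right bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout shipped in the hypothetical N.0 release, with reserved slots. */
struct widget_v0 {
	uint32_t	w_flags;
	uint32_t	w_refs;
	uint64_t	w_spare[4];	/* reserved, must be zero */
};

/* Layout in a later N.M release: one spare slot now has a meaning. */
struct widget_v1 {
	uint32_t	w_flags;
	uint32_t	w_refs;
	uint64_t	w_deadline;	/* took over w_spare[0] */
	uint64_t	w_spare[3];
};

/* The binary-interface contract: size and existing offsets are unchanged. */
_Static_assert(sizeof(struct widget_v0) == sizeof(struct widget_v1),
    "structure size changed");
_Static_assert(offsetof(struct widget_v0, w_flags) ==
    offsetof(struct widget_v1, w_flags), "w_flags moved");
_Static_assert(offsetof(struct widget_v0, w_refs) ==
    offsetof(struct widget_v1, w_refs), "w_refs moved");
_Static_assert(offsetof(struct widget_v0, w_spare) ==
    offsetof(struct widget_v1, w_deadline), "new field is outside the padding");

/*
 * Stand-in for code in a module built against N.0: it only knows the
 * v0 layout, and copies the bytes it is handed into that layout.
 */
uint32_t old_module_get_refs(const void *obj)
{
	struct widget_v0 view;

	assert(obj != NULL);
	memcpy(&view, obj, sizeof(view));
	return view.w_refs;
}
```

If a change in a minor release made any of the static assertions fail, that would be the kind of breakage that forces modules to be recompiled.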

If anything, some of these stability guarantees are *more* valuable now than they have been for much of the last couple of decades.  The Linux world is struggling with containers because containers incorporate a userland from some distro, but inherit the kernel from the host.  Can you run an Ubuntu 20.04 and 22.04 container on your host?  Maybe.  FreeBSD is in a great position to ship a family of container base images for major releases and get all of the benefits that container platforms provide in terms of sharing from common images (two containers with the same set of base layers can share disk and buffer cache space for the common bits), while still providing the benefits of latest-version-of-third-party-things via ports / packages.  Being able to have a single supported version of a FreeBSD 15 base layer with the core libraries (and a set of additional layers that bring in progressively more things) and expect people to just point at freebsd15:latest as the base and rebuild their containers to pick up the latest bits is great, but depends on the compatibility guarantees that FreeBSD embodies (ABI for libraries, CLI for tools).  

David