Re: removing support for kernel stack swapping

From: Warner Losh <imp_at_bsdimp.com>
Date: Mon, 03 Jun 2024 00:05:06 UTC
On Sun, Jun 2, 2024, 5:57 PM Mark Johnston <markj@freebsd.org> wrote:

> FreeBSD will, when free pages are scarce, try to swap out the kernel
> stacks (typically 16KB per thread) of sleeping user threads.  I'm told
> that this mechanism was first implemented in BSD for the VAX port and
> that stabilizing it was quite an endeavour.
>
> This feature has wide-ranging implications for code in the kernel.  For
> instance, if a thread allocates a structure on its stack, links it into
> some data structure visible to other threads, and goes to sleep, it must
> use PHOLD to ensure that the stack doesn't get swapped out while
> sleeping.  A missing PHOLD can thus result in a kernel panic, but this
> kind of mistake is very easy to make and hard to catch without thorough
> stress testing.  The kernel stack allocator also requires a fair bit of
> code to implement this feature, and we've had multiple bugs in that
> area, especially in relation to NUMA support.  Moreover, this feature
> will leave threads swapped out after the system has recovered, resulting
> in high scheduling latency once they're ready to run again.
>
> In a very stressed system, it's possible that we can free up something
> like 1MB of RAM using this mechanism.  I argue that this mechanism is
> not worth it on modern systems: it isn't going to make the difference
> between a graceful recovery from memory pressure and a catatonic state
> which forces a reboot.  The complexity and resulting bugs it induces are
> not worth it.
>


+1.

The smallest bootable system for me is like 256MB, and in a system like
that it might save 256k given the number of threads typical in a system
like that...

Warner

> At the BSDCan devsummit I proposed removing support for kernel stack
> swapping and got only positive feedback.  Does anyone here have any
> comments or objections?
>
>