Re: removing support for kernel stack swapping
- In reply to: Warner Losh : "Re: removing support for kernel stack swapping"
Date: Mon, 03 Jun 2024 09:30:29 UTC
On Sun, Jun 02, 2024 at 08:05:06PM -0400, Warner Losh wrote:
> On Sun, Jun 2, 2024, 5:57 PM Mark Johnston <markj@freebsd.org> wrote:
>
> > FreeBSD will, when free pages are scarce, try to swap out the kernel
> > stacks (typically 16KB per thread) of sleeping user threads. I'm told
> > that this mechanism was first implemented in BSD for the VAX port and
> > that stabilizing it was quite an endeavour.
> >
> > This feature has wide-ranging implications for code in the kernel. For
> > instance, if a thread allocates a structure on its stack, links it into
> > some data structure visible to other threads, and goes to sleep, it must
> > use PHOLD to ensure that the stack doesn't get swapped out while
> > sleeping. A missing PHOLD can thus result in a kernel panic, but this
> > kind of mistake is very easy to make and hard to catch without thorough
> > stress testing. The kernel stack allocator also requires a fair bit of
> > code to implement this feature, and we've had multiple bugs in that
> > area, especially in relation to NUMA support. Moreover, this feature
> > will leave threads swapped out after the system has recovered, resulting
> > in high scheduling latency once they're ready to run again.
> >
> > In a very stressed system, it's possible that we can free up something
> > like 1MB of RAM using this mechanism. I argue that this mechanism is
> > not worth it on modern systems: it isn't going to make the difference
> > between a graceful recovery from memory pressure and a catatonic state
> > which forces a reboot. The complexity and resulting bugs it induces is
> > not worth it.
> >
>
> +1.
>
> The smallest bootable system for me is like 256MB, and in a system like
> that it might save 256k given the number of threads typical in a system
> like that...
>
> Warner

I managed to boot on 10 MB of on-chip static RAM (no DDR at all),
including a few MB of mdroot. But now I mostly use DDR2/3, where there
is no way to get less than 32 MB, so 1 MB is not a problem at all.

Ruslan