Re: Can not build kernel on 1GB VM

From: Mark Millard <marklmi_at_yahoo.com>
Date: Fri, 15 Apr 2022 22:41:25 UTC
From: Michael Wayne <freebsd07_at_wayne47.com> 
Date: Fri, 15 Apr 2022 15:17:43 -0400 :

> On Fri, Apr 15, 2022 at 11:40:02AM -0700, Mark Millard wrote:
> > From: Michael Wayne <freebsd07_at_wayne47.com> 
> > Date: Fri, 15 Apr 2022 13:49:53 -0400 :
> > 
> > > I'm trying to upgrade the machine to 12.3 and having swap failures.
> 
> I reduced the swapspace back to 1 GB. It's only ever really hit during 
> builds.
> 
> I set
>    vm.pageout_oom_seq=120
>    vm.pfault_oom_attempts=-1
> 
> There was no improvement. I still see processes getting killed due
> to no swap space despite only 7-8 MB being reported used.  It sorta
> feels like it's not really able to use swap at all.
> 
> Note that everything worked fine on 11.x, this is a new issue on 12.


It is really unfortunate that 12.x does not report which of
the 3 or 4 conditions that can initiate the OOM kill activity
was the specific initiator. (This is mostly fixed in
sufficiently modern versions. 2 of the conditions are very
similar and tend to be treated as 1, leading to 3 instead of
4. The other 2 have detail-specific wording these days.)

I went looking back in time and 12.1-RELEASE-p3 has logic for
vm.pageout_oom_seq (2015) and vm.pfault_oom_attempts (2019-Sep)
from what I can tell.

If using vm.pageout_oom_seq=120 made it take longer before
the OOM activity, then further increases could be appropriate.
I've never had to use more than 120 but I know one person
used something like 1200 on a low-end arm Single Board Computer
(1 GiByte RAM, microSD card media in use for swap, as I
remember). I do not know whether a smaller value, say 600,
would also have worked, however.
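
As a minimal sketch of trying a larger value: the helper below
only prints the commands, one for the running system and one
to keep the setting across reboots. The name oom_seq_plan is
made up for this example, and 240 is just an illustrative
doubling of 120, not a recommendation.

```shell
#!/bin/sh
# Sketch: print (not run) the commands for raising vm.pageout_oom_seq.
# oom_seq_plan is a hypothetical helper name; the value passed in is
# only an illustration. Run the printed commands as root if they look right.
oom_seq_plan() {
    seq="$1"
    # Takes effect immediately on the running system.
    echo "sysctl vm.pageout_oom_seq=${seq}"
    # Appends the setting to /etc/sysctl.conf so it survives a reboot.
    echo "echo 'vm.pageout_oom_seq=${seq}' >> /etc/sysctl.conf"
}

oom_seq_plan 240
```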

That vm.pfault_oom_attempts=-1 did not stop the issue should
eliminate "a thread waited too long to allocate a page" (the
modern message) as the cause, as I understand from looking at
the code. That is despite your report of:

QUOTE
    Apr 15 12:11:26 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 240593, size: 4096
    Apr 15 12:11:35 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 236224, size: 16384
    Apr 15 12:11:37 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 245, size: 12288
    Apr 15 12:11:46 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 240593, size: 4096
    Apr 15 12:11:55 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 236224, size: 16384
    Apr 15 12:11:57 g1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 245, size: 12288
END QUOTE

types of notices.
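
In case it helps to see how often each swap block shows up in
such notices, here is a minimal sketch that tallies them by
blkno. The name count_swap_waits is made up for this example,
and /var/log/messages as the log location is an assumption
about the syslog setup.

```shell
#!/bin/sh
# Sketch: count "indefinite wait buffer" notices per swap block number.
# count_swap_waits is a hypothetical helper; it reads log text on stdin
# and prints "count blkno" pairs, sorted by block number.
count_swap_waits() {
    grep 'swap_pager: indefinite wait buffer' |
        sed -n 's/.*blkno: \([0-9]*\),.*/\1/p' |
        sort -n | uniq -c
}

# Example use (log path is an assumption):
#   count_swap_waits < /var/log/messages
```

In the six lines quoted above, each of the three block numbers
appears twice, 20 seconds apart.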

If vm.pageout_oom_seq=120 made no noticeable increase in how
far things got (how long it ran) before the OOM activity, it
is possible that you are reaching one of the 2 conditions that
are treated as VM_OOM_SWAPZ. In such a case, either increasing
the RAM space available or doing kern.maxswzone tuning may
well be the only options to do the 12.3 build from the
existing 12.1-RELEASE-p3 context.

I've no hint to give for kern.maxswzone tuning.

There is the possibility of creating a 12.3 /boot/kernel/
area from the likes of:

http://ftp3.freebsd.org/pub/FreeBSD/releases/amd64/12.3-RELEASE/kernel.txz

and possibly also creating/replacing the matching
/usr/lib/debug/boot/kernel/ debug information (if you keep
such):

http://ftp3.freebsd.org/pub/FreeBSD/releases/amd64/12.3-RELEASE/kernel-dbg.txz

This would avoid the kernel build.

(Using base.txz the same way is not as reasonable, as it would
replace configuration files with default versions and the like.)
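
A minimal sketch of the kernel.txz route follows. It only
prints the steps so they can be reviewed before being run as
root. The name kernel_plan, the /tmp download location, and
the /boot/kernel.pre-12.3 backup name are all made up for this
example, and the backup directory is assumed not to exist yet.

```shell
#!/bin/sh
# Sketch: print (not run) the steps for installing the release kernel
# from kernel.txz. kernel_plan, /tmp/kernel.txz and the backup directory
# name are hypothetical choices for illustration.
kernel_plan() {
    url="http://ftp3.freebsd.org/pub/FreeBSD/releases/amd64/12.3-RELEASE/kernel.txz"
    # Download the release kernel archive.
    echo "fetch -o /tmp/kernel.txz ${url}"
    # Keep the current kernel so it can still be selected at the loader.
    echo "mv /boot/kernel /boot/kernel.pre-12.3"
    # The archive stores paths relative to /, so extract it there.
    echo "tar -xpf /tmp/kernel.txz -C /"
}

kernel_plan
```

The same pattern would apply to kernel-dbg.txz if the debug
information is kept.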


I've CC'd Mark Johnston, who has sometimes been able to help
with these types of problems and how to avoid/control them.
He is also the one who improved the messaging in more
modern FreeBSD versions.

I include a reference to your original message for Mark J.
in case he has time to look, since the content has been
stripped in the message I'm replying to:

https://lists.freebsd.org/archives/freebsd-hackers/2022-April/001018.html

===
Mark Millard
marklmi at yahoo.com