Re: Can not build kernel on 1GB VM

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 17 May 2022 19:40:32 UTC
On 2022-May-17, at 06:56, Michael Wayne <freebsd07@wayne47.com> wrote:

> On Mon, May 16, 2022 at 12:07:18PM -0700, Mark Millard wrote:
>> On 2022-May-16, at 07:37, Michael Wayne <freebsd07@wayne47.com> wrote:
>> 
>>> More info. I am running UFS so the ZFS should not be an issue
>>> 
>>>  % pstat -s
>>>  Device          1K-blocks     Used    Avail Capacity
>>>  /dev/md99         1048576        0  1048576     0%
>> 
>> That may well explain some (or all?) of what is going on:
>> file-system/vnode backed swap spaces are subject to deadlocks.
> 
> 
>> Which suggested patch(s)? Any patches for . . .
> 
> This one line change (patch reduced to only relevant info) that
> was posted earlier on the list.
>   --- a/sys/vm/vm_pageout.c
>   +++ b/sys/vm/vm_pageout.c
>   @@ -1069,7 +1069,7 @@ vm_pageout_laundry_worker(void *arg)
>   -               if (target == 0 && ndirty * isqrt(howmany(nfreed + 1,
>   +               if (target == 0 && ndirty * isqrt(howmany(nfreed,

Why my brain was stuck on "patches for messaging better"
I do not know. Seems obvious now that you show it.
Sorry for the noise.

>> Can you switch to using a swap partition instead of
>> file-system/vnode backed swap space?
> 
> AFAICT, this would require a reinstall as there's no easy way to 
> shrink the existing image. 

You may ultimately have a requirement that the image
containing the UFS file system and swap space be no
more than some specific size, implying that the UFS
file system would need to shrink to make room.

But, for testing, could a copy of the image file for
the VM be made and then be grown larger and then a
freebsd-swap partition added in it for use as a swap
space? Then switch to using the partition to see what
happens? At least we might learn whether that change by
itself is sufficient to avoid the problem you are
having.

18.3 "Resizing and Growing Disks" in:

https://docs.freebsd.org/en/books/handbook/disks/

has some notes about doing such growing and
mentions virtual machines --but uses an ada0
example. It includes adding a swap partition.

It may be best for such a test to have a larger
than intended swap space as an initial test and
then, if things work, to change the swap partition
to be smaller and see if it still works, possibly
bisecting to find what an approximate minimum
size is and then judging what to use from that.
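
As a rough illustration of the bisecting idea above, here is a
minimal Python sketch. The build_ok callback is hypothetical: it
stands in for "resize the swap partition to this many MiB, run the
kernel build, report whether it finished". The sketch also assumes
that success is monotonic in swap size, which may not hold exactly
in practice.

```python
from typing import Callable


def min_working_swap_mib(build_ok: Callable[[int], bool],
                         lo_mib: int, hi_mib: int,
                         step_mib: int = 128) -> int:
    """Bisect for an approximate minimum swap size that still works.

    lo_mib is a size assumed to fail (0 if unknown); hi_mib is the
    larger-than-intended starting size, which must succeed.
    The result is within step_mib of the true threshold, provided
    success really is monotonic in swap size (an assumption).
    """
    if not build_ok(hi_mib):
        raise ValueError("build fails even at the largest size tried")
    while hi_mib - lo_mib > step_mib:
        mid = (lo_mib + hi_mib) // 2
        if build_ok(mid):
            hi_mib = mid   # mid works: try smaller sizes
        else:
            lo_mib = mid   # mid fails: need more swap
    return hi_mib          # smallest size seen to work
```

Each probe is a full kernel build, so a coarse step_mib keeps the
number of builds small. Some margin above the returned size would
still be wise, given the toolchain's memory use keeps growing.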

(Either way, relative to avoiding the existing problem
that you are having, based on my experiences, I would
never recommend file-system/vnode based swap spaces
be used.)

> Summary of events to date:
> - This was installed as a lightweight machine. It will never hit swap in
>  "normal" operation.

The toolchain's memory use has been growing as the
versions progress. Fixed RAM+SWAP sizes over long
periods of time are not realistic as things stand --unless
room was provided for growth up front. Because of
kernel data structure tradeoffs, SWAP sizing is
effectively constrained by RAM size. For 64-bit
contexts, something like 3.8*RAM avoids notices
of mistuning when the swap is added.

(An unfortunate property of the mistuning-message
is that it makes a suggestion that involves other
tradeoffs for other kernel resources as I understand
--without giving any hint that such a tradeoff is
involved.)
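
To make the sizing arithmetic concrete, a small Python sketch
follows. It only encodes the approximate 3.8*RAM figure given
above for 64-bit contexts; the factor is a rule of thumb from
experience, not a constant read from the kernel, and the function
names are made up for illustration.

```python
# Assumption: the ~3.8*RAM rule of thumb for 64-bit contexts.
SWAP_TO_RAM_FACTOR = 3.8


def max_swap_mib(ram_mib: int, factor: float = SWAP_TO_RAM_FACTOR) -> int:
    """Approximate ceiling on total swap (MiB) for a given RAM size (MiB)."""
    if ram_mib <= 0:
        raise ValueError("ram_mib must be positive")
    return int(ram_mib * factor)  # truncate toward zero to stay under the limit


def swap_is_within_rule(ram_mib: int, swap_mib: int) -> bool:
    """True if the proposed swap size stays at or under the rule-of-thumb ceiling."""
    return swap_mib <= max_swap_mib(ram_mib)


if __name__ == "__main__":
    # The 1 GB VM from this thread.
    print(max_swap_mib(1024))
```

For the 1 GB VM this works out to a bit under 3.9 GiB of swap, so
the existing 1 GiB of swap is well inside the limit, while 4 GiB
would not be.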

> - The only reason I added a swap file was that someplace in 11.x building
>  the kernel ran out of memory.

Toolchain again, yep.

> - I only build a custom kernel to get options TCP_SIGNATURE for bird.

Any chance that you could build the kernel outside the
specific VM (in a less constrained context) and somehow
transfer it into the VM?

> - The swap file worked correctly for all of 11.x until I tried to build 12.x.

file-system/vnode backed swap spaces do not fail on
every use but are unreliable. Back when I was isolating
the failures that I was seeing, I found notes from
a wide range of time frames from folks having problems,
as I remember. (Not that I could point you at them any
more.) The problem did not seem to be new, predating 11
for sure.

The toolchain memory usage growth over time has probably
led to reaching failure for file-system backed swap
spaces ever more often over time for ever less
constrained contexts.

> - There likely ought to be a FAQ or handbook page about how to lay
>  out lightweight machines. Having used it since 4.x, 1 GB "seems" like
>  a pretty big machine, yet these issues arose.

If partition based swap spaces prove insufficient,
it may require adding messaging to the kernel that
reports on just which of the 4 conditions is
leading to the kills --and possibly related
information for the one that is happening. That
context might be required to make progress.


===
Mark Millard
marklmi at yahoo.com