Re: Can not build kernel on 1GB VM

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 17 May 2022 21:10:06 UTC
On 2022-May-17, at 12:40, Mark Millard <marklmi@yahoo.com> wrote:

> On 2022-May-17, at 06:56, Michael Wayne <freebsd07@wayne47.com> wrote:
> 
>> On Mon, May 16, 2022 at 12:07:18PM -0700, Mark Millard wrote:
>>> On 2022-May-16, at 07:37, Michael Wayne <freebsd07@wayne47.com> wrote:
>>> 
>>>> More info. I am running UFS, so ZFS should not be an issue.
>>>> 
>>>> % pstat -s
>>>> Device          1K-blocks     Used    Avail Capacity
>>>> /dev/md99         1048576        0  1048576     0%
>>> 
>>> That may well explain some (or all?) of what is going on:
>>> file-system/vnode backed swap spaces are subject to deadlocks.
>> 
>> 
>>> Which suggested patch(s)? Any patches for . . .
>> 
>> This one-line change (patch reduced to only relevant info) that
>> was posted earlier on the list.
>>  --- a/sys/vm/vm_pageout.c
>>  +++ b/sys/vm/vm_pageout.c
>>  @@ -1069,7 +1069,7 @@ vm_pageout_laundry_worker(void *arg)
>>  -               if (target == 0 && ndirty * isqrt(howmany(nfreed + 1,
>>  +               if (target == 0 && ndirty * isqrt(howmany(nfreed,
> 
> Why my brain was stuck on "patches for messaging better"
> I do not know. Seems obvious now that you show it.
> Sorry for the noise.
> 
>>> Can you switch to using a swap partition instead of
>>> file-system/vnode backed swap space?
>> 
>> AFAICT, this would require a reinstall as there's no easy way to 
>> shrink the existing image. 
> 
> You may ultimately have a requirement that the image
> containing the UFS file system and swap space be no
> more than some specific size, implying that the UFS
> file system would need to shrink to make room.
> 
> But, for testing, could a copy of the image file for
> the VM be made and then be grown larger and then a
> freebsd-swap partition added in it for use as a swap
> space? Then: switch to using the partition to see what
> happens? At least we might learn whether that change by
> itself is sufficient to avoid the problem you are
> having.
> 
> 18.3 "Resizing and Growing Disks" in:
> 
> https://docs.freebsd.org/en/books/handbook/disks/
> 
> has some notes about doing such growing and
> mentions virtual machines --but uses an ada0
> example. It includes adding a swap partition.
> 
> It may be best for such a test to have a larger
> than intended swap space as an initial test and
> then, if things work, to change the swap partition
> to be smaller and see if it still works, possibly
> bisecting to find what an approximate minimum
> size is and then judging what to use from that.
> 
> (Either way, as far as avoiding the existing problem
> that you are having goes, based on my experiences, I
> would never recommend that file-system/vnode backed
> swap spaces be used.)
> 
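A rough sketch of that test sequence, printed as a plan rather than
executed. The image name, the vtbd0 disk name, the sizes, and the
partition index are hypothetical placeholders, not values from this
thread; only /dev/md99 comes from the pstat output above. It also
assumes a GPT layout (gpart recover is only needed there after the
disk grows).

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# IMG, DISK, SWAP_SIZE and SWAP_INDEX are assumed placeholders.
IMG=${IMG:-vm-copy.img}
DISK=${DISK:-vtbd0}
SWAP_SIZE=${SWAP_SIZE:-3800M}
SWAP_INDEX=${SWAP_INDEX:-N}   # index reported by "gpart show" after the add

plan_swap_partition() {
    # On the host, with the VM copy shut down: grow the image file.
    echo "truncate -s +4G $IMG"
    # In the guest booted from the grown copy: repair the GPT backup header.
    echo "gpart recover $DISK"
    # Add a freebsd-swap partition in the new free space and inspect the result.
    echo "gpart add -t freebsd-swap -a 4k -s $SWAP_SIZE $DISK"
    echo "gpart show $DISK"
    # Enable the partition swap first, then retire the vnode-backed swap.
    echo "swapon /dev/${DISK}p${SWAP_INDEX}"
    echo "swapoff /dev/md99"
}

plan_swap_partition
```

Review the printed plan against the actual disk layout before typing
any of it; making the change persistent would also need an
/etc/fstab entry, which is not shown here.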
>> Summary of events to date:
>> - This was installed as a lightweight machine. It will never hit swap in
>> "normal" operation.
> 
> The toolchain's memory use has been growing as the
> versions progress. Fixed RAM+SWAP sizes over long
> periods of time are not realistic as stands --unless
> room was provided for growth up front. Because of
> kernel data structure tradeoffs, SWAP sizing is
> effectively constrained by RAM size. For 64-bit
> contexts, something like 3.8*RAM avoids notices
> of mistuning when the swap is added.
> 
> (An unfortunate property of the mistuning-message
> is it makes a suggestion that involves other
> tradeoffs for other kernel resources as I understand
> --without giving any hint that such a tradeoff is
> involved.)
> 
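As a small worked example of the 3.8*RAM figure, assuming it is
treated as a simple ceiling (the 3.8 factor is the rule of thumb
stated above, not a kernel constant, and max_swap_kb is a
hypothetical helper name):

```shell
#!/bin/sh
# Rule-of-thumb ceiling for swap size: about 3.8 times RAM.
# Takes RAM in KiB and prints the ceiling in KiB (integer arithmetic).
max_swap_kb() {
    echo $(( $1 * 38 / 10 ))
}

ram_kb=1048576            # the 1 GiB VM from this thread
max_swap_kb "$ram_kb"     # prints 3984588 (about 3.8 GiB)
```

On a running system the RAM figure could instead be taken from
"sysctl -n hw.physmem" (which reports bytes, so divide by 1024
first).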
>> - The only reason I added a swap file was that someplace in 11.x building
>> the kernel ran out of memory.
> 
> Toolchain again, yep.
> 
>> - I only build a custom kernel to get options TCP_SIGNATURE for bird.
> 
> Any chance that you could build the kernel outside the
> specific VM (in a less constrained context) and somehow
> transfer it into the VM?
> 
>> - The swap file worked correctly for all of 11.x until I tried to build 12.x.
> 
> file-system/vnode backed swap spaces do not fail on
> every use but are unreliable. Back when I was isolating
> the failures that I was seeing, I found notes from
> a wide range of time frames from folks having problems,
> as I remember. (Not that I could point you at them any
> more.) The problem did not seem to be new, predating 11
> for sure.
> 
> The toolchain memory usage growth over time has probably
> led to reaching failure for file-system backed swap
> spaces ever more often over time for ever less
> constrained contexts.
> 
>> - There likely ought to be a FAQ or handbook page about how to lay
>> out lightweight machines. Having used it since 4.x, 1 GB "seems" like
>> a pretty big machine, yet these issues arose.
> 
> If partition based swap spaces prove insufficient,
> it may require adding messaging to the kernel that
> reports on just which of the 4 conditions is
> leading to the kills --and possibly related
> information for the one that is happening. That
> context might be required to make progress.
> 
> 

I'll note an old bugzilla about OOM kills:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241048

started on 2019-Oct-04. [12.1-RELEASE was released on
2019-Nov-04 --and ended support on 2021-Jan-31.]

The bugzilla was resolved as fixed on 2020-07-11, but
the criteria are a little indirect and your case might
well have prevented such a classification if it had been
known back then. (Too late now for 12.1-RELEASE-p* .)

If you still can, maybe it is better to skip having
12.1-RELEASE involved in the upgrade sequence?

===
Mark Millard
marklmi at yahoo.com