Re: 14.0-CURRENT failed to reclaim memory error in RPi 3B build

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 31 Oct 2022 04:33:18 UTC

On 2022-Oct-30, at 21:00, Mark Millard <marklmi@yahoo.com> wrote:

> On 2022-Oct-30, at 19:47, Archimedes Gaviola <archimedes.gaviola@gmail.com> wrote:
> 
>> On Mon, Oct 31, 2022 at 1:29 AM Mark Millard <marklmi@yahoo.com> wrote:
>> Archimedes Gaviola <archimedes.gaviola_at_gmail.com> wrote on
>> Date: Sun, 30 Oct 2022 13:41:52 UTC :
>> 
>>> I am building a kernel and world in 14.0-CURRENT
>>> https://download.freebsd.org/ftp/snapshots/arm64/aarch64/ISO-IMAGES/14.0/FreeBSD-14.0-CURRENT-arm64-aarch64-RPI-20221027-769b884e2e2-258837.img.xz
>>> with Raspberry Pi 3B (ARM kernel config file and default system
>>> configuration) and compilation breaks due to a "failed to reclaim memory"
>>> error as found in the dmesg.
>>> 
>>> pid 91224 (llvm-tblgen), jid 0, uid 0, was killed: failed to reclaim memory
>>> pid 91131 (make), jid 0, uid 0, was killed: failed to reclaim memory
>>> 
>>> Here's the set of the build commands I invoked.
>>> 
>>> root@generic# cd /usr/src ; make KERNCONF=ARM TARGET_ARCH=aarch64 \
>>>     buildkernel buildworld installkernel installworld distribution \
>>>     DESTDIR=/home/freebsd/rpi3b
>>> 
>>> . . .
>>> 
>>> Any thoughts? I don't have any idea about VM pageout.
>> 
>> Hi Mark,
>> 
>> 
>> Several configuration details from what I use:
>> 
>> I use a swap partition (not a swap file!) to give the system
>> someplace to put copies of inactive memory pages (paging):
>> 
>> # swapinfo
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/gpt/Rock64swp2   3670016        0  3670016     0%
>> 
>> Oh I see, there's no swap partition in the default installation.
>> 
>> root@generic:~ # swapinfo
>> Device          1K-blocks     Used    Avail Capacity
>> 
>> root@generic:~ # top -S
>> last pid: 92429;  load averages:  0.00,  0.00,  0.00                                                           up 1+12:49:11  23:41:10
>> 52 processes:  2 running, 48 sleeping, 2 waiting
>> CPU:  0.0% user,  0.0% nice,  0.6% system,  0.0% interrupt, 99.4% idle
>> Mem: 2000K Active, 626M Inact, 223M Wired, 97M Buf, 20M Free
>> 
>> Let me try to create a swap partition. I will use a spare USB flash drive for swap, because during installation growfs allocated all of my microSD card storage to the root filesystem.
>> 
>> 
>> where gpart show -p lists it as (a GPT context, not MBR):
>> 
>>      534528     7340032  da0p2  freebsd-swap  (3.5G)
>> 
>> and gpart show -pl lists it as:
>> 
>>      534528     7340032  da0p2  Rock64swp2  (3.5G)
>> 
>> Okay, noted on using the GPT (not MBR) method with gpart.
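
For the spare-USB-drive case mentioned above, a minimal sketch of
setting up a GPT swap partition might look like the following. The
disk name (da0), the label (rpi3swap) and the 3584M size are
assumptions for illustration, not values taken from the RPi 3B in
this thread. The script only prints the commands unless DRY_RUN=0
is set, and the run helper is hypothetical.

```shell
#!/bin/sh
# Sketch: print (or, with DRY_RUN=0, run) the commands that create a GPT
# swap partition on a spare USB drive. DISK, LABEL and SIZE are
# hypothetical example values; check "gpart show" before using real ones.
DISK="${DISK:-da0}"
LABEL="${LABEL:-rpi3swap}"
SIZE="${SIZE:-3584M}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Echo every command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

swap_setup_commands() {
    # "gpart create" only works on a disk with no existing partition scheme.
    run gpart create -s gpt "$DISK"
    run gpart add -t freebsd-swap -s "$SIZE" -l "$LABEL" "$DISK"
    run swapon "/dev/gpt/$LABEL"
}

fstab_line() {
    # Entry for /etc/fstab so the swap partition is enabled at boot.
    printf '/dev/gpt/%s\tnone\tswap\tsw\t0\t0\n' "$LABEL"
}

swap_setup_commands
fstab_line
```

Using the GPT label (/dev/gpt/...) rather than the raw device name
keeps the fstab entry valid even if the USB device enumerates under
a different daN name on a later boot.
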
> 
> I did not happen to have an MBR example around. So I could
> only show GPT. The note was more to avoid confusion than
> anything, since the two are not equivalent for how they
> work.
> 
>> By the way, what's the proper allocation size of swap in FreeBSD?
> 
> FreeBSD has a waring that it produces indicating possible mistuning
> when you potentially have too much. An example is:

I seem to have been on a mission to make some typos . . .

"warning" would correct the above.

> warning: total configured swap (2097152 pages) exceeds maximum recommended amount (916632 pages).
> warning: increase kern.maxswzone or reduce amount of swap.
> 
> The numbers are dependent on the amount of RAM present and
> other details.
> 
> My understanding is that increasing kern.maxswzone has tradeoffs.
> I avoid getting the message because I do not understand the
> tradeoffs or how to manage the tradeoffs or even how to identify
> an instance of hitting such a tradeoff.
> 
> For aarch64 I've been about to have swap of about 3.4 to 3.5 or
> so times the amount of RAM without getting the warnings.

"allowed", not "about". (This one was potentially confusing
enough to justify this message.)

> That
> is why 3.5G in my RPi3B example. (So RAM+SWAP approx.= 4.5*RAM.)
> (armv7 only allows more like 1.8 times the RAM before getting
> the warning.)
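
To put the page counts from the example warning into more familiar
units, here is a small sketch of the arithmetic. It assumes 4 KiB
pages (an assumption here, not something stated in the warning) and
treats the 3.5 factor purely as the rule of thumb described above,
not as a kernel-defined limit. The function names are hypothetical.

```shell
#!/bin/sh
PAGE_KIB=4   # assumed page size in KiB

pages_to_mib() {
    # Convert a page count to whole MiB (integer division).
    echo $(( $1 * $PAGE_KIB / 1024 ))
}

swap_rule_of_thumb_mib() {
    # 3.5 * RAM, computed as RAM * 7 / 2 to stay in integer math.
    echo $(( $1 * 7 / 2 ))
}

echo "configured swap:     $(pages_to_mib 2097152) MiB"
echo "recommended maximum: $(pages_to_mib 916632) MiB"
echo "3.5 x 1024 MiB RAM:  $(swap_rule_of_thumb_mib 1024) MiB"
```

Under the 4 KiB assumption the example warning compares 8192 MiB of
configured swap against a recommended maximum of about 3580 MiB. The
limit reported on a given system varies with RAM and other details,
so the number from that system's own warning is the one to go by.
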
> 
> I avoid even getting too close to the warning as there seems to
> be some build-to-build variability in what fits vs. not. This
> avoids having to frequently adjust the size.
> 
> Going from the other side, how much RAM+SWAP will your activities
> use? To avoid accurately figuring out such, you may just want to
> have near the 3.4 to 3.5 times RAM. (There have been times when
> clang had memory use oddities that required more than normal for
> a time, for example.)
> 
>> This RPi 3B has 1GB of RAM (~947 MB), do I need to set twice the capacity of this physical RAM?
> 
> Ultimately your choice. How much parallel activity you
> want to attempt likely contributes. If you build ports,
> you might do so in a way that uses more RAM+SWAP than
> system builds do, for example.
> 
>> (Note: swap file usage is subject to deadlock conditions
>> avoided by use of swap partitions.)
>> 
>> This is noted.
>> 
>> 
>> I use a serial console & ssh session only context to avoid
>> having sizable competition for RAM.
>> 
>> I avoid using tmpfs because it competes for RAM use.
>> 
>> I use the likes of ( in, say, /boot/loader.conf ):
>> 
>> #
>> # Delay when persistent low free RAM leads to
>> # Out Of Memory killing of processes:
>> vm.pageout_oom_seq=120
>> 
>> This delays potential "killed: failed to reclaim memory" kills,
>> possibly long enough to reach a state where sufficient memory is
>> reclaimed.
>> 
>> Alright this is well noted too.
> 
> There is tuning related to "a thread waited too long to
> allocate a page" that happens because of paging I/O
> characteristics. But I've not hit that type of
> error.
> 
> I'll also note that the "out of swap space" case is a
> misnomer in that it is one or both of two internal data
> structures that are out of space, not necessarily the
> swap space on the media. Again, I've not ever hit that
> type of error. I'm not aware of tuning for this case.
> 
>> I'll note that the status "killed: failed to reclaim memory" does
>> not require that swap be used much at all. Sustained low free RAM
>> from just one process that always stays runnable and has a
>> sufficiently large active set of pages can be sufficient to end up
>> with such kills. Having swap allows for inactive pages to get out
>> of the way, which can help.
>> 
>> I use the likes of ( in, say, /etc/sysctl.conf ):
>> 
>> #
>> # Together this pair avoids swapping out the process kernel stacks.
>> # This keeps the processes used for interacting with the system from
>> # being hung up.
>> vm.swap_enabled=0
>> vm.swap_idle_enabled=0
>> 
>> This allows paging to the swap space but disallows moving
>> kernel thread stacks to the swap space. Otherwise the
>> processes used to interact with the RPi3 can become
>> non-runnable, preventing such interactions.
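
As a convenience, the settings quoted above could be added with a
small helper like the following sketch, so that repeated runs do not
leave duplicate lines. The set_kv function is hypothetical (not a
FreeBSD tool), and the demonstration writes to scratch files rather
than the real /boot/loader.conf or /etc/sysctl.conf.

```shell
#!/bin/sh
# set_kv FILE KEY VALUE: make FILE end up with exactly one "KEY=VALUE"
# line, dropping any earlier assignment of KEY. Hypothetical helper.
set_kv() {
    file=$1 key=$2 value=$3
    tmp="$file.tmp.$$"
    touch "$file"
    # Keep every line that does not start with the literal text "KEY=".
    awk -v k="$key=" 'index($0, k) != 1' "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Demonstrate on scratch copies rather than the real files.
demo_dir=$(mktemp -d)
set_kv "$demo_dir/loader.conf" vm.pageout_oom_seq 120
set_kv "$demo_dir/sysctl.conf" vm.swap_enabled 0
set_kv "$demo_dir/sysctl.conf" vm.swap_idle_enabled 0
cat "$demo_dir/loader.conf" "$demo_dir/sysctl.conf"
```

Pointing it at the real files would need root, and a backup copy
first is a sensible precaution. Commented-out lines are left alone
because they do not start with the key itself.
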
>> 
>> Okay this too is well noted.
>> 
>> 
>> I have NVMe or SSD based USB media, not microsd cards nor
>> spinning rust. (I use just bootcode.bin and timeout files
>> on microsd media for the RPi3B. Even the rest of the RPi*
>> firmware is on the USB media, as well as u-boot.bin .)
> 
> This may contribute to why I've never gotten "a thread
> waited too long to allocate a page" on any system. (Some
> systems, while bootable via the USB3 media I have, also
> have even faster internal media that is normally used.)
> 
>> My usage of such a configuration structure for building
>> software (world, kernel, ports) applies to all the
>> systems I do such with, including ones with a lot more
>> resources, including a lot more RAM.
>> 
>> Thanks for these inputs, noted on these things! I haven't tried NVMe and SSD media in my RPi 3B. So, are they far superior to microSD cards when it comes to building software?
> 
> My understanding is that microsd card media is fairly
> generally not as good for such contexts: slower, fails
> sooner, etc.
> 
> I happen to boot multiple types of machines from the
> same media so I use USB3 media that is compatible with
> USB2 use, a single such USB3 device not needing a
> powered hub for use on the likes of an RPi3B. (Lots
> of USB3 media around would require external power for
> USB2 or an RPi3B use.) I need a powered hub for 2 or
> more such media on an RPi3B.
> 

I'll note that the NVMe USB3 media seems to take longer
than normal to power-up/reset. I ended up needing to
have, for example, usb_pgood_delay assigned for
u-boot.bin use in booting. The older USB3 SSD media
did not require such, and still does not. This issue
is not limited to booting RPi*'s in my context.
(Although, where I use EDK2 UEFI/ACPI style booting,
I've not had to do anything special for EDK2.)


===
Mark Millard
marklmi at yahoo.com