Re: 14.0-CURRENT failed to reclaim memory error in RPi 3B build

From: Nuno Teixeira <eduardo_at_freebsd.org>
Date: Wed, 02 Nov 2022 21:44:43 UTC
Hello,

From
https://lists.freebsd.org/archives/freebsd-ports/2022-August/002476.html

---
With both FLANG and MLIR:
(...)
[13:49:55] [01] [13:49:17] Finished devel/llvm13 | llvm13-13.0.1_3: Success

load averages:   . . . MaxObs:   6.43,   5.91,   5.77
(Note: the build spanned overnight, so it overlapped the
nightly cron job.)

Note: Given that SWAP was used, I report more
Max(imum)Obs(erved) figures for this case than
I've been reporting for other tests:

5696Mi MaxObsActive
1775Mi MaxObsSwapUsed
7374Mi MaxObs(Act+Lndry+SwapUsed)
9333Mi MaxObs(Act+Wir+Lndry+SwapUsed)

Reminder: MaximumOfASum <= TheSumOfTheMaximums
Note: The various Maximums need not be from the same time.


By contrast . . .

No FLANG, no MLIR:

(...)
[11:07:48] [01] [08:58:53] Finished devel/llvm13 | llvm13-13.0.1_3: Success

load averages:   . . . MaxObs:   5.31,   4.94,   4.79

1479Mi MaxObs(Act+Lndry+SwapUsed)

So, vastly less RAM+SWAP space use. Somewhat under
5 hours less build time (about 9hr vs. somewhat under 14hr).
---
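
As a side illustration of the "MaximumOfASum <= TheSumOfTheMaximums"
reminder in the excerpt above, here is a minimal Python sketch. The
per-sample values and the function names are invented for illustration;
only the three figures in the final check (5696, 1775 and 7374 MiB) come
from the quoted report.

```python
# Hypothetical per-sample observations in MiB (made up, NOT from the report).
samples = [
    {"active": 5000, "laundry": 100, "swap_used": 200},
    {"active": 3000, "laundry": 400, "swap_used": 1700},
    {"active": 4200, "laundry": 250, "swap_used": 900},
]


def max_of_sum(samples):
    """Largest combined figure seen at any single sample time."""
    return max(sum(s.values()) for s in samples)


def sum_of_maxes(samples):
    """Sum of per-category maximums, which may come from different times."""
    keys = samples[0].keys()
    return sum(max(s[k] for s in samples) for k in keys)


print(max_of_sum(samples), "<=", sum_of_maxes(samples))

# Figures from the quoted report (MiB). The laundry maximum is not listed
# there, so it is treated as an unknown value that is at least zero.
max_obs_active = 5696
max_obs_swap_used = 1775
max_obs_act_lndry_swap = 7374
assert max_obs_act_lndry_swap <= max_obs_active + max_obs_swap_used
```

The last assertion only shows that the reported numbers are consistent
with the inequality; it says nothing about when each maximum occurred.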

Archimedes Gaviola <archimedes.gaviola@gmail.com> wrote on Wednesday,
2/11/2022 at 21:10:

>
>
> On Mon, Oct 31, 2022 at 1:47 PM Archimedes Gaviola <
> archimedes.gaviola@gmail.com> wrote:
>
>>
>> > Okay noted on GPT not MBR method with gpart.
>>
>>
>>> I did not happen to have a MBR example around. So I could
>>> only show GPT. The note was more to avoid confusion than
>>> anything, since the two are not equivalent for how they
>>> work.
>>>
>>
>> Okay, this is noted.
>>
>>
>>>
>>> > By the way, what's the proper allocation size of swap in FreeBSD?
>>>
>>> FreeBSD has a warning that it produces indicating possible mistuning
>>> when you potentially have too much. An example is:
>>>
>>> warning: total configured swap (2097152 pages) exceeds maximum
>>> recommended amount (916632 pages).
>>> warning: increase kern.maxswzone or reduce amount of swap.
>>>
>>> The numbers are dependent on the amount of RAM present and
>>> other details.
>>>
>>> My understanding is that increasing kern.maxswzone has tradeoffs.
>>> I avoid getting the message because I do not understand the
>>> tradeoffs or how to manage the tradeoffs or even how to identify
>>> an instance of hitting such a tradeoff.
>>>
>>
>> Basically the warning messages you've shared are the messages I
>> encountered with my older FreeBSD system running on MIPS32 at the time I
>> allocated a swap partition because of the higher allocation size I've made.
>> So what I did is gradually adjust the swap size until such warnings
>> disappear. I did not go through the details as most likely it requires a
>> deeper knowledge on this area. That's why this experience illuminated me
>> again with my RPi 3B ARM system on the proper allocation size. But yes,
>> below you have the allocation size.
>>
>>
>>>
>>> For aarch64 I've been able to have swap of about 3.4 to 3.5 or
>>> so times the amount of RAM without getting the warnings. That
>>> is why 3.5G in my RPi3B example. (So RAM+SWAP approx.= 4.5*RAM.)
>>> (armv7 only allows more like 1.8 times the RAM before getting
>>> the warning.)
>>>
>>
>> Okay this is noted. I'll take the 3.5G size as this is based on your
>> actual experience.
>>
>>
>>>
>>> I avoid even getting too close to the warning as there seems to
>>> be some build-to-build variability in what fits vs. not. This
>>> avoids having to frequently adjust the size.
>>>
>>>
>> I, too, need to avoid such warnings as much as possible with this RPi 3B
>> configuration.
>>
>>
>>> Going from the other side, how much RAM+SWAP will your activities
>>> use? To avoid accurately figuring out such, you may just want to
>>> have near the 3.4 to 3.5 times RAM. (There have been times when
>>> clang had memory use oddities that required more than normal for
>>> a time, for example.)
>>>
>>
>> I'll just follow the size you have and let me observe how it goes.
>>
>>
>>>
>>> > This RPi 3B has 1GB of RAM (~947 MB), do I need to set twice the
>>> > capacity of this physical RAM?
>>>
>>> Ultimately your choice. How much parallel activity you
>>> want to attempt likely contributes. If you build ports,
>>> you might do so in a way that uses more RAM+SWAP than
>>> system builds do, for example.
>>>
>>
>> Okay this is noted. For now, building the kernel and world is my goal, no
>> ports yet.
>>
>>
>>>
>>> > (Note: swap file usage is subject to deadlock conditions
>>> > avoided by use of swap partitions.)
>>> >
>>> > This is noted.
>>> >
>>> >
>>> > I use a serial console & ssh session only context to avoid
>>> > having sizable competition for RAM.
>>> >
>>> > I avoid using tmpfs because it competes for RAM use.
>>> >
>>> > I use the likes of ( in, say, /boot/loader.conf ):
>>> >
>>> > #
>>> > # Delay when persistent low free RAM leads to
>>> > # Out Of Memory killing of processes:
>>> > vm.pageout_oom_seq=120
>>> >
>>> > This delays potential "killed: failed to reclaim memory" kills,
>>> > possibly long enough to reach a state where sufficient memory is
>>> > reclaimed.
>>> >
>>> > Alright this is well noted too.
>>>
>>> There is tuning related to "a thread waited too long to
>>> allocate a page" that happens because of paging I/O
>>> characteristics. But I've not hit that type of
>>> error.
>>>
>>> I'll also note that the "out of swap space" case is a
>>> misnomer in that it is one or both of 2 internal data
>>> structures that are out of space, not necessarily the
>>> swap space on the media. Again, I've not ever hit that
>>> type of error. I'm not aware of tuning for this case.
>>>
>>
>> Okay, noted as well on this info. Let me just try the 3.5G swap
>> allocation. I will post another thread if I ever encounter these types of
>> errors.
>>
>>
>>>
>>> > I'll note that the status "killed: failed to reclaim memory" does
>>> > not require that swap be used much at all. Sustained low free RAM
>>> > from just one process that always stays runnable and has a
>>> > sufficiently large active set of pages can be sufficient to end up
>>> > with such kills. Having swap allows for inactive pages to get out
>>> > of the way, which can help.
>>> >
>>> > I use the likes of ( in, say, /etc/sysctl.conf ):
>>> >
>>> > #
>>> > # Together this pair avoids swapping out the process kernel stacks.
>>> > # This keeps the processes used for interacting with the system
>>> > # from being hung up.
>>> > vm.swap_enabled=0
>>> > vm.swap_idle_enabled=0
>>> >
>>> > This allows paging to the swap space but disallows moving
>>> > kernel thread stacks to the swap space. Otherwise the
>>> > processes used to interact with the RPi3 can become
>>> > non-runnable, preventing such interactions.
>>> >
>>> > Okay this too is well noted.
>>> >
>>> >
>>> > I have NVMe or SSD based USB media, not microsd cards nor
>>> > spinning rust. (I use just bootcode.bin and timeout files
>>> > on microsd media for the RPi3B. Even the rest of the RPi*
>>> > firmware is on the USB media, as well as u-boot.bin .)
>>>
>>> This may contribute to why I've never gotten a "a thread
>>> waited too long to allocate a page" on any system. (Some
>>> systems, while bootable via USB3 media I have, also have
>>> even faster internal media that is normally used.)
>>>
>>
>> Alright so there's significance.
>>
>>
>>>
>>> > My usage of such a configuration structure for building
>>> > software (world, kernel, ports) applies to all the
>>> > systems I do such with, including ones with a lot more
>>> > resources, including a lot more RAM.
>>> >
>>> > Thanks for these inputs, noted on these things! I haven't tried NVMe
>>> > and SSD media in my RPi 3B. So, they are far superior to microSD
>>> > cards when it comes to building software?
>>>
>>> My understanding is that microsd card media is fairly
>>> generally not as good for such contexts: slower, fails
>>> sooner, etc.
>>>
>>
>> I'll take note of this one as I may encounter those attributes along the
>> course of building software. It's something that I need to explore and do
>> some research ahead.
>>
>>
>>>
>>> I happen to boot multiple types of machines from the
>>> same media so I use USB3 media that is compatible with
>>> USB2 use, a single such USB3 device not needing a
>>> powered hub for use on the likes of an RPi3B. (Lots
>>> of USB3 media around would require external power for
>>> USB2 or an RPi3B use.) I need a powered hub for 2 or
>>> more such media on a RPi3B.
>>>
>>
>> Okay, that's right.  In my experience, inserting some devices tends to
>> reset the 4 USB ports' power, so preventing such behavior requires a
>> self-powered hub.
>>
>>
> Hi Mark,
>
> Just an update: kernel and world compilation is ongoing on my RPi3B
> system (with swap partition) and, so far, so good. It has already passed
> the tough part that breaks the compilation process here.
> ...
>
> llvm-tblgen -gen-asm-matcher  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenAsmMatcher.inc.d -o RISCVGenAsmMatcher.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-asm-writer  -I /usr/src/contrib/llvm-project/llvm/include
> -I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenAsmWriter.inc.d -o RISCVGenAsmWriter.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-callingconv  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenCallingConv.inc.d -o RISCVGenCallingConv.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-compress-inst-emitter  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenCompressInstEmitter.inc.d -o RISCVGenCompressInstEmitter.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-dag-isel  -I /usr/src/contrib/llvm-project/llvm/include
> -I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenDAGISel.inc.d -o RISCVGenDAGISel.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-disassembler  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenDisassemblerTables.inc.d -o RISCVGenDisassemblerTables.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-global-isel  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenGlobalISel.inc.d -o RISCVGenGlobalISel.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-instr-info  -I /usr/src/contrib/llvm-project/llvm/include
> -I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenInstrInfo.inc.d -o RISCVGenInstrInfo.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-emitter  -I /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenMCCodeEmitter.inc.d -o RISCVGenMCCodeEmitter.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-pseudo-lowering  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenMCPseudoLowering.inc.d -o RISCVGenMCPseudoLowering.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-register-bank  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenRegisterBank.inc.d -o RISCVGenRegisterBank.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-register-info  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenRegisterInfo.inc.d -o RISCVGenRegisterInfo.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-searchable-tables  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenSearchableTables.inc.d -o RISCVGenSearchableTables.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-subtarget  -I /usr/src/contrib/llvm-project/llvm/include
> -I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenSubtargetInfo.inc.d -o RISCVGenSubtargetInfo.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-searchable-tables  -I
> /usr/src/contrib/llvm-project/llvm/include -I
> /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV  -d
> RISCVGenSystemOperands.inc.d -o RISCVGenSystemOperands.inc
>  /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
>
> Any thoughts on why this part is quite a challenge when it comes to memory
> usage? The other architectures do not show such behavior... just curious.
>
> Thanks and best regards,
> Archimedes
>
>
>
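
For reference, the page counts in the swap warning quoted above can be
turned into sizes with a small Python sketch. It assumes 4 KiB pages (the
value reported by `sysctl hw.pagesize` should be checked on the actual
system), and the function names are illustrative only. The 3.5 and 1.8
factors are the rough aarch64 and armv7 figures mentioned in the quoted
text, not limits taken from the kernel sources.

```python
PAGE_SIZE = 4096  # bytes; an assumption, verify with `sysctl hw.pagesize`


def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Convert a page count from the warning into GiB."""
    return pages * page_size / 2**30


def rule_of_thumb_swap_mib(ram_mib, factor=3.5):
    """Swap size (MiB) as a multiple of RAM, per the thread's rough figures."""
    return ram_mib * factor


configured = pages_to_gib(2097152)   # "total configured swap" in the example
recommended = pages_to_gib(916632)   # "maximum recommended amount"
print(f"configured:  {configured:.2f} GiB")
print(f"recommended: {recommended:.2f} GiB")

# Nominal 1 GiB of RAM; the usable amount on an RPi 3B is somewhat lower.
print(f"aarch64 (about 3.5x): {rule_of_thumb_swap_mib(1024):.0f} MiB")
print(f"armv7   (about 1.8x): {rule_of_thumb_swap_mib(1024, 1.8):.0f} MiB")
```

Under the 4 KiB assumption, the example warning corresponds to 8 GiB of
configured swap against a recommended maximum of roughly 3.5 GiB.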


-- 
Nuno Teixeira
FreeBSD Committer (ports)