Re: FreeBSD ports community is broken [port building configuration notes]

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 11 Mar 2024 15:50:33 UTC
[The armv7 poudriere bulk finished.]

On Mar 10, 2024, at 13:10, Mark Millard <marklmi@yahoo.com> wrote:

> [poudriere bulk status update.]
> 
> On Mar 5, 2024, at 18:43, Mark Millard <marklmi@yahoo.com> wrote:
> 
>> [I noticed that my SWAP figures were not self-consistent for the armv7.]
>> 
>> On Feb 18, 2024, at 09:50, Mark Millard <marklmi@yahoo.com> wrote:
>> 
>>> [I also forgot to mention an important FreeBSD configuration
>>> setting. It is not specific to poudriere use.]
>>> 
>>>> On Feb 18, 2024, at 09:13, Mark Millard <marklmi@yahoo.com> wrote:
>>>> 
>>>> [I forgot to mention the armv7 core count involved: 4]
>>>> 
>>>> On Feb 18, 2024, at 08:52, Mark Millard <marklmi@yahoo.com> wrote:
>>>> 
>>>>> Aryeh Friedman <aryehfriedman_at_gmail.com> wrote on
>>>>> Date: Sun, 18 Feb 2024 10:37:06 UTC :
>>>>> 
>>>>>> It should not require
>>>>>> poudriere running on a supermassive machine to work (in many cases
>>>>>> portmaster and make install recursion fail where poudriere works).
>>>>> 
>>>>> As for configuring small, slow systems relative to
>>>>> the resource use involved: below are some settings
>>>>> that I've historically used, followed by some other
>>>>> notes.
>>>>> 
>>>>> For a 2 GiByte RAM armv7 system with 3 GiByte swap space
>>>>> and a UFS file system, no use of tmpfs in normal operation
>>>>> (since it competes for RAM+SWAP generally):
>> 
>> Actually: the 2 GiByte RAM armv7 has 3.6 GiBytes of SWAP
>> space, with some margin. Ever so slightly over 3.8 GiBytes
>> drew the mistuning warning, but the threshold varies somewhat
>> across builds, so I pick a somewhat smaller figure to avoid
>> repeated adjustments.
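
(For checking where one's own system draws that line: the
kernel reports the mistuning notice at swapon time, and the
related figures can be inspected with, for example:

sysctl vm.swap_maxpages
swapinfo -h

This is a sketch based on my understanding of the FreeBSD swap
code; the exact threshold calculation lives in the kernel.)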
>> 
>>>> FYI: The armv7 has 4 cores.
>>>> 
>>>>> /usr/local/etc/poudriere.conf has . . .
>>>>> 
>>>>> NO_ZFS=yes
>>>>> USE_TMPFS=no
>>>>> PARALLEL_JOBS=2
>>>>> ALLOW_MAKE_JOBS=yes
>>>>> MAX_EXECUTION_TIME=432000
>>>>> NOHANG_TIME=432000
>>>>> MAX_EXECUTION_TIME_EXTRACT=14400
>>>>> MAX_EXECUTION_TIME_INSTALL=14400
>>>>> MAX_EXECUTION_TIME_PACKAGE=57600
>>>>> MAX_EXECUTION_TIME_DEINSTALL=14400
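
(Two notes on the above: PARALLEL_JOBS=2 with up to 2 make jobs
per builder means at most about 2*2 == 4 compile processes at a
time, matching the 4 cores. Also, the *_TIME figures are in
seconds, so:

432000 s == 120 hr == 5 days
 14400 s ==   4 hr
 57600 s ==  16 hr

Generous limits like these keep multi-day llvm/rust builds from
being classified as hung or over-time.)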
>>>>> 
>>>>> /usr/local/etc/poudriere.d/make.conf has . . .
>>>>> 
>>>>> MAKE_JOBS_NUMBER=2

I'll note that I'd switched to using MAKE_JOBS_NUMBER_LIMIT
and no longer use MAKE_JOBS_NUMBER: the _LIMIT form only caps
the job count instead of forcing it, so ports that default to
fewer jobs are left alone. So:

MAKE_JOBS_NUMBER_LIMIT=2

>>>>> /etc/fstab does not specify any tmpfs use or the
>>>>> like: this avoids competing for RAM+SWAP.
>>>>> 
>>>>> The 3 GiBytes of swap space is deliberate: RAM+SWAP
>>>>> is important for all means of building in such a
>>>>> context, since a bunch of ports have large memory
>>>>> use for building in all cases.
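
(For reference, such swap space shows up as the likes of the
following /etc/fstab line; the gpt label name is just an
example from my style of setup:

/dev/gpt/swapfs	none	swap	sw	0	0
)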
>>>>> 
>>>>> [armv7 allows around RAM+SWAP=2.5*RAM before
>> 
>> That equation should have been RAM+SWAP==2.8*RAM
>> (with margin considered), so SWAP==1.8*RAM. (With
>> a small enough RAM 2.7*RAM might need to be used,
>> for example.)
>> 
>> So the 2 GiByte RAM leads to a 5.6 GiByte RAM+SWAP
>> for the builders and other uses to share.
>> 
>> I may set up a modern experiment to see if the
>> combination:
>> 
>> PARALLEL_JOBS=2
>> ALLOW_MAKE_JOBS=yes (with MAKE_JOBS_NUMBER=2)

Again, now: MAKE_JOBS_NUMBER_LIMIT=2

>> still completes for a build that would end up with
>> llvm18 and rust likely building in parallel for
>> much of the time (if it completed okay, anyway).
>> Something like 265 ports would be queued, the last
>> few of which include some use of llvm18 and of
>> rust.
>> 
>> . . .
>> 
>>>>> tradeoff/mistuning notices are generated. aarch64
>>>>> and amd64 allow more like RAM+SWAP=3.4*RAM before

I've not validated the 3.4 figure. It is likely a bit low.

>>>>> such notices are reported. The detailed multiplier
>>>>> changes some from build to build, so I leave
>>>>> margin in my figures to avoid the notices.]
>>>>> 
>>>>> I also historically use USB SSD/NVMe media, no
>>>>> spinning rust, no microsd cards or such.
>>> 
>>> /boot/loader.conf has . . .
>>> 
>>> #
>>> # Delay when persistent low free RAM leads to
>>> # Out Of Memory killing of processes:
>>> vm.pageout_oom_seq=120
>>> 
>>> This is important for allowing various things
>>> to complete. (The default is 12. 120 is not
>>> the maximum, but it has been appropriate in my
>>> context. The figure is not in time units, but
>>> larger values increase the observed delay, so
>>> more work gets done before OOM activity starts.)
>>> 
>>> Using vm.pageout_oom_seq is not specific to
>>> poudriere use.
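
(vm.pageout_oom_seq is also writable at runtime, so a value can
be tried without a reboot via the likes of:

sysctl vm.pageout_oom_seq=120

/boot/loader.conf just makes it take effect from boot.)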
>>> 
>>>>> As for more ports building in poudriere than via
>>>>> "portmaster and make install recursion" in respects
>>>>> other than resources: it is easier to make ports
>>>>> build in poudriere because it provides a simpler,
>>>>> cleaner context for the individual builders. More
>>>>> things lead to failure outside poudriere that are
>>>>> just not issues when poudriere is used, so more care
>>>>> is needed in setting up the ports for the likes of
>>>>> portmaster use. (And, yes, I used to use portmaster.)
>>>>> The required range of testing contexts is wider for
>>>>> the likes of portmaster if the port builds are to
>>>>> just work across the full range of contexts.
>>>>> 
>>>>> Such issues add to the port maintainer/committer
>>>>> development burden when portmaster or the like is
>>>>> the target level/type of support.
>>>>> 
>>>>> (Note: synth may be more like poudriere in this
>>>>> respect, but I've historically used platforms that
>>>>> synth did not support and so have not looked into
>>>>> the details.)
> 
> Context: 1 GHz, 4-core cortex-a7 (armv7), 2 GiBytes RAM, USB2.
> RAM+SWAP: 5.6 GiBytes. Also, this is doing my normal armv7 (and
> aarch64) style of devel/llvm* build: OPTION'd to BE_NATIVE
> instead of BE_STANDARD and OPTION'd to not build MLIR.
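
(Expressed in /usr/local/etc/poudriere.d/make.conf style, that
would be the likes of the following; the per-origin _SET/_UNSET
spelling is from my recollection of the ports OPTIONS framework,
so treat it as a sketch rather than gospel:

devel_llvm18_SET+=	BE_NATIVE
devel_llvm18_UNSET+=	BE_STANDARD MLIR
)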

Also: For armv7 I use -mcpu=cortex-a7 nearly everywhere, for
each of: the port builds, the world in the poudriere jail
directory, and the booted kernel+world. (All armv7 contexts
that I've access to support cortex-a7 user space code.)
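
(In make.conf terms for the port builds, that is the likes of:

CFLAGS+=	-mcpu=cortex-a7
CXXFLAGS+=	-mcpu=cortex-a7

though where the flag ends up being injected varies with the
context in question.)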

In poudriere.conf I used the likes of:

PRIORITY_BOOST="cmake-core llvm18 boost-libs gcc-arm-embedded"

and probably should have listed rust after llvm18 as well,
making it more likely that the 2 builders would run in
parallel much of the time (less elapsed time overall): see
the later summary time frames.
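
In other words, the likes of:

PRIORITY_BOOST="cmake-core llvm18 rust boost-libs gcc-arm-embedded"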

> The poudriere bulk has finished llvm18 and rust, 
. . . updating the related material:

It finished overall in somewhat under 5.5 days. The "which
builds took over an hour" summary is:

[01:51:31] [01] [01:00:07] Finished lang/perl5.36 | perl5-5.36.3_1: Success
[08:55:35] [02] [03:08:09] Finished devel/icu | icu-74.2,1: Success
[13:17:38] [02] [01:28:32] Finished lang/ruby31 | ruby-3.1.4_1,1: Success
[14:17:44] [01] [09:20:55] Finished devel/cmake-core | cmake-core-3.28.3: Success
[4D:01:03:43] [02] [3D:08:48:53] Finished lang/rust | rust-1.76.0: Success
[4D:06:26:24] [02] [03:09:35] Finished devel/binutils@native | binutils-2.40_5,1: Success
[4D:14:54:31] [02] [03:38:55] Finished devel/aarch64-none-elf-gcc | aarch64-none-elf-gcc-11.3.0_3: Success
[4D:16:13:00] [01] [4D:01:55:03] Finished devel/llvm18@default | llvm18-18.1.0.r3: Success
[4D:18:05:58] [02] [03:11:00] Finished devel/arm-none-eabi-gcc | arm-none-eabi-gcc-11.3.0_3: Success
[4D:23:00:13] [01] [06:46:06] Finished devel/boost-libs | boost-libs-1.84.0: Success
[5D:00:16:39] [01] [01:15:53] Finished textproc/source-highlight | source-highlight-3.1.9_9: Success
[5D:01:17:24] [02] [07:10:52] Finished lang/gcc13 | gcc13-13.2.0_4: Success
[5D:09:38:14] [01] [05:56:48] Finished devel/freebsd-gcc13@armv7 | armv7-gcc13-13.2.0_1: Success
[5D:10:18:58] [02] [05:44:02] Finished devel/gdb@py39 | gdb-14.1_2: Success
[5D:10:31:56] Stopping 2 builders
[main-CA7-default] [2024-03-06_03h15m10s] [committing] Queued: 265 Built: 265 Failed: 0   Skipped: 0   Ignored: 0   Fetched: 0   Tobuild: 0    Time: 5D:10:31:55

> 
> So: llvm18 started before rust and finished after rust, each
> mostly using 2 hardware threads. Roughly the last 1.5 hr of
> llvm18's time was spent packaging, after somewhat over 96
> hours of mostly 2 hardware threads working on it. The vast
> majority of the time went to the build phase.
> 
> I have a modified top that monitors and reports some "MAXimum
> OBServed" figures (MaxObsYYY figures). As of llvm18 finishing,
> that top was reporting:
> 
> 2794Mi MaxObs(Act+Wir+Lndry+SwapUsed)
> (Inact can be an arbitrary mix of dirty and clean pages and,
> so, is not included.)
> 
> Swap: 995524Ki MaxObsUsed

The MaxObs figures reported above did not change by the end of
the bulk.

> Thus, it used up to around half of the RAM+SWAP to get that
> far. (Rust and llvm18's peak RAM+SWAP usages need not have
> been over the same time period. But there was RAM+SWAP room
> for a larger overall peak.)
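
(2794 MiBytes out of the 5.6 GiByte == 5734 MiByte total is
about 49%: hence "around half".)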
> 
> [Note: The peak RAM+SWAP use was during a period of llvm18's
> build running various llvm-tblgen examples.]
> 
> As it stands, it looks like the poudriere bulk run will complete
> just fine for the configuration that I specified, with margin
> for variations in peak RAM+SWAP usage.
> 

It completed just fine.


===
Mark Millard
marklmi at yahoo.com