Re: Armv7 (rpi2) getting stuck in buildworld for -current

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sat, 20 May 2023 18:59:51 UTC
I set up the RPi2B v1.1 and started a -j4 buildworld buildkernel
from-scratch rebuild on/of:

# uname -apKU # long output line split for readability
FreeBSD OPiP2E_RPi2v1p1 14.0-CURRENT FreeBSD 14.0-CURRENT #74
main-n262658-b347c2284603-dirty: Fri Apr 28 23:07:41 PDT 2023
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7
arm armv7 1400088 1400088

(The original build was done on another machine.)


At somewhat under 18 hrs, the build got past the large swap
use, which was during the overlapping time frame for these 4 compiles:

clang/libclang/CodeGen/CodeGenAction.o
clang/libclang/CodeGen/CodeGenFunction.o
clang/libclang/CodeGen/CodeGenModule.o
clang/libclang/CodeGen/CodeGenPGO.o

These notes are based on the information from somewhat after
the swap use dropped off.



My armv7 builds disable targeting other architectures but
I also have WITH_CLANG_EXTRAS= . (Not that the build has
gotten that far yet.) I show the controlling file content
later below.

I used no assignments for:

#vm.pfault_oom_attempts=-1
#vm.pfault_oom_attempts= 10
#vm.pfault_oom_wait= ???

but did/do have:

vm.pageout_oom_seq=120

vm.swap_enabled=0
vm.swap_idle_enabled=0

in use for this experiment.

FYI:
make[1]: "/usr/main-src/Makefile.inc1" line 326: SYSTEM_COMPILER: Determined that CC=cc matches the source tree.  Not bootstrapping a cross-compiler.
make[1]: "/usr/main-src/Makefile.inc1" line 331: SYSTEM_LINKER: Determined that LD=ld matches the source tree.  Not bootstrapping a cross-linker.

(Those 2 have significant time implications for the overall
build.)

Based on (my modified) top, sampling every 3 seconds,

Mem: . . .,
      754020Ki MaxObsActive,
      186756Ki MaxObsWired,
      923356Ki MaxObs(Act+Wir+Lndry)

Swap: 1740Mi Total, . . .,
       756828Ki MaxObsUsed,
       1442Mi MaxObs(Act+Lndry+SwapUsed),
       1615Mi MaxObs(Act+Wir+Lndry+SwapUsed)

So: slightly over 739 MiBytes of swap observed to have been
in use at one time.
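
For anyone wanting to check that conversion, here is a minimal Python sketch. The two input figures are the ones reported above; the variable names and the percentage-of-total calculation are only illustrative additions.

```python
# Convert the MaxObsUsed swap figure from KiB to MiB and relate it
# to the total swap size reported by top.
max_obs_used_kib = 756828   # MaxObsUsed, from the Swap: lines above
swap_total_mib = 1740       # Swap Total, from the Swap: lines above

max_obs_used_mib = max_obs_used_kib / 1024
fraction_of_swap = max_obs_used_mib / swap_total_mib

print(f"{max_obs_used_mib:.2f} MiB in use at the observed maximum")
print(f"{fraction_of_swap:.1%} of the total swap space")
```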

As for the overlapping time's duration: file creation and
modification times, in time order were:

(via extraction from ls -TldU output:)
09:37:28 creation     of clang/libclang/CodeGen/CodeGenAction.o
09:38:44 creation     of clang/libclang/CodeGen/CodeGenFunction.o
09:40:19 creation     of clang/libclang/CodeGen/CodeGenModule.o
09:41:28 creation     of clang/libclang/CodeGen/CodeGenPGO.o

(via extraction from ls -Tld output:)
09:47:15 modification of clang/libclang/CodeGen/CodeGenFunction.o
09:49:53 modification of clang/libclang/CodeGen/CodeGenAction.o
09:50:10 modification of clang/libclang/CodeGen/CodeGenPGO.o
09:54:49 modification of clang/libclang/CodeGen/CodeGenModule.o

So:

09:41:28 . . . 09:47:15 (under 6 min) for the overlapping time
frame and the highest swap space use happened inside that
interval.
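
The interval can be derived mechanically: all four compiles overlap from the latest creation time to the earliest modification time. A minimal Python sketch, assuming that creation time approximates when a compile started writing its output and modification time approximates when it finished (the dictionary layout and names are just for illustration):

```python
from datetime import datetime

FMT = "%H:%M:%S"

# (creation, modification) times for each object file, as listed above.
times = {
    "CodeGenAction.o":   ("09:37:28", "09:49:53"),
    "CodeGenFunction.o": ("09:38:44", "09:47:15"),
    "CodeGenModule.o":   ("09:40:19", "09:54:49"),
    "CodeGenPGO.o":      ("09:41:28", "09:50:10"),
}

def parse(t):
    return datetime.strptime(t, FMT)

# The window in which all four were in progress at once:
# latest start through earliest finish.
overlap_start = max(parse(created) for created, _ in times.values())
overlap_end = min(parse(modified) for _, modified in times.values())
overlap = overlap_end - overlap_start

print(overlap_start.strftime(FMT), "to", overlap_end.strftime(FMT), "=", overlap)
```

This gives 5 min 47 s, matching the "under 6 min" figure.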

During this, there were at times mixes of CPUn and "swread" STATE
for the compiles. But at no point were all of them observed to be
blocked waiting, and at no point was only 1 observed to show a CPUn
with a large WCPU.

This is largely attributable to the USB media having tiny
latencies compared to spinning rust and having reasonable
transfer rates for the type of I/O: NVMe USB3 media (that is
also USB2 compatible for USB powered usage).

My use of:

#
# Delay when persistent low free RAM leads to
# Out Of Memory killing of processes:
vm.pageout_oom_seq=120

and:

#
# Together this pair avoids swapping out the process kernel stacks.
# This avoids processes for interacting with the system from being
# hung-up.
vm.swap_enabled=0
vm.swap_idle_enabled=0

did not lead to any problems so far.


For reference:

Via systat -vmstat I monitored . . .

         VN PAGER   SWAP PAGER
         in   out     in   out
count     
pages
           ioflt  . . .
. . .
           intrn       . . .

Both VN in and SWAP in can contribute to ioflt, faults
that required I/O. The ioflt number would be before the
"ioflt" text.

There is a later line that lists intrn (somewhat below
ioflt): "in-transit blocking page faults". The intrn
number would be before the "intrn" text.

The figures varied but stayed under 600 for ioflt and intrn for
what I saw during the large swap space use, with
matching SWAP activity and no significant VN activity.
(The figures are for an about 5 second update interval,
as I remember.) (I watched the on screen updates.
I did not try to capture the material in a file.)

I expect that these figures would be large for a
sustained period in your context.


I also monitored with "gstat -spod". I assume that gstat is
more familiar. (I use -spod even when "d" happens to not be
going to show any activity.)


[I do not recommend leaving "systat -swap" running: it
accumulates a large set of memory leaks and so can
mess up tracking swap space use by being a significant
contributor. I did not put it to significant use other
than discovering that problem.]


Configuration points . . .


/boot/efi/config.txt has:

enable_uart=1
dtoverlay=mmc
#
# Local addition that avoids (at least) USB3 SSD boot failures that look like:
#   uhub_reattach_port: port ? reset failed, error=USB_ERR_TIMEOUT
#   uhub_reattach_port: device problem (USB_ERR_TIMEOUT), disabling port ?
initial_turbo=60
#
# Local additions:
uart_2ndstage=1
dtdebug=1
kernel=u-boot.bin.2023.01.armv7
kernel7=u-boot.bin.2023.01.armv7
dtoverlay=disable-bt
#
force_turbo=1

( /etc/rc.conf has powerd commented out. )
(I build u-boot with a couple of settings added.)
(Leaving initial_turbo in place allows disabling
force_turbo independently --but still allowing
the USB booting to work during the temporary
turbo status. initial_turbo is not required when
force_turbo is enabled --but does not hurt.)

/boot/loader.conf has :

#
# Delay when persistent low free RAM leads to
# Out Of Memory killing of processes:
vm.pageout_oom_seq=120
#
# For plenty of swap/paging space (will not
# run out), avoid pageout delays leading to
# Out Of Memory killing of processes:
#vm.pfault_oom_attempts=-1
#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes:
#vm.pfault_oom_attempts= 10
#vm.pfault_oom_wait= ???
# (The multiplication is the total but there
# are other potential tradeoffs in the factors
# multiplied, even for nearly the same total.)

(As I understand it, you are now using defaults for
vm.pfault_oom_attempts and vm.pfault_oom_wait .
So I did as well for those 2 for this experiment.)
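
To illustrate the "multiplication is the total" comment in the loader.conf notes: a rough Python sketch of how the two settings combine. The default values used here (3 attempts, 10 second wait) are my assumption about current FreeBSD defaults and are not stated in this message; the other pairs are hypothetical examples, not recommendations.

```python
# Assumed defaults (not from this message): 3 attempts, 10 s wait each.
DEFAULT_ATTEMPTS = 3
DEFAULT_WAIT_S = 10

def total_delay_s(attempts, wait_s):
    """Approximate total page-fault wait, in seconds, before OOM handling."""
    if attempts < 0:
        # -1 is taken here to mean "never give up", so there is no finite total.
        return None
    return attempts * wait_s

print("defaults:", total_delay_s(DEFAULT_ATTEMPTS, DEFAULT_WAIT_S), "s")

# Hypothetical pairs with the same product but different factors:
for attempts, wait_s in [(10, 12), (12, 10), (30, 4)]:
    print(attempts, "x", wait_s, "=", total_delay_s(attempts, wait_s), "s")
```

The same total can come from more, shorter waits or fewer, longer ones, which is where the other tradeoffs come in.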


/etc/sysctl.conf has:

#
# Together this pair avoids swapping out the process kernel stacks.
# This avoids processes for interacting with the system from being
# hung-up.
vm.swap_enabled=0
vm.swap_idle_enabled=0


# more ~/src.configs/src.conf.CA7-nodbg-clang-alt.aarch64-host 
TO_TYPE=armv7
#
KERNCONF=GENERIC-NODBG-CA7
TARGET=arm
.if ${.MAKE.LEVEL} == 0
TARGET_ARCH=${TO_TYPE}
.export TARGET_ARCH
.endif
#
WITH_SYSTEM_COMPILER=
WITH_SYSTEM_LINKER=
#
WITH_ELFTOOLCHAIN_BOOTSTRAP=
#Disables avoiding bootstrap: WITHOUT_LLVM_TARGET_ALL=
WITHOUT_LLVM_TARGET_AARCH64=
WITH_LLVM_TARGET_ARM=
WITHOUT_LLVM_TARGET_MIPS=
WITHOUT_LLVM_TARGET_POWERPC=
WITHOUT_LLVM_TARGET_RISCV=
WITHOUT_LLVM_TARGET_X86=
WITH_CLANG=
WITH_CLANG_IS_CC=
WITH_CLANG_FULL=
WITH_CLANG_EXTRAS=
WITH_LLD=
WITH_LLD_IS_LD=
#
WITH_LLDB=
#
WITH_BOOT=
#
WITHOUT_WERROR=
MALLOC_PRODUCTION=
WITH_MALLOC_PRODUCTION=
WITHOUT_ASSERT_DEBUG=
WITHOUT_LLVM_ASSERTIONS=
#
# Avoid stripping but do not control host -g status as well:
DEBUG_FLAGS+=
#
WITH_REPRODUCIBLE_BUILD=
WITH_DEBUG_FILES=
#
XCFLAGS+= -mcpu=cortex-a7
XCXXFLAGS+= -mcpu=cortex-a7
# There is no XCPPFLAGS but XCPP gets XCFLAGS content.

(An armv7 host does not need content different from that for an
aarch64 host, thus the use of the *.aarch64-host file.)

However long the overall build ends up taking, the above
is part of why the details end up as they will end up.


/etc/crontab notes:

I do not know if you leave the following enabled during the long
builds or not ( from /etc/crontab ):

# Perform daily/weekly/monthly maintenance.
1       3       *       *       *       root    periodic daily
15      4       *       *       6       root    periodic weekly
30      5       1       *       *       root    periodic monthly

that can run things like "/usr/local/sbin/pkg check -qsa"
(daily example) that would compete for resources. I left them
active, so daily competed with the build for a while, but it
did not happen to overlap with the high swap space use time
frame.

I commonly disable these for builds that will span into the
hours it indicates, at least when I'm monitoring builds for
comparisons and such.
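
As a rough way to see whether a long build will span those hours, here is a small Python sketch that lists which of the three crontab entries above would fire inside a build window. The build start time and the 18 hour duration in the example are hypothetical, chosen only for illustration, and the function name is my own.

```python
from datetime import datetime, timedelta

def periodic_runs(start, end):
    """Yield (time, name) for the periodic entries firing in [start, end]."""
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day <= end:
        # 1 3 * * *  -> daily at 03:01
        candidates = [(day.replace(hour=3, minute=1), "daily")]
        # 15 4 * * 6 -> weekly at 04:15 on Saturday (crontab day-of-week 6)
        if day.isoweekday() == 6:
            candidates.append((day.replace(hour=4, minute=15), "weekly"))
        # 30 5 1 * * -> monthly at 05:30 on the 1st
        if day.day == 1:
            candidates.append((day.replace(hour=5, minute=30), "monthly"))
        for when, name in candidates:
            if start <= when <= end:
                yield when, name
        day += timedelta(days=1)

# Hypothetical window: a Friday 16:00 start and an 18 hour build.
build_start = datetime(2023, 6, 2, 16, 0)
build_end = build_start + timedelta(hours=18)
runs = list(periodic_runs(build_start, build_end))
for when, name in runs:
    print(when, name)
```

For that example window, the daily and weekly runs both land inside the build.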

===
Mark Millard
marklmi at yahoo.com