Re: Armv7 (rpi2) getting stuck in buildworld for -current
Date: Sun, 21 May 2023 07:55:53 UTC
On May 20, 2023, at 11:59, Mark Millard <marklmi@yahoo.com> wrote:

> I set up the RPi2B v1.1 and started a -j4 buildworld buildkernel
> from-scratch rebuild on/of:
>
> # uname -apKU # long output line split for readability
> FreeBSD OPiP2E_RPi2v1p1 14.0-CURRENT FreeBSD 14.0-CURRENT #74
> main-n262658-b347c2284603-dirty: Fri Apr 28 23:07:41 PDT 2023
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7
> arm armv7 1400088 1400088
>
> (The original build was done on another machine.)
>
>
> At somewhat under 18 hrs it finished with the large swap
> use during the overlapping time frame for these 4 builds:
>
> clang/libclang/CodeGen/CodeGenAction.o
> clang/libclang/CodeGen/CodeGenFunction.o
> clang/libclang/CodeGen/CodeGenModule.o
> clang/libclang/CodeGen/CodeGenPGO.o
>
> These are my notes from the information for somewhat after
> the swap use dropped off.
>
>
> My armv7 builds disable targeting other architectures but
> I also have WITH_CLANG_EXTRAS= . (Not that the build has
> gotten that far yet.) I show the controlling file content
> later below.
>
> I used no assignments for:
>
> #vm.pfault_oom_attempts=-1
> #vm.pfault_oom_attempts= 10
> #vm.pfault_oom_wait= ???
>
> but did/do have:
>
> vm.pageout_oom_seq=120
>
> vm.swap_enabled=0
> vm.swap_idle_enabled=0
>
> in use for this experiment.
>
> FYI:
> make[1]: "/usr/main-src/Makefile.inc1" line 326: SYSTEM_COMPILER: Determined that CC=cc matches the source tree. Not bootstrapping a cross-compiler.
> make[1]: "/usr/main-src/Makefile.inc1" line 331: SYSTEM_LINKER: Determined that LD=ld matches the source tree. Not bootstrapping a cross-linker.
>
> (Those 2 have significant time implications for the overall
> build.)
>
> Based on (my modified) top, sampling every 3 seconds,
>
> Mem: . . .,
> 754020Ki MaxObsActive,
> 186756Ki MaxObsWired,
> 923356Ki MaxObs(Act+Wir+Lndry)
>
> Swap: 1740Mi Total, . . .,
> 756828Ki MaxObsUsed,
> 1442Mi MaxObs(Act+Lndry+SwapUsed),
> 1615Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>
> So: slightly over 739 MiBytes of swap observed to have been
> in use at one time.
>
> As for the overlapping time's duration: file creation and
> modification times, in time order were:
>
> (via extraction from ls -TldU output:)
> 09:37:28 creation of clang/libclang/CodeGen/CodeGenAction.o
> 09:38:44 creation of clang/libclang/CodeGen/CodeGenFunction.o
> 09:40:19 creation of clang/libclang/CodeGen/CodeGenModule.o
> 09:41:28 creation of clang/libclang/CodeGen/CodeGenPGO.o
>
> (via extraction from ls -Tld output:)
> 09:47:15 modification of clang/libclang/CodeGen/CodeGenFunction.o
> 09:49:53 modification of clang/libclang/CodeGen/CodeGenAction.o
> 09:50:10 modification of clang/libclang/CodeGen/CodeGenPGO.o
> 09:54:49 modification of clang/libclang/CodeGen/CodeGenModule.o
>
> So:
>
> 09:41:28 . . . 09:47:15 (under 6 min) for the overlapping time
> frame and the highest swap space use happened inside that
> interval.
>
> During this, there were at times mixes of CPUn and "swread" STATE
> for the compiles. But at no point were all observed to be
> blocked waiting, and at no point was only 1 observed to show a
> CPUn with a large WCPU.
>
> This is largely attributable to the USB media having tiny
> latencies compared to spinning rust and having reasonable
> transfer rates for the type of I/O: NVMe USB3 media (that is
> also USB2 compatible for USB powered usage).
>
> My use of:
>
> #
> # Delay when persistent low free RAM leads to
> # Out Of Memory killing of processes:
> vm.pageout_oom_seq=120
>
> and:
>
> #
> # Together this pair avoids swapping out the process kernel stacks.
> # This avoids processes for interacting with the system from being
> # hung-up.
> vm.swap_enabled=0
> vm.swap_idle_enabled=0
>
> has not led to any problems so far.
>
>
> For reference:
>
> Via systat -vmstat I monitored . . .
>
> VN PAGER SWAP PAGER
> in out in out
> count
> pages
> ioflt . . .
> . . .
> intrn . . .
>
> Both VN in and SWAP in can contribute to ioflt, faults
> that required I/O. The ioflt number would be before the
> "ioflt" text.
>
> There is a later line that lists intrn (somewhat below
> ioflt): "in-transit blocking page faults". The intrn
> number would be before the "intrn" text.
>
> The figures varied under 600 for ioflt and intrn for
> what I saw during the large swap space use, with
> matching SWAP activity, no significant VN activity.
> (The figures are for an about 5 second update interval,
> as I remember.) (I watched the on screen updates.
> I did not try to capture the material in a file.)
>
> I expect that these figures would be large for a
> sustained period in your context.
>
>
> I also monitored with "gstat -spod". I assume that gstat is
> more familiar. (I use -spod even when "d" happens to not be
> going to show any activity.)
>
>
> [I do not recommend leaving "systat -swap" running: it
> accumulates a large set of memory leaks and so can
> mess up tracking swap space use by being a significant
> contributor. I did not put it to significant use other
> than discovering that problem.]
>
>
> Configuration points . . .
>
>
> /boot/efi/config.txt has:
>
> enable_uart=1
> dtoverlay=mmc
> #
> # Local addition that avoids (at least) USB3 SSD boot failures that look like:
> # uhub_reattach_port: port ? reset failed, error=USB_ERR_TIMEOUT
> # uhub_reattach_port: device problem (USB_ERR_TIMEOUT), disabling port ?
> initial_turbo=60
> #
> # Local additions:
> uart_2ndstage=1
> dtdebug=1
> kernel=u-boot.bin.2023.01.armv7
> kernel7=u-boot.bin.2023.01.armv7
> dtoverlay=disable-bt
> #
> force_turbo=1
>
> ( /etc/rc.conf has powerd commented out. )
> (I build u-boot with a couple of settings added.)
> (Leaving initial_turbo in place allows disabling
> force_turbo independently --but still allowing
> the USB booting to work during the temporary
> turbo status. initial_turbo is not required when
> force_turbo is enabled --but does not hurt.)
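
As a cross-check of the "under 6 min" and "slightly over 739 MiBytes"
figures in the notes above, here is a minimal sketch. It assumes all
eight timestamps fall on the same calendar day and, following the notes,
treats the span from the last .o creation to the first .o modification
as the window in which all 4 compiles overlapped. The variable and
function names are just for illustration.

```python
from datetime import datetime

# Timestamps copied from the ls -TldU (creation) and ls -Tld
# (modification) extracts quoted above.
# Assumption: all of them are on the same calendar day.
created = {
    "CodeGenAction.o": "09:37:28",
    "CodeGenFunction.o": "09:38:44",
    "CodeGenModule.o": "09:40:19",
    "CodeGenPGO.o": "09:41:28",
}
modified = {
    "CodeGenFunction.o": "09:47:15",
    "CodeGenAction.o": "09:49:53",
    "CodeGenPGO.o": "09:50:10",
    "CodeGenModule.o": "09:54:49",
}


def parse(hms):
    """Parse an HH:MM:SS string into a datetime (date part is a dummy)."""
    return datetime.strptime(hms, "%H:%M:%S")


# All four compiles overlap from the last creation to the first modification.
overlap_start = max(parse(t) for t in created.values())
overlap_end = min(parse(t) for t in modified.values())
overlap = overlap_end - overlap_start

# MaxObsUsed from the modified top output, converted from KiB to MiB.
max_obs_swap_kib = 756828
max_obs_swap_mib = max_obs_swap_kib / 1024

print(overlap)                     # 0:05:47
print(round(max_obs_swap_mib, 2))  # 739.09
```
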
>
> /boot/loader.conf has:
>
> #
> # Delay when persistent low free RAM leads to
> # Out Of Memory killing of processes:
> vm.pageout_oom_seq=120
> #
> # For plenty of swap/paging space (will not
> # run out), avoid pageout delays leading to
> # Out Of Memory killing of processes:
> #vm.pfault_oom_attempts=-1
> #
> # For possibly insufficient swap/paging space
> # (might run out), increase the pageout delay
> # that leads to Out Of Memory killing of
> # processes:
> #vm.pfault_oom_attempts= 10
> #vm.pfault_oom_wait= ???
> # (The multiplication is the total but there
> # are other potential tradeoffs in the factors
> # multiplied, even for nearly the same total.)
>
> (As I understand it, you are now using defaults for
> vm.pfault_oom_attempts and vm.pfault_oom_wait .
> So I did as well for those 2 for this experiment.)
>
>
> /etc/sysctl.conf has:
>
> #
> # Together this pair avoids swapping out the process kernel stacks.
> # This avoids processes for interacting with the system from being
> # hung-up.
> vm.swap_enabled=0
> vm.swap_idle_enabled=0
>
>
> # more ~/src.configs/src.conf.CA7-nodbg-clang-alt.aarch64-host
> TO_TYPE=armv7
> #
> KERNCONF=GENERIC-NODBG-CA7
> TARGET=arm
> .if ${.MAKE.LEVEL} == 0
> TARGET_ARCH=${TO_TYPE}
> .export TARGET_ARCH
> .endif
> #
> WITH_SYSTEM_COMPILER=
> WITH_SYSTEM_LINKER=
> #
> WITH_ELFTOOLCHAIN_BOOTSTRAP=
> #Disables avoiding bootstrap: WITHOUT_LLVM_TARGET_ALL=
> WITHOUT_LLVM_TARGET_AARCH64=
> WITH_LLVM_TARGET_ARM=
> WITHOUT_LLVM_TARGET_MIPS=
> WITHOUT_LLVM_TARGET_POWERPC=
> WITHOUT_LLVM_TARGET_RISCV=
> WITHOUT_LLVM_TARGET_X86=
> WITH_CLANG=
> WITH_CLANG_IS_CC=
> WITH_CLANG_FULL=
> WITH_CLANG_EXTRAS=
> WITH_LLD=
> WITH_LLD_IS_LD=
> #
> WITH_LLDB=
> #
> WITH_BOOT=
> #
> WITHOUT_WERROR=
> MALLOC_PRODUCTION=
> WITH_MALLOC_PRODUCTION=
> WITHOUT_ASSERT_DEBUG=
> WITHOUT_LLVM_ASSERTIONS=
> #
> # Avoid stripping but do not control host -g status as well:
> DEBUG_FLAGS+=
> #
> WITH_REPRODUCIBLE_BUILD=
> WITH_DEBUG_FILES=
> #
> XCFLAGS+= -mcpu=cortex-a7
> XCXXFLAGS+= -mcpu=cortex-a7
> # There is no XCPPFLAGS but XCPP gets XCFLAGS content.
>
> (An armv7 host does not need content different from that of an
> aarch64 host, thus the use of the *.aarch64-host file.)
>
> However long the overall build ends up taking, the above
> is part of why the details end up as they will end up.
>
>
> /etc/crontab notes:
>
> I do not know if you leave the following enabled during the long
> builds or not ( from /etc/crontab ):
>
> # Perform daily/weekly/monthly maintenance.
> 1 3 * * * root periodic daily
> 15 4 * * 6 root periodic weekly
> 30 5 1 * * root periodic monthly
>
> that can run things like "/usr/local/sbin/pkg check -qsa"
> (daily example) that would compete for resources. I left them
> active, so daily competed with the build for a while, but it
> did not happen to overlap with the high swap space use time
> frame.
>
> I commonly disable these for builds that will span into the
> hours they indicate, at least when I'm monitoring builds for
> comparisons and such.
>

FYI: the buildworld just completed a bit ago:

World built in 117913 seconds, ncpu: 4, make -j4

So, somewhat under 33 hrs for what and how I build, given
the media I use. The buildkernel is in progress.

===
Mark Millard
marklmi at yahoo.com
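
P.S. For reference, the "somewhat under 33 hrs" figure is just the
reported "World built in" seconds converted to hours, minutes, and
seconds. A minimal sketch of that arithmetic (the variable names are
only illustrative):

```python
# Seconds reported by the "World built in ..." line of the build output.
world_seconds = 117913

# Split into whole hours, then minutes and seconds from the remainder.
hours, remainder = divmod(world_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

print(f"{hours}h {minutes}m {seconds}s")  # 32h 45m 13s
```
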