Re: Call for Foundation-supported Project Ideas (buildworld buildkernel time issue)
Date: Tue, 30 Nov 2021 03:04:41 UTC
From: Mark Millard (marklmi at yahoo.com)

Steve Kargl <sgk_at_troutmask.apl.washington.edu> wrote on
Date: Sun, 28 Nov 2021 14:07:32 -0800 :

> 1.) Replace clang with something/anything that is more performant.
> Going on day 3 of "make buildworld". Still in the lib/clang/libclang
> directory.

Just an FYI for comparison: An appropriately configured 8 GiByte RPi4B builds such in much less time than that: under 10 hours. Building the system llvm materials is included in the measured example below, but not a bootstrap compiler or linker. (This is the type of build example I give below because it was also handy for something I want to do.)

I'd not call a well-configured 8 GiByte RPi4B high-end these days. But nor is it low-end as far as small board computers go. (Hardware like the MACCHIATObin Double Shot [4 Cortex-A72 cores, 16 GiBytes of RAM installed] and the old OverDrive 1000 [4 Cortex-A57 cores, 8 GiBytes of RAM installed] are/were not SBCs and take/took noticeably less time, based mostly on a more performant RAM + RAM-caching implementation from what I've seen. Despite its slower clock rate and older Cortex variant, the OverDrive 1000 historically took the least time of the 3, again mostly for RAM + RAM-caching tied performance reasons from what I saw.)

The following is for a from-scratch debug build of main [so: 14] being built by a non-debug system that was built from the same source. Thus the WITH_META_MODE= that is in use adds some overhead to the specific build. It is an example where the system compiler and linker are built only once: bootstrapping copies are not built. That would add some time but is not needed often. (I've no clue if your 2+ day build built a bootstrap compiler and/or linker or not.)

--- buildworld ---
make[1]: "/usr/main-src/Makefile.inc1" line 340: SYSTEM_COMPILER: Determined that CC=cc matches the source tree. Not bootstrapping a cross-compiler.
make[1]: "/usr/main-src/Makefile.inc1" line 345: SYSTEM_LINKER: Determined that LD=ld matches the source tree.
Not bootstrapping a cross-linker.

It is a -j4 build (there are 4 cores in the RPi4B).

buildworld time:

World build completed on Mon Nov 29 18:12:55 PST 2021
World built in 23919 seconds, ncpu: 4, make -j4

So: somewhat under 6.7 hours.

buildkernel time:

Kernel build for GENERIC-DBG-CA72 completed on Mon Nov 29 18:40:44 PST 2021
Kernel(s) GENERIC-DBG-CA72 built in 1669 seconds, ncpu: 4, make -j4

So: somewhat under 0.5 hours.

Total time: 23919 sec + 1669 sec == 25588 sec

So: somewhat under 7.2 hours, but say under 10 hours to allow for some variation in what might be built and the like.

For reference for the building environment:

# uname -apKU
FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #22 main-n250972-319e9fc642a1-dirty: Tue Nov 23 12:25:36 PST 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400042 1400042

Even my "nodbg" builds include debug information, despite being optimized. It is the kernel's debug features that have been disabled.

I give the src.conf configuration later. The various WITHOUT_LLVM_TARGET_*'s do save some time, but not huge amounts of it relative to the times reported here. However, I also use WITH_CLANG_EXTRAS=, which adds some time.

I buildworld and buildkernel with -mcpu=cortex-a72 involved, a type of thing I only do for lower end systems, not for something like a ThreadRipper 1950X.

The build never used the swap space. My patched top (that tracks and reports various maximum-observed figures) reported:

. . .
Mem: . . . 2380Mi MaxObsActive, 3866Mi MaxObsWired, 4941Mi MaxObs(Act+Wir+Lndry)
. . .
Swap: 14336Mi Total, 14336Mi Free, 2380Mi MaxObs(Act+Lndry+SwapUsed), 4941Mi MaxObs(Act+Wir+Lndry+SwapUsed)

(UFS tends to get very different Wired figures and, so, also different values for various other figures.)

The 8 GiByte RPi4B is using USB3 portable SSD media (a T7 Touch).
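As a cross-check of the timing arithmetic reported above, here is a minimal sh sketch. The two second counts are copied from the "built in" lines; the fmt_hm helper is hypothetical (introduced only for this illustration, not part of the FreeBSD build system), and it rounds down to whole minutes.

```shell
#!/bin/sh
# Build times in seconds, copied from the "World built in" and
# "Kernel(s) ... built in" lines of the build output.
world_s=23919
kernel_s=1669
total_s=$((world_s + kernel_s))

# fmt_hm is a hypothetical helper for this sketch only: print a whole
# number of seconds as hours and minutes, rounding down.
fmt_hm() {
    printf '%dh %02dm' "$(($1 / 3600))" "$(($1 % 3600 / 60))"
}

echo "buildworld:  $(fmt_hm "$world_s")"
echo "buildkernel: $(fmt_hm "$kernel_s")"
echo "total:       ${total_s} s = $(fmt_hm "$total_s")"
```

The results should agree with the "somewhat under 6.7 hours", "somewhat under 0.5 hours", and "somewhat under 7.2 hours" figures.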
The media that I used is set up with root-on-ZFS (no UFS use), but historically root-on-UFS (no ZFS use) has not been a large variation. I could time via the UFS-based media if it is of interest (also T7 Touch media).

The RPi4B has heat sinks and a case with a fan. I use a CanaKit 5.1V 3.5A power supply.

I have:

over_voltage=6
arm_freq=2000
sdram_freq_min=3200
force_turbo=1

in the RPi4B's config.txt . These settings are ones that were set to work well with every RPi4B that I've used, with some margin. (All have heat sinks, a case with fan, and a 5.1V 3.5A power supply, so I've not tested other contexts.)

The src.conf sort of material looks like:

# more ~/src.configs/src.conf.CA72-dbg-clang.aarch64-host
TO_TYPE=aarch64
TOOLS_TO_TYPE=${TO_TYPE}
#
KERNCONF=GENERIC-DBG-CA72
TARGET=arm64
.if ${.MAKE.LEVEL} == 0
TARGET_ARCH=${TO_TYPE}
.export TARGET_ARCH
.endif
#
#WITH_CROSS_COMPILER=
WITH_SYSTEM_COMPILER=
WITH_SYSTEM_LINKER=
#
#WITH_LLD_BOOTSTRAP=
WITH_ELFTOOLCHAIN_BOOTSTRAP=
#Disables avoiding bootstrap: WITHOUT_LLVM_TARGET_ALL=
WITH_LLVM_TARGET_AARCH64=
WITH_LLVM_TARGET_ARM=
WITHOUT_LLVM_TARGET_MIPS=
WITHOUT_LLVM_TARGET_POWERPC=
WITHOUT_LLVM_TARGET_RISCV=
WITHOUT_LLVM_TARGET_X86=
#WITH_CLANG_BOOTSTRAP=
WITH_CLANG=
WITH_CLANG_IS_CC=
WITH_CLANG_FULL=
WITH_CLANG_EXTRAS=
WITH_LLD=
WITH_LLD_IS_LD=
WITH_LLDB=
#
WITH_BOOT=
#
#
WITHOUT_WERROR=
#WERROR=
#MALLOC_PRODUCTION=
WITHOUT_MALLOC_PRODUCTION=
WITH_ASSERT_DEBUG=
WITH_LLVM_ASSERTIONS=
#
# Avoid stripping but do not control host -g status as well:
DEBUG_FLAGS+=
#
WITH_REPRODUCIBLE_BUILD=
WITH_DEBUG_FILES=
#
XCFLAGS+= -mcpu=cortex-a72
XCXXFLAGS+= -mcpu=cortex-a72
# There is no XCPPFLAGS but XCPP gets XCFLAGS content.
ACFLAGS.arm64cpuid.S+= -mcpu=cortex-a72+crypto
ACFLAGS.aesv8-armx.S+= -mcpu=cortex-a72+crypto
ACFLAGS.ghashv8-armx.S+= -mcpu=cortex-a72+crypto

(Comments about why specific options were not used for reasons of some odd consequence once observed may not have been checked in some time.
Options commented out without such notes are just simple choices, not driven by such oddities.)

One thing that can slow down builds if there is rapid build output at times: serial console handling of that output. (Very noticeable for installworld and installkernel to a directory.) I used an ssh session to avoid the potential contribution to the time.

The OverDrive 1000 died some time ago, but I still have access to the MACCHIATObin Double Shot and I could run a timing test on it for building the same sources that same way. (Same Cortex-A72 clock rate in use as used in the RPi4B test: 2.0 GHz.)

Hmm. The buildkernel got a bunch of:

ERROR: ctfconvert: failed to get mapping for tid ????? <????>

notices. I do not expect the issue changed the time much, but I note them just in case.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)