armv8.2-A+ tuned FreeBSD kernels and buildworld buildkernel times: an example
Date: Sat, 29 Apr 2023 19:16:39 UTC
Context: all worlds and kernels involved/built are non-debug style.

Note: clang15 through LLVM main (so far) gets the feature list for
cortex-a78c wrong in both directions: it omits a feature the core
has and enables one it lacks. So I also used +flagm+nofp16fml .
(The cortex-x1c also has such problems, but the details are
different.)

Notation in table below:

CA72:  matching world or kernel had been built using
       -mcpu=cortex-a72
CA78C: matching world or kernel had been built using
       -mcpu=cortex-a78C+flagm+nofp16fml

System: Windows Dev Kit 2023 (4 cortex-a78c's and 4 cortex-x1c's):
(both: armv8.2-A with a few more modern features)

Times to build system from scratch (buildworld buildkernel from
same sources) . . .

System running:                     World built in:  kernel built in:
CA72  kernel, CA72  world           6601 sec         597 sec
CA78C kernel, CA78C world           4680 sec         413 sec
CA78C kernel, CA72  world (chroot)  4715 sec         422 sec

The CA72/CA72 is from before I'd built the CA78C world and kernel.
All builds used -j8 . None had competing activity on the machine.

What this suggests is that having an explicitly armv8.2+ tuned
kernel makes a notable difference for -j8 buildworld buildkernel
times on aarch64: with the CA78C kernel, the times are about the
same whether the world is CA78C or CA72.

The Windows Dev Kit 2023 is the first (and only) armv8.1+ based
system that I've had access to. So testing such properties is
limited to the one context. Also, I've not had access to the
Windows Dev Kit 2023 for long: these are first experiments.

Notes on my historically-usual aarch64 builds:

On cortex-a72 hardware, my context is -mcpu=cortex-a72 based. This
once exposed a lack of sufficient synchronization in a place in the
USB subsystem. (Running the same system on cortex-a53 hardware did
not fail. Running -mcpu=cortex-a53 based world+kernel on a
cortex-a72 did not fail. A cortex-a53 hardware running the
-mcpu=cortex-a53 based world+kernel did not fail.)

Until the hardware failed, there was a time when I also had access
to a cortex-a57 FreeBSD system.

I do not do such -mcpu= tailoring on the only FreeBSD amd64 that
I've access to, a ThreadRipper 1950X.
I do such only for the lower end systems that I have access to.
My aarch64 access is all to lower end, not upper end.

===
Mark Millard
marklmi at yahoo.com
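
For reading the timing table as ratios, here is a minimal sketch in
Python. It only redoes arithmetic on the three rows reported above;
the names times and speedups are made up for this illustration and
are not part of any FreeBSD tooling.

```python
# Build times in seconds, copied from the table in this message:
# (buildworld seconds, buildkernel seconds) per running system.
times = {
    "CA72 kernel, CA72 world": (6601, 597),
    "CA78C kernel, CA78C world": (4680, 413),
    "CA78C kernel, CA72 world (chroot)": (4715, 422),
}


def speedups(times, baseline):
    """Return (world, kernel) time ratios of the baseline row over each row.

    A ratio above 1.0 means that row built faster than the baseline.
    """
    base_world, base_kernel = times[baseline]
    return {
        name: (base_world / world, base_kernel / kernel)
        for name, (world, kernel) in times.items()
    }


ratios = speedups(times, "CA72 kernel, CA72 world")
for name, (world_ratio, kernel_ratio) in ratios.items():
    print(f"{name}: buildworld {world_ratio:.2f}x, "
          f"buildkernel {kernel_ratio:.2f}x")
```

On these figures, both rows that ran the CA78C kernel come out at
roughly 1.4x the speed of the CA72/CA72 row, for buildworld and
for buildkernel, whichever world was in use.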