Re: armv8.2-A+ tuned FreeBSD kernels and buildworld buildkernel times: an example
- In reply to: Mark Millard : "armv8.2-A+ tuned FreeBSD kernels and buildworld buildkernel times: an example"
Date: Sat, 29 Apr 2023 21:09:17 UTC
On Apr 29, 2023, at 12:16, Mark Millard <marklmi@yahoo.com> wrote:

> Context: all worlds and kernels involved/built are non-debug style.
>
> Note: clang15 through LLVM main (so far) has errors in both directions
> for the features for cortex-a78c. So I also used +flagm+nofp16fml .
> (The cortex-x1c also has such problems, but the details are
> different.)
>
> Notation in table below:
> CA72:  matching world or kernel had been built using -mcpu=cortex-a72
> CA78C: matching world or kernel had been built using -mcpu=cortex-a78C+flagm+nofp16fml
>
> System: Windows Dev Kit 2023 (4 cortex-a78c's and 4 cortex-x1c's):
> (both: armv8.2-A with a few more modern features)
>
> Times to build system from scratch (buildworld buildkernel from same
> sources) . . .
>
> System running:                     World built in:  kernel built in:
> CA72 kernel, CA72 world             6601 sec         597 sec
> CA78C kernel, CA78C world           4680 sec         413 sec
> CA78C kernel, CA72 world (chroot)   4715 sec         422 sec
>
> The CA72/CA72 is from before I'd built the CA78C world and kernel.
> All builds used -j8 . None had competing activity on the machine.
>
> What this suggests is having an explicitly armv8.2+ tuned kernel
> makes a notable difference for -j8 buildworld buildkernel times
> on aarch64.

"Tuned" here includes newer-feature use, so the result is incompatible
with the likes of armv8.0-A hardware, for example. The FEAT_LSE atomics
use would be an example. But I've done nothing to investigate subsetting
the new-feature use to isolate what makes the biggest contributions
to the elapsed-time decrease.

> The Windows Dev Kit 2023 is the first (and only) armv8.1+ based
> system that I've had access to. So testing such properties is
> limited to the one context.
>
> Also, I've not had access to the Windows Dev Kit 2023 for long:
> first experiments.
>
>
> Notes on my historically-usual aarch64 builds:
>
> On cortex-a72 hardware, my context is -mcpu=cortex-a72 based. This
> once exposed a lack of sufficient synchronization in a place in
> the USB subsystem. (Running the same system on cortex-a53 hardware
> did not fail. Running -mcpu=cortex-a53 based world+kernel on a
> cortex-a72 did not fail. A cortex-a53 hardware running the
> -mcpu=cortex-a53 based world+kernel did not fail.)
>
> Until the hardware failed, there was a time when I also had
> access to a cortex-a57 FreeBSD system.
>
> I do not do such -mcpu= tailoring on the only FreeBSD amd64 that
> I've access to, a ThreadRipper 1950X. I do such only for the lower
> end systems that I have access to. My aarch64 access is all to
> lower end, not upper end.

I should have reported that my recent activity for this is based on:
main-n262658-b347c2284603-dirty, b347c2284603 being from late
Apr 28, 2023 UTC. (The "-dirty" is from some historical patches that
I use.) Some of my activity has been from somewhat earlier, but I
wanted to pick up another openzfs fix or two that had happened since
then.

===
Mark Millard
marklmi at yahoo.com
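
For anyone wanting to put numbers on "notable difference": below is a minimal Python sketch that turns the three rows of the table in the quoted message into percent reductions in elapsed time relative to the CA72 kernel + CA72 world run. The times are copied from the table; the names TIMES, BASELINE, percent_reduction and summarize are made up for this illustration and are not part of any FreeBSD tooling.

```python
# Elapsed build times in seconds, copied from the table in the quoted message.
# The dictionary layout and all names here are illustrative assumptions.
TIMES = {
    "CA72 kernel, CA72 world": {"world": 6601, "kernel": 597},
    "CA78C kernel, CA78C world": {"world": 4680, "kernel": 413},
    "CA78C kernel, CA72 world (chroot)": {"world": 4715, "kernel": 422},
}
BASELINE = "CA72 kernel, CA72 world"


def percent_reduction(baseline_sec, new_sec):
    """Percent decrease in elapsed time relative to the baseline run."""
    return 100.0 * (baseline_sec - new_sec) / baseline_sec


def summarize(times=TIMES, baseline=BASELINE):
    """Map each non-baseline configuration to its rounded percent reductions."""
    base = times[baseline]
    return {
        name: {
            phase: round(percent_reduction(base[phase], t[phase]), 1)
            for phase in ("world", "kernel")
        }
        for name, t in times.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, row in summarize().items():
        print(f"{name}: buildworld {row['world']}% less time, "
              f"buildkernel {row['kernel']}% less time")
```

Both CA78C-kernel rows come out at roughly 29% to 31% less elapsed time than the CA72/CA72 run, while the two CA78C-kernel rows differ from each other by well under 1% for buildworld (4680 sec vs. 4715 sec). That matches the reading that the kernel, not the world, accounts for most of the change. These are single runs on one machine, so the figures are only indicative.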