Re: How to investigate an unexpected port build time-taken relationship in an aarch64 context?
Date: Thu, 29 Feb 2024 19:00:30 UTC
[The armv7 bulk build variation of that test proves to be an interesting contrast with a prior result.] On Feb 29, 2024, at 10:04, Mark Millard <marklmi@yahoo.com> wrote: > [A different test case also gets the shorter time frame.] > > On Feb 28, 2024, at 22:59, Mark Millard <marklmi@yahoo.com> wrote: > >> On Feb 28, 2024, at 18:46, Mark Millard <marklmi@yahoo.com> wrote: >> >>> Example HW Context: Windows Development Kit 2023 >>> 8 cores: 4 cortex-A78C's and 4 cortex-X1C's >>> Headless: serial console and ssh access, no x11 or the like installed. >>> UFS use. >>> >>> Note: cortex-A76's are missing 3 or so instruction set >>> features compared to A78C/X1C parts. Use of >>> -mcpu=cortex-a76 generated code is compatibile (and would allow >>> the code to run on a cortex-a76 system, such as an RPi5 once >>> supported). >>> >>> I've been doing poudriere-devel bulk timing experiments based on: >>> >>> A) PkgBase based system software (kernel and world) and >>> general use of default code generation for ports and >>> such. >>> >>> B) A personal -mcpu=cortex-a76 based kernel, world, port builds >>> (into packages via poudriere-devel). >>> >>> C) Also use of an armv7 poudriere jail based on armv7 PkgBase >>> and default armv7 code generation. This was used in both the >>> (A) and (B) contexts. These also show what I'm curious about. >>> >>> Using the armv7 poudriere jail context for illustration: >>> >>> For (B) used via the armv7 context: >>> >>> [05:40:24] [03] [04:55:38] Finished lang/rust | rust-1.76.0: Success >>> . . . >>> [05:45:58] [01] [05:01:12] Finished devel/llvm18@default | llvm18-18.1.0.r3: Success >>> [05:46:00] [01] [00:00:00] Building devel/boost-libs | boost-libs-1.84.0 >>> [06:59:23] [01] [01:13:23] Finished devel/boost-libs | boost-libs-1.84.0: Success >>> >>> For (A) used via the armv7 poudriere jail context: >>> >>> [06:33:21] [01] [05:40:48] Finished lang/rust | rust-1.76.0: Success >>> . . . >>> [06:40:05] [05] [05:48:09] Finished devel/llvm18@default | llvm18-18.1.0.r3: Success >>> [06:40:07] [01] [00:00:00] Building devel/boost-libs | boost-libs-1.84.0 >>> [06:57:48] [01] [00:17:41] Finished devel/boost-libs | boost-libs-1.84.0: Success >>> >>> The curiosity is about the 01:13:23 vs. 00:17:41 boost-libs: The >>> ratio is large and in the opposite direction to most time trends. >>> >>> Notes: Almost all the time llvm18 and rust were building, both were >>> building but little else did and the load average was 16+ from the >>> llvm18/rust build activity. When boost-libs was building it was the >>> only thing building and it looked to be single threaded when I >>> was watching. >> >> I should have been explicit that the 01:13:23 was mostly >> "stage" phase (not "build" phase) and I was referring to >> the "stage" phase as far as single threaded is concerned. >> >>> (A) and (B) without use of the armv7 context got similar results >>> when I first noticed this but I'm going back and recording times >>> for some variations. I do not have those to report other >>> pairs of results yet. >>> >>> (In the armv7 poudriere jail context reported:) >>> (B) takes less time for llvm18 and rust than (A) does. >>> (A) takes vastly less time for boost-libs than (B) does, >>> approximately a factor of 4 for the time-ratio. >>> >>> I'd be curious to get a clue what contributes to the boost-libs >>> time ratio being so extreme once I have figures for other >>> combinations of poudriere jail content vs. the system's content. >> >> Turns out that for the aarch64 jail (PkgBase system and >> default code generation), stage started about 10 min into >> the boost-libs activity. Package started somewhat under 5 >> minutes later. End result: >> >> [05:55:56] [03] [05:33:12] Finished lang/rust | rust-1.76.0: Success >> . . . >> [06:04:37] [01] [05:41:53] Finished devel/llvm18@default | llvm18-18.1.0.r3: Success >> [06:04:39] [01] [00:00:00] Building devel/boost-libs | boost-libs-1.84.0 >> [06:20:50] [01] [00:16:11] Finished devel/boost-libs | boost-libs-1.84.0: Success >> >> So, very similar to the armv7 jail result for a PkgBase >> system context (aarch64 boot and aarch64/armv7 PkgBase >> jail, default code generation). >> >> It appears that the boost-libs "stage" phase is the context >> for my question. >> >> For the jail code generation being based on -mcpu=cortex-a76 >> code generation but the boot having been PkgBase based: >> >> Stage started about 11.5 min into the boost-libs activity. >> Package started around 48 minutes later. End result >> (showing only boost-libs): >> >> [01:07:01] [01] [01:06:31] Finished devel/boost-libs | boost-libs-1.84.0: Success >> >> I'll note that bjam stays around 100% CPU in top during this >> much longer "stage" phase. Definitely less than the 01:13:23 >> time. MWCHAN "-", STAT RJ, PRI 135 when I looked. >> >> For the jail code generation and boot context both being based >> on -mcpu=cortex-a76 code generation: >> >> [05:16:38] [01] [00:49:22] Finished devel/boost-libs | boost-libs-1.83.0_1: Success >> >> Also definitely less than the 01:13:23 time. (I'm showing here an >> earlier test when it was boost-libs v1.83.) >> >> A ZFS context (instead of UFS context) showed: >> >> [04:37:47] [03] [04:03:16] Finished lang/rust | rust-1.76.0: Success >> . . . >> [04:43:47] [01] [04:09:16] Finished devel/llvm18@default | llvm18-18.1.0.r3: Success >> [04:43:48] [01] [00:00:00] Building devel/boost-libs | boost-libs-1.84.0 >> [05:41:46] [01] [00:57:58] Finished devel/boost-libs | boost-libs-1.84.0: Success > > Another test case . . . > > So I booted the -mcpu-cortex-a76 kernel and world (kernel > configures to use LSE atomics) and chroot'd into a aarch64 > PkgBase world that has a PkgBase jail for poudiere use and > that uses default code generation. > > Stage started about 9 min into the boost-libs activity. > Package started around 4.5 minutes later. End result > (showing only boost-libs): > > [00:15:56] [01] [00:15:48] Finished devel/boost-libs | boost-libs-1.84.0: Success > Given that, I then tried the armv7 poudriere jail from inside the chroot into the PkgBase aarch64, boot kernel and world still being the -mcpu=cortex-a76 and LSE atomics context: [00:15:08] [01] [00:14:57] Finished devel/boost-libs | boost-libs-1.84.0: Success But my earlier reported test without that chroot involved (so from a -mcpu=cortex-a76 world) was 01:13:23 instead of 00:14:57 : [06:59:23] [01] [01:13:23] Finished devel/boost-libs | boost-libs-1.84.0: Success So it appears that having the poudriere armv7 jail matched with a closest-containing PkgBase aarch64 world (chroot here), and a kernel using LSE atomics, gets the shorter time frame for the armv7 context. (I've never seen the shorter time frame on a cortex-a72/a53 for which LSE atomics is not an option, non matter if armv7 was involved or not.) === Mark Millard marklmi at yahoo.com