poudriere-devel for UFS vs. ZFS: time for "Starting/Cloning builders", aarch64 example (HoneyComb with 16 Cortex-A72's)
Date: Thu, 09 Sep 2021 23:35:11 UTC
The following is from the same system but different boot media selected
in EDK2's UEFI/ACPI. A summary for a 16-core HoneyComb Cortex-A72
system is UFS: 80+ sec vs. ZFS: 10 sec or so. I am limited to at most
16 cores for aarch64.

HoneyComb (16 cpu) UFS (so: NO_ZFS=yes) with USE_TMPFS="data" (main system) :

[00:00:28] Building 476 packages using 16 builders
[00:00:28] Starting/Cloning builders
[00:01:49] Hit CTRL+t at any time to see build progress and stats
[00:01:49] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.17.1

HoneyComb (16 cpu) ZFS with USE_TMPFS="data" (main system) :

[00:00:13] Building 475 packages using 16 builders
[00:00:13] Starting/Cloning builders
[00:00:23] Hit CTRL+t at any time to see build progress and stats
[00:00:23] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.17.1

Both of these were after recently booting.

Past experience is that UFS gets more than proportionally worse as the
FreeBSD cpu count goes up. But I have only had 4 and 16 cpus for
aarch64 and, until recently, had no ZFS context. I only had 4 cpus
(2 sockets/2 cores each) back in my old PowerMac "Quad Core" usage
days, but the effect was demonstrable back then as well. (The Quad
Core PowerMacs bit the dust.)

Past investigations suggested that the parallel cpdup's were spending
much time spinning but not making progress, ending up showing getblk
status during the spinning. (A way to observe this is noted near the
end of this message.)

I have at times modified the common.sh script to have the:

# Jail might be lingering from previous build. Already recursively
# destroyed all the builder datasets, so just try stopping the jail
# and ignore any errors
stop_builder "${id}"
mkdir -p "${mnt}"
clonefs ${MASTERMNT} ${mnt} prepkg

code instead executed in a sequential loop just before the
"parallel_start". This helped cut the time for a UFS context by
avoiding (busy) wait time. (A sketch of that rearrangement is also
near the end of this message.)

For reference (the ZFS example matches but for node names):

# uname -apKU
FreeBSD CA72_UFS 14.0-CURRENT FreeBSD 14.0-CURRENT #13 main-n249019-0637070b5bca-dirty: Sat Sep 4 18:12:39 PDT 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400032 1400032

# cd /usr/ports
# ~/fbsd-based-on-what-commit.sh
branch: main
merge-base: b0c4eaac2a3aa9bc422c21b9d398e4dbfea18736
merge-base: CommitDate: 2021-09-07 21:55:24 +0000
b0c4eaac2a3a (HEAD -> main, freebsd/main, freebsd/HEAD) security/suricata: Add patch for upstream locking fix
n557269 (--first-parent --count for merge-base)

But the issue is not new. The effect is visible on the ThreadRipper
1950X (32 FreeBSD cpus). (Again, one system but different boot media
selected, also the same main and ports vintages.)

ThreadRipper 1950X (32 cpu) UFS with USE_TMPFS=yes (releng/13.0) :

[00:00:11] Building 111 packages using 32 builders
[00:00:11] Starting/Cloning builders
[00:00:51] Hit CTRL+t at any time to see build progress and stats
[00:00:51] [01] [00:00:00] Building devel/meson | meson-0.59.1

ThreadRipper 1950X (32 cpu) ZFS with USE_TMPFS=yes (main) :

[00:00:07] Building 535 packages using 32 builders
[00:00:07] Starting/Cloning builders
[00:00:18] Hit CTRL+t at any time to see build progress and stats
[00:00:18] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.17.1

So UFS is about 40 sec vs. ZFS about 11 sec. (I never did the
contrasting ZFS case for the old PowerMacs and no longer have access
to such.)

I am guessing that the UFS context would not scale well to having 2 or
more times as many FreeBSD cpus. (I do not have access to such.)
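For completeness: the "NO_ZFS=yes" and USE_TMPFS notations in the
labels above refer to poudriere.conf settings. A minimal sketch,
assuming the stock /usr/local/etc/poudriere.conf location (the rest of
the file is omitted and the pool name is only an illustration):

# In /usr/local/etc/poudriere.conf for the UFS-boot context:
NO_ZFS=yes          # do not expect/use ZFS datasets for poudriere
USE_TMPFS="data"    # tmpfs limited to the "data" use (not wrkdir/localbase)

# In the ZFS-boot context NO_ZFS stays unset and ZPOOL names the pool:
#ZPOOL=zroot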
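The sequential-loop change mentioned above, in rough form. This is
only a sketch of the shape of the rearrangement, not a patch against
current poudriere-devel: the JOBS list and the way the per-builder
${mnt} is derived are assumptions here, and the real common.sh may
spell both differently:

# Sketch only: clone each builder's filesystem serially, ahead of the
# parallel phase, so the cpdup's do not contend with each other.
for id in ${JOBS}; do
        mnt="${MASTERMNT%/ref}/${id}"   # assumed per-builder mount path
        stop_builder "${id}"            # jail might linger from a prior build
        mkdir -p "${mnt}"
        clonefs ${MASTERMNT} ${mnt} prepkg
done
parallel_start
# ... the per-builder startup still goes through parallel_run as
# before, but now finds each clone already in place ...

For ZFS the clonefs step is a cheap snapshot/clone, so serializing it
costs little; for UFS it is the parallel cpdup copies that were piling
up on each other, which is what the serialization avoids.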
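As for observing the getblk behavior: nothing poudriere-specific is
needed, just the normal process-status tools while the
"Starting/Cloning builders" phase is running on UFS. For example:

# MWCHAN is the wait channel; during the slow UFS starts the cpdup
# processes largely sit in "getblk" there. Ctrl+T on the controlling
# terminal reports the same state in square brackets.
ps -axl | grep cpdup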
I expect that other times when poudriere involves other parallel cpdup
activity for UFS, it also ends up with the "getblk status" issue in
some way.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early
 2018-Mar)