Re: A better alternative to having builds of main-armv7-default fully disabled and last-built be months out of date
Date: Mon, 15 Jul 2024 05:02:03 UTC
On Jul 14, 2024, at 17:20, Philip Paeps <philip@freebsd.org> wrote:

> Sorry for not following up to this thread earlier.
> I've been occupied elsewhere in the cluster.
>
> On 2024-07-07 16:25:32 (+0800), Mark Millard wrote:
>> On Jul 6, 2024, at 21:35, Michal Meloun <meloun.michal@gmail.com> wrote:
>>> On 07.07.2024 5:42, Mark Millard wrote:
>>>> main's armv7 packages that are distributed are getting to be months
>>>> behind because of the build hangups preventing the builds on ampere2.
>
> It's worth reinforcing that this only affects main (15-CURRENT). Our stable/13 and stable/14 packages for armv7 are reasonably up to date. Reasonably for a tier-2 architecture anyway. Whatever is causing this, it's only in main.
>
>> The only known failures are on ampere2 as far as I know.
>> As far as I know there is no known way to configure to
>> match the formal build procedures used on ampere2.
>
> According to the current schedule, armv7 builds happen on ampere3, not ampere2:
>
> ampere1: - quarterly arm64.aarch64 13.3-RELEASE 133arm64 -a
> ampere1: - quarterly arm.armv7 releng/13.3 133releng-armv7 -a
> ampere1: - quarterly arm64.aarch64 14.0-RELEASE 140arm64 -a
> ampere1: - quarterly arm.armv7 releng/14.0 140releng-armv7 -a
> ampere2: - default arm64.aarch64 main main-arm64 -a
> ampere3: - default arm64.aarch64 13.3-RELEASE 133arm64 -a
> ampere3: - default arm.armv7 releng/13.3 133releng-armv7 -a
> ampere3: - default arm64.aarch64 14.0-RELEASE 140arm64 -a
> ampere3: - default arm.armv7 releng/14.0 140releng-armv7 -a

Putting the ones that mention armv7 together, with the others omitted:

ampere1: - quarterly arm.armv7 releng/13.3 133releng-armv7 -a
ampere1: - quarterly arm.armv7 releng/14.0 140releng-armv7 -a
ampere3: - default arm.armv7 releng/13.3 133releng-armv7 -a
ampere3: - default arm.armv7 releng/14.0 140releng-armv7 -a

None of those are for main [so: 15]. They are all for working contexts. main-armv7-default last ran on ampere2 2024-05-31/2024-06-01 or so.
I'm not aware of any main-armv7-default builds done via ampere1 or ampere3.

> I've attached the poudriere.conf from that machine. It's the same one we have on all the builders.

Is the poudriere.conf content the same for the main [so: 15] context (ampere2) as for the ampere3 context(s)?

>>> I've seen some strange live lockups in arm32 jail, but never managed to reproduce it.
>>
>> On what kind(s) of hardware?
>> Any kind of relevant context known?
>
> In case it helps: ref15-aarch64.freebsd.org (available to all developers) is an identical configuration as ampereX.nyi.freebsd.org. The former has a newer BIOS (for some reason) but that hopefully should not make a difference. If we reach the point where we think the BIOS version matters, I can try to upgrade the BIOS on the ampereXen.
>
> smbios.bios.reldate="06/25/2020"
> smbios.bios.revision="1.14"
> smbios.bios.vendor="LENOVO"
> smbios.bios.version="hve104q-1.14"
>
> smbios.bios.reldate="05/30/2019"
> smbios.bios.revision="1.8"
> smbios.bios.vendor="LENOVO"
> smbios.bios.version="HVE104J-1.08"

Looking at the poudriere.conf example, it points out another difference for my more recent testing: strictly UFS contexts for my aarch64 and armv7 systems these days. The only media that is ZFS based these days for any system in my active use is for the 7950X3D (amd64). My switching to UFS matches up with my switching to use pkgbase to install and test official FreeBSD builds (all of: kernel, world, ports) for comparison/contrast with my personal builds of such (which involve some locally patched files).

We do know when the last successful from-scratch "bulk -a" involving graphics/graphviz was before the armv7 problems started. For that run, the build of pkg started on Feb 19:

pe9c9c73181b5_sbd45bbe440

=>> Building ports-mgmt/pkg
build started at Mon Feb 19 12:47:46 UTC 2024
port directory: /usr/ports/ports-mgmt/pkg
package name: pkg-1.20.9_1
building for: FreeBSD main-armv7-default-job-01 15.0-CURRENT FreeBSD 15.0-CURRENT 1500014 arm
maintained by: pkg@FreeBSD.org
Makefile datestamp: -rw-r--r-- 1 root wheel 2311 Feb 1 01:02 /usr/ports/ports-mgmt/pkg/Makefile
Ports top last git commit: e9c9c73181b
Ports top unclean checkout: no
Port dir last git commit: f7f4c1a0472
Port dir unclean checkout: no
Poudriere version: poudriere-git-3.4.1
Host OSVERSION: 1500006
Jail OSVERSION: 1500014
Job Id: 01

We also know the first observed failure with the symptoms (not a from-scratch build). It started with a dns/public_suffix_list build that was dated Feb 28:

p43e3af5f5763_sf5f08e41aa

=>> Building dns/public_suffix_list
build started at Wed Feb 28 16:05:30 UTC 2024
port directory: /usr/ports/dns/public_suffix_list
package name: public_suffix_list-20240130
building for: FreeBSD main-armv7-default-job-07 15.0-CURRENT FreeBSD 15.0-CURRENT 1500014 arm
maintained by: sunpoet@FreeBSD.org
Makefile datestamp: -rw-r--r-- 1 root wheel 770 Feb 25 01:02 /usr/ports/dns/public_suffix_list/Makefile
Ports top last git commit: 43e3af5f576
Ports top unclean checkout: no
Port dir last git commit: 906be52cfb7
Port dir unclean checkout: no
Poudriere version: poudriere-git-3.4.1-1-g1e9f97d6
Host OSVERSION: 1500006
Jail OSVERSION: 1500014
Job Id: 07

Bisecting the kernel/world combinations between those two builds would be very disruptive to other uses of the machine doing the bisection. But it would be one way of trying to narrow down what change(s) led to the problem showing up for main [so: 15].
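
To give a feel for the size of such a bisection, here is a minimal Python sketch. It assumes the build names quoted above follow a p<ports-hash>_s<src-hash> pattern (which matches the "Ports top last git commit" lines, but is my reading, not something documented here), and it uses the usual ceil(log2(N)) worst-case estimate for bisecting N candidate commits. The function names and the sample count of 120 commits (a rough estimate for this range) are only illustrative.

```python
import re

# Assumption: build names look like "p<ports-hash>_s<src-hash>", inferred
# from the two names quoted in this message.
BUILD_NAME = re.compile(r"^p(?P<ports>[0-9a-f]+)_s(?P<src>[0-9a-f]+)$")


def split_build_name(name: str) -> dict:
    """Split a build name into its ports and src commit prefixes."""
    match = BUILD_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized build name: {name!r}")
    return {"ports": match.group("ports"), "src": match.group("src")}


def max_bisect_steps(candidates: int) -> int:
    """Worst-case number of test points to isolate one of `candidates`
    commits by halving the range each time, i.e. ceil(log2(candidates))."""
    if candidates < 1:
        raise ValueError("need at least one candidate commit")
    # (n - 1).bit_length() equals ceil(log2(n)) for integers n >= 1.
    return (candidates - 1).bit_length()


if __name__ == "__main__":
    good = split_build_name("pe9c9c73181b5_sbd45bbe440")
    bad = split_build_name("p43e3af5f5763_sf5f08e41aa")
    print("src range to bisect:", good["src"], "..", bad["src"])
    # 120 is only a rough commit count for this range.
    print("worst-case test points:", max_bisect_steps(120))  # 7
```

Even at roughly 7 test points, each point means putting a different kernel/world on the builder and running long enough to see whether the hang shows up, which is where the disruption comes from.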

So for FreeBSD kernel/world main that would be over:

• git: bd45bbe440f1 - main - rescue: Fix after zfsbootcfg addition Warner Losh
Tue, 13 Feb 2024
. . .
Sun, 25 Feb 2024
. . .
• git: f5f08e41aa57 - main - loader/efi: Only include interpreter's linker script Warner Losh

That looks to be roughly 120 commits to main [so: 15].

But for _sbd45bbe440 and _sf5f08e41aa I'm not so sure that the kernel booted matches the system commits referenced. If not, the specific kernel build does not seem to be identified in anything that I have access to. Nothing like the output from the likes of:

# uname -v
FreeBSD 15.0-CURRENT main-n270963-609cdb12b962 GENERIC

is in the build log output (this presumes a context with UNAME_v not overriding what would be shown for the specific output). For main, freebsd-version output is not a detailed enough identification for the purpose.

===
Mark Millard
marklmi at yahoo.com