Re: A better alternative to having builds of main-armv7-default fully disabled and last-built be months out of date

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sun, 07 Jul 2024 08:25:32 UTC

On Jul 6, 2024, at 21:35, Michal Meloun <meloun.michal@gmail.com> wrote:

> On 07.07.2024 5:42, Mark Millard wrote:
>> main's armv7 packages that are distributed are getting to be months
>> behind because of the build hangups preventing the builds on ampere2.
>> The hangups happen just-after graphics/graphviz installation during
>> the activity in a builder where that build depends on
>> graphics/graphviz .
>> I expect that the armv7 "bulk -a" builds on ampere2 would complete
>> if the Makefile for graphics/graphviz had:
>> BROKEN_armv7= leads to ampere2 build hangups for builds that depended on graphics/graphviz
>> A related subset of the packages would not be built at all. But that
>> is better for security and such than the official packages that are
>> available being systematically months out of date, at least in my view.
>> I suggest trying the change and enabling main-armv7-default builds
>> to see if they complete overall.
>> I'll note that there is a historical example of a graphics/giflib
>> build failure that led to 3481 ports not being built, including
>> graphics/graphviz . But the "bulk -a" completed and 24176 packages
>> built and were distributed.
>> graphics/graphviz having BROKEN_armv7 should generally build more
>> packages than that when graphics/giflib builds okay.
>> ===
>> Mark Millard
>> marklmi at yahoo.com
> graphics/graphviz can be built on native armv7 without any problems,

armv7 graphics/graphviz builds on ampere2 as well. The problem comes
later, during or just after graphics/graphviz is installed for use in
some other package's build. The log files for the hangups end with
the likes of:

. . .
[main-armv7-default-job-01] `-- Extracting pango-1.50.14: .......... done
[main-armv7-default-job-01] Extracting graphviz-9.0.0_4: .......... done

and the elapsed time for the builder keeps increasing with no further
log output, even after hundreds of hours. This has happened when
graphviz is installed during any of:

build-depends
lib-depends
run-depends

Of course the actual failure is between the output of:

[main-armv7-default-job-01] Extracting graphviz-9.0.0_4: .......... done

and whatever line would normally be next. But BROKEN_armv7=
for graphviz would keep that step from ever being reached,
since nothing would install graphviz. (Yes, it is a hack to
get partial "bulk -a" builds going. I just claim the hack is
appropriate for now.)
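
For anyone checking builder logs for this symptom, here is a minimal
sketch of a check for "log ends at the graphviz extraction line". It is
only a heuristic: the pattern is modeled on the two sample log lines
above, the function name is made up for illustration, and a real check
would also need to confirm the builder has been sitting there for hours.

```python
import re

# Pattern modeled on the log excerpts in this message; the exact log
# format beyond those sample lines is an assumption.
EXTRACT_GRAPHVIZ = re.compile(
    r"^\[(?P<builder>[^\]]+)\]\s+(?:`--\s+)?"
    r"Extracting graphviz-(?P<version>\S+):\s+\.+\s+done\s*$"
)


def last_line_is_graphviz_extract(log_text):
    """Return the builder name if the last non-blank line of the log is
    the graphviz extraction line, else None."""
    lines = [ln for ln in log_text.splitlines() if ln.strip()]
    if not lines:
        return None
    m = EXTRACT_GRAPHVIZ.match(lines[-1])
    return m.group("builder") if m else None
```

A log that ends this way is only suspicious, not proof of a hang: a
healthy build passes through the same line and simply keeps going.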

I've never been able to replicate the failure on any of:

Windows DevKit 2023
HoneyComb
RPi5B
RPi4B (various 4 GiByte and 8 GiByte)

(I've not tried on the MACCHIATObin Double Shot, Rock64,
RPi3B, or RPi2B v1.2 that are also around.)

As far as I know, the only failures have been on ampere2,
and there is no known way to configure a system to match
the formal build procedures used on ampere2. So my testing
could differ in all sorts of ways from what happens when
official builds for armv7 are attempted on ampere2, even
ignoring the hardware differences that are also involved.

I do not have access to ampere2-like hardware.

Note that stable/1[34] and releng/1[34].* builds have
never shown the armv7 problem. Only main.

The history of successful from-scratch "bulk -a" builds for
armv7 on ampere2 is (pkg build log output lines):

build started at Fri Aug 18 17:18:19 UTC 2023
build started at Mon Sep  4 15:45:39 UTC 2023
build started at Tue Sep 26 23:29:39 UTC 2023
build started at Tue Oct 24 20:54:39 UTC 2023
build started at Sat Nov 11 01:00:52 UTC 2023
build started at Fri Dec  8 10:55:56 UTC 2023
build started at Wed Dec 20 01:47:25 UTC 2023
build started at Sun Dec 31 22:33:56 UTC 2023
build started at Sat Jan 27 10:57:56 UTC 2024
build started at Thu Feb  8 03:00:30 UTC 2024
build started at Mon Feb 19 12:47:46 UTC 2024

Not from-scratch "bulk -a" builds, but failures from the same issue:
build started at Wed Feb 28 16:05:30 UTC 2024 (for: dns/public_suffix_list)
build started at Wed May  8 01:59:35 UTC 2024 (for: ports-mgmt/pkg)

From-scratch "bulk -a" failures:

build started at Fri Mar 22 11:19:45 UTC 2024
build started at Fri Apr 26 09:30:15 UTC 2024

Note: for "bulk -a" builds that were not from-scratch but
were successful overall, figuring out whether any graphviz
installs were involved is a pain. I've not tried to figure
that out.
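
The window between the last good build and the first bad one can be
read off the lists above by hand, but as a minimal sketch it can also
be computed from the "build started at" lines. The function names are
made up for illustration, and the sketch assumes every line has the
same field layout as the ones quoted here.

```python
from datetime import datetime, timezone

MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}


def parse_started(line):
    """Parse e.g. 'build started at Mon Sep  4 15:45:39 UTC 2023'.
    Anything after the year (such as '(for: ...)') is ignored."""
    fields = line.split()
    _dow, mon, day, hms, tz, year = fields[3:9]
    if tz != "UTC":
        raise ValueError("expected a UTC timestamp: " + line)
    hour, minute, second = (int(x) for x in hms.split(":"))
    return datetime(int(year), MONTHS[mon], int(day),
                    hour, minute, second, tzinfo=timezone.utc)


def bracket(successes, failures):
    """Return (last successful start, first failing start after it)."""
    last_good = max(parse_started(ln) for ln in successes)
    first_bad = min(
        t for t in (parse_started(ln) for ln in failures) if t > last_good
    )
    return last_good, first_bad
```

Fed the lines above, including the two non-from-scratch failures, this
gives the 2024-Feb-19 success and the 2024-Feb-28 failure as the
bracket.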

Overall, this suggests the change happened sometime
between:

pe9c9c73181b5_sbd45bbe440 (worked on 2024-Feb-19)
and:
p43e3af5f5763_sf5f08e41aa (failed on 2024-Feb-28)

So for FreeBSD main:

    • git: bd45bbe440f1 - main - rescue: Fix after zfsbootcfg addition Warner Losh
Tue, 13 Feb 2024
. . .
Sun, 25 Feb 2024
. . .
    • git: f5f08e41aa57 - main - loader/efi: Only include interpreter's linker script Warner Losh

As for ports:

Tue, 13 Feb 2024
. . .
    • git: e9c9c73181b5 - main - graphics/mesa-devel: update to 24.0.b.1355 Jan Beich
. . .
Sun, 25 Feb 2024
. . .
    • git: 43e3af5f5763 - main - www/remark42: relax npm install dependency requirement. Xin LI
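
For reference, a minimal sketch of how the two build names split into
the hashes listed above. The p<ports-hash>_s<src-hash> reading is
inferred from matching the names against the commits in this message,
not taken from any documentation, and the function name is made up for
illustration.

```python
import re

# Assumed layout: 'p' + ports commit hash + '_s' + src commit hash,
# inferred from the commit hashes listed in this message.
BUILD_NAME = re.compile(r"^p(?P<ports>[0-9a-f]+)_s(?P<src>[0-9a-f]+)$")


def split_build_name(name):
    """Return (ports_hash, src_hash) for a name like
    'pe9c9c73181b5_sbd45bbe440'; raise ValueError if it does not fit."""
    m = BUILD_NAME.match(name)
    if m is None:
        raise ValueError("unexpected build name: " + name)
    return m.group("ports"), m.group("src")


for build in ("pe9c9c73181b5_sbd45bbe440", "p43e3af5f5763_sf5f08e41aa"):
    ports_hash, src_hash = split_build_name(build)
    print(build, "-> ports", ports_hash, "src", src_hash)
```

The src part of each name is a shorter prefix of the 12-character hash
shown in the commit list, so compare by prefix rather than equality.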


> so it looks like a compat32 problem.

Not systematically: the same kind of armv7 builds work in the
other contexts I've tried. Something more specific is likely
required to trigger it, not that I've a clue what that might be.

> Unfortunately I don't have my honeycomb ready to test this inside arm32 jail.
> 
> Are you able to try to prepare some testcase?

All my from-scratch "bulk -a" tests for targeting armv7 have
worked just fine, continuing on normally after the likes of:

[main-armv7-default-job-01] Extracting graphviz-9.0.0_4: .......... done

> I've seen some strange live lockups in arm32 jail, but never managed to reproduce it.

On what kind(s) of hardware?

Is any kind of relevant context known?

===
Mark Millard
marklmi at yahoo.com