Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

From: Philip Paeps <philip_at_freebsd.org>
Date: Wed, 24 Apr 2024 04:30:41 UTC
On 2024-04-24 02:12:41 (+0800), Mark Millard wrote:

> On Apr 19, 2024, at 07:16, Philip Paeps <philip@freebsd.org> wrote:
>
>> On 2024-04-18 23:02:30 (+0800), Mark Millard wrote:
>>
>>> void <void_at_f-m.fm> wrote on
>>> Date: Thu, 18 Apr 2024 14:08:36 UTC :
>>>
>>>> Not sure where to post this..
>>>>
>>>> The last bulk build for arm64 appears to have happened around
>>>> mid-March on ampere2. Is it broken?
>>>
>>> main-armv7 building is broken and the last completed build
>>> was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
>>> gets stuck making no progress until manually forced to stop,
>>> which leads to huge elapsed times for the incomplete builds:
>>>
>>> pd5512ae7b8c6_s75464941dc 34472 12282 (+9196) 107 (+77) 4753 (+2247) 1390 (+529) 15940 parallel_build: Fri, 22 Mar 2024 11:05:01 GMT 651:21:56
>>>
>>> p43e3af5f5763_sf5f08e41aa 19809 5919 (+3126) 137 (+100) 5363 (+2741) 1395 (+522) 6995 parallel_build: Wed, 28 Feb 2024 15:46:14 GMT 359:42:14 ampere2
>>>
>>> ampere2 alternates between trying to build main-arm64 and 
>>> main-armv7, so main-armv7 being stuck blocks main-arm64 from 
>>> building.
>>>
>>> One can see that all 13 job IDs show over 570 hours:
>>>
>>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default&build=pd5512ae7b8c6_s75464941dc
>>>
>>> It is not random which packages are building when this happens. 
>>> Compare:
>>>
>>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default&build=p43e3af5f5763_sf5f08e41aa
>>>
>>> By contrast, the 19 Feb 2024 from-scratch (full) build worked:
>>>
>>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default&build=pe9c9c73181b5_sbd45bbe440
>>>
>>> My guess is that FreeBSD has something that broke after bd45bbe440,
>>> was broken as of f5f08e41aa, and was still broken at 75464941dc.
>>
>> I'll kill the build on ampere2 again.  Thanks for the nudge.
>>
>> We don't really have good monitoring for this.  Also: builds should 
>> time out after 36 hours.  The fact that this one does not is a bug in 
>> itself.
>>
>> Philip [hat: clusteradm]
>
> I'll note that I've never managed to replicate the problem for
> building for armv7 on aarch64. But my context never has the
> likes of:
>
> QUOTE
> Host OSVERSION: 1500006
> Jail OSVERSION: 1500015
> . . .
> !!! Jail is newer than host. (Jail: 1500015, Host: 1500006) !!!
> !!! This is not supported. !!!
> !!! Host kernel must be same or newer than jail. !!!
> !!! Expect build failures. !!!
> END QUOTE
>
> but always has the two OSVERSION's the same, such as:
>
> Host OSVERSION: 1500015
> Jail OSVERSION: 1500015
>
> or, recently,
>
> Host OSVERSION: 1500018
> Jail OSVERSION: 1500018
>
> My bulk runs do go through the sequence where the hangups
> have repeated for main-armv7 on ampere2.
>
> I wonder what would happen if "Host OSVERSION" was updated
> (modernized) to match the modern "Jail OSVERSION" that would
> be used?
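
The host/jail mismatch quoted above can be checked mechanically from a build log. A minimal sketch, assuming the log contains "Host OSVERSION:" and "Jail OSVERSION:" lines in the form quoted; the function names are hypothetical and not part of any existing tool:

```python
import re


def parse_osversions(log_text: str) -> tuple[int, int]:
    """Return (host, jail) OSVERSION values found in the log text."""
    host = re.search(r"Host OSVERSION:\s*(\d+)", log_text)
    jail = re.search(r"Jail OSVERSION:\s*(\d+)", log_text)
    if host is None or jail is None:
        raise ValueError("log does not contain both OSVERSION lines")
    return int(host.group(1)), int(jail.group(1))


def jail_newer_than_host(log_text: str) -> bool:
    """True when the jail's OSVERSION is newer than the host's."""
    host, jail = parse_osversions(log_text)
    return jail > host
```

With the values quoted above, host 1500006 and jail 1500015 would be flagged, while the matching 1500015/1500015 and 1500018/1500018 pairs would not.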

The package builders are due for a regular refresh to newer -CURRENT 
dogfood.  I'll do the aarch64 builders first this time.

I've set /root/stop-builds on them.  I'll upgrade them when they go 
idle.  Or I'll kill them if they take much longer to build what they're 
building.  It annoys me that they do not stop building after 36 hours, 
like they're supposed to.
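
Until the timeout itself is fixed, overdue builds can at least be spotted from the elapsed column of the status pages. A rough sketch, assuming the elapsed time appears either as hours:minutes:seconds (e.g. 651:21:56) or with a day prefix (e.g. 3D:01:08:29) as in the listings in this thread; the helper names are made up for illustration:

```python
import re

LIMIT_HOURS = 36  # the timeout the builders are supposed to honour


def elapsed_hours(text: str) -> float:
    """Convert 'HHH:MM:SS' or 'nD:HH:MM:SS' into a number of hours."""
    m = re.fullmatch(r"(?:(\d+)D:)?(\d+):(\d{2}):(\d{2})", text.strip())
    if m is None:
        raise ValueError(f"unrecognised elapsed time: {text!r}")
    # A missing day prefix yields None for the first group; treat it as 0.
    days, hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return days * 24 + hours + minutes / 60 + seconds / 3600


def overdue(text: str, limit_hours: float = LIMIT_HOURS) -> bool:
    """True when the elapsed time exceeds the limit."""
    return elapsed_hours(text) > limit_hours
```

This only classifies a build as overdue; it does not stop anything.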

They're currently running:

n266879-6abee52e0d79   2023-12-09 01:06:28 jlduran strfmon: Silence 
scan-build warning

Our current clusteradm build is:

n269399-bbc6e6c5ec8c   2024-04-14 03:12:36 sigsys daemon: fix -R to 
enable supervision mode

I may do a new build while waiting for them to go idle:

-   quarterly 140arm64 1b931669de11 parallel_build 28776 15299 33 588 985 0 11871 3D:01:08:29
    https://pkg-status.freebsd.org/ampere1/build.html?mastername=140arm64-quarterly&build=1b931669de11
-   default main-arm64 p1c7a816cd0ad_s1bd4f769caf parallel_build 34528 19888 65 669 980 0 12926 4D:00:52:21
    https://pkg-status.freebsd.org/ampere2/build.html?mastername=main-arm64-default&build=p1c7a816cd0ad_s1bd4f769caf
-   default 140releng-armv7 2910ff97e727 parallel_build 34543 14826 60 5539 1397 0 12721 1D:09:35:28
    https://pkg-status.freebsd.org/ampere3/build.html?mastername=140releng-armv7-default&build=2910ff97e727

Philip