Re: armv7 lang/gcc12 "no bootstrap" build via system clang 15.0.7 based poudriere build ends up stuck in a small loop

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 06 Mar 2023 17:12:15 UTC
On Mar 6, 2023, at 08:37, Lorenzo Salvadore <developer@lorenzosalvadore.it> wrote:

> ------- Original Message -------
> On Monday, March 6th, 2023 at 9:46 AM, Mark Millard <marklmi@yahoo.com> wrote:
> 
> 
>> 
>> 
>> Under main that has clang 15.0.7, I've had to locally
>> switch to using the likes of:
>> 
>> OPTIONS_DEFAULT_armv7=STANDARD_BOOTSTRAP
>> 
>> (to express it in Makefile terms) for lang/gcc12 in order
>> to avoid the following.
>> 
>> The "no bootstrap" build ends up stuck in a small loop in
>> partition_union (in cc1):
>> 
>> (gdb) info threads
>> Id Target Id Frame
>> * 1 LWP 632886 of process 27787 0x016eb82c in partition_union ()
>> (gdb) bt
>> #0 0x016eb82c in partition_union ()
>> #1 0x0133e6ec in var_union(_var_map*, tree_node*, tree_node*) ()
>> #2 0x013218e4 in attempt_coalesce(_var_map*, ssa_conflicts*, int, int, __sFILE*) ()
>> #3 0x013203d0 in coalesce_ssa_name(_var_map*) ()
>> #4 0x012c66b4 in rewrite_out_of_ssa(ssaexpand*) ()
>> #5 0x0082c094 in (anonymous namespace)::pass_expand::execute(function*) ()
>> #6 0x00fd6ff0 in execute_one_pass(opt_pass*) ()
>> #7 0x00fd8380 in execute_pass_list_1(opt_pass*) ()
>> #8 0x00fc6df0 in execute_pass_list(function*, opt_pass*) ()
>> #9 0x00880c20 in cgraph_node::expand() ()
>> #10 0x00882d10 in symbol_table::compile() ()
>> #11 0x00883454 in symbol_table::finalize_compilation_unit() ()
>> #12 0x0120e204 in compile_file() ()
>> #13 0x0120d9d4 in toplev::main(int, char**) ()
>> #14 0x01646c28 in main ()
>> (gdb) finish
>> Run till exit from #0 0x016eb82c in partition_union ()
>> 
>> It never exits. I've walked through the short loop: it ends
>> up with data that leads to no progress. The bne is always
>> taken, and the loop reaches a state in which none of the
>> values involved change.
>> 
>> truss shows no output, and no subroutines are called in the
>> few-instruction-long loop.
>> 
>> I ran multiple tests of "no bootstrap" and all failed the
>> same way.
>> 
>> Such would not be a good thing for the FreeBSD armv7 package
>> build server.
>> 
>> Also seen via lldb:
>> 
>> (lldb) bt
>> * thread #1, name = 'cc1', stop reason = signal SIGSTOP
>> * frame #0: 0x016eb82c cc1`partition_union + 152
>>   frame #1: 0x0133e6ec cc1`var_union(_var_map*, tree_node*, tree_node*) + 104
>>   frame #2: 0x013218e4 cc1`attempt_coalesce(_var_map*, ssa_conflicts*, int, int, __sFILE*) + 508
>>   frame #3: 0x013203d0 cc1`coalesce_ssa_name(_var_map*) + 7240
>>   frame #4: 0x012c66b4 cc1`rewrite_out_of_ssa(ssaexpand*) + 2020
>>   frame #5: 0x0082c094 cc1`(anonymous namespace)::pass_expand::execute(function*) + 68
>>   frame #6: 0x00fd6ff0 cc1`execute_one_pass(opt_pass*) + 616
>>   frame #7: 0x00fd8380 cc1`execute_pass_list_1(opt_pass*) + 44
>>   frame #8: 0x00fc6df0 cc1`execute_pass_list(function*, opt_pass*) + 40
>>   frame #9: 0x00880c20 cc1`cgraph_node::expand() + 324
>>   frame #10: 0x00882d10 cc1`symbol_table::compile() + 3860
>>   frame #11: 0x00883454 cc1`symbol_table::finalize_compilation_unit() + 300
>>   frame #12: 0x0120e204 cc1`compile_file() + 236
>>   frame #13: 0x0120d9d4 cc1`toplev::main(int, char**) + 7028
>>   frame #14: 0x01646c28 cc1`main + 48
>>   frame #15: 0x004ad3f0 cc1`__start(argc=31, argv=0xffffadec, env=0xffffae6c, ps_strings=<unavailable>, obj=0x4181e004, cleanup=0x417ed4d8) at crt1_c.c:92:7
>> 
>> 
>> 
>> The armv7 STANDARD_BOOTSTRAP change led to the build reaching completion.
>> 
>> But the "no bootstrap" issue suggests that system-clang 15.0.7
>> has a problem for armv7 targeting. (I've not seen problems for
>> targeting aarch64 or amd64.)
>> 
>> 
>> For reference:
>> 
>> # uname -apKU
>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #88 main-n261230-e78dc78e517a-dirty: Wed Mar 1 16:17:45 PST 2023 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm armv7 1400081 1400081
>> 
>> via:
>> 
>> # poudriere jail -l
>> JAILNAME VERSION ARCH METHOD TIMESTAMP PATH
>> . . .
>> main-CA7 14.0-CURRENT arm.armv7 null 2021-06-27 17:58:33 /usr/obj/DESTDIRs/main-CA7-poud
>> . . .
>> 
>> on an aarch64 system, no qemu involved (or even installed):
>> 
>> # uname -apKU
>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #88 main-n261230-e78dc78e517a-dirty: Wed Mar 1 16:17:45 PST 2023 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400081 1400081
>> 
>> (It is a 16 Cortex-A72 HoneyComb.)
> 
> Thanks Mark.
> 
> I guess cases like this are one of the reasons bootstrapping exists:
> compilation with clang on armv7 is probably not the typical case, so it
> does not work as easily as using GCC on amd64. Good that it works at least
> with bootstrapping.
> 
> Now, I would like to suggest a few more experiments:

Some of the below have a partial answer from the fact that
the FreeBSD package builder system for armv7 is still
running system-clang 14 (main) or 13 (13.1-RELEASE) and
does not yet see the problem. (The build server's actual
kernel vintage should not be an issue to worry about.)

Nor did it have problems in the past building lang/gcc12.

This is a new issue.

> - does the compilation work without bootstrapping with lang/gcc13-devel?
> 
> - does the compilation work without bootstrapping with a higher version
> of clang (we have devel/llvm16 in the ports tree, which tracks a pre-release)?
> 
> - does the compilation work without bootstrapping on a release version of
> FreeBSD?

That is a case where the 13.1-based package builds on the
system used for armv7 builds did not, and do not, have the problem.

Nor do the main system-clang 14 based builds.

> - does the compilation work without bootstrapping using Linux instead
> of FreeBSD?

I'm not well set up for that kind of experiment.

> You might want to open a bug report, but you should try to understand
> first what is the component that causes the issue and if replacing anything
> with something newer (where the bug might be already fixed) or with
> something supported (since FreeBSD CURRENT is under development, we
> might have regressions) solves the problem.

It is already known to be a regression compared to
system-clang 14 and 13 based builds. No clue yet
for llvm16 based. (I've separate notes out about
building llvm16 for aarch64 needing a Makefile
fix.)
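
For readers who have not met this failure mode: the sketch below is
only an illustration of how a short loop with an always-taken branch
and unchanging values can arise. It assumes, without confirmation from
the debugging above, that the loop walks a circular linked list until
it returns to its starting element. The names make_ring and walk_class
and the max_steps guard are hypothetical and are not taken from the
GCC sources.

```python
# Illustrative model only: a class of elements kept as a circular singly
# linked list and walked until the walk returns to its starting element.
# All names are hypothetical; this is not the GCC source.

def make_ring(n):
    """Return successor indices for one ring: 0 -> 1 -> ... -> n-1 -> 0."""
    return [(i + 1) % n for i in range(n)]


def walk_class(next_idx, start, max_steps):
    """Walk from start's successor until start is reached again.

    Returns the number of other elements visited, or None if the walk
    has not come back to start after max_steps steps (the analogue of
    a compare-and-branch that is always taken).
    """
    steps = 0
    p = next_idx[start]
    while p != start:
        steps += 1
        if steps > max_steps:
            return None  # the real loop has no such guard and would spin
        p = next_idx[p]
    return steps


if __name__ == "__main__":
    ring = make_ring(4)
    print(walk_class(ring, 0, max_steps=len(ring)))  # intact ring
    ring[2] = 2  # damage: element 2 now points at itself
    print(walk_class(ring, 0, max_steps=len(ring)))  # never returns to 0
```

With an intact ring the walk visits the other three elements and
stops. Once one element points at itself, the walk can never get back
to element 0 and every value involved stays the same from one
iteration to the next, which is the kind of state that a miscompiled
store or compare could leave behind.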

> If you find that the cause is in the FreeBSD GCC port(s), then please
> open a bug report on bugzilla so that I can keep track of it and other
> users with the same problem can find it there as well.
> 

===
Mark Millard
marklmi at yahoo.com