head -r339076 amd64 -> armv7 port cross build attempt with native tools involved: hangs between a cc (wait) and its child ld (uwait)
Mark Millard
marklmi at yahoo.com
Sun Oct 28 00:30:16 UTC 2018
[Just the __packed removal patch was sufficient to no longer
have the hang problem that I originally reported for the
print/texinfo build in poudriere.]
On 2018-Oct-27, at 4:33 PM, Mark Millard <marklmi at yahoo.com> wrote:
> [Some of this discussion occurred off list. The point here
> is not specific to the hang that I originally reported.]
>
> On 2018-Oct-27, at 3:03 PM, Mark Millard <marklmi at yahoo.com> wrote:
>>
Mikaël Urankar is being quoted below:
>>> . . .
>>>
>>>> There are bugs in qemu that can cause such deadlock, you can try these
>>>> 2 patches:
>>>> https://github.com/MikaelUrankar/qemu-bsd-user/commit/9424a5ffde4de2768ab6baa45fdbe0dbb56a7371
>>>> https://github.com/MikaelUrankar/qemu-bsd-user/commit/d6f65a7f07d280b6906d499d8e465d4d2026c52b
Back to me:
>>> I'll try those later. Thanks. (I need to get back to sleep.)
>>>
>>> It was interesting that attach/detach to the ld process
>>> caused it to progress. The rest of the build completed
>>> just fine. But that one spot consistently hung up before
>>> trying gdb to look at the back trace.
>>>
>>
>> Looking at the qemu code related to the 2nd patch: the
>> structure of the field copies (via __get_user) seems
>> very sensitive to the target's ABI rules for alignment
>> and padding, given that the structure description and
>> the code are compiled as host code. Using __packed vs.
>> not is possibly not sufficient control to always make
>> the layouts match across all the potential combinations
>> of host and target, from what I can see.
>>
>> Lack of __packed may prove sufficient for my specific
>> context (amd64 host and armv7 target) but it seems
>> non-obvious what to do in general.
>>
>> There would also seem to be big endian vs. little endian
>> issues for the individual __get_user style copies when
>> the host and target byte orders do not match for a
>> multi-byte numeric encoding.
>
> Well, I get the following for:
>
> #include "/usr/include/sys/event.h" // kevent
> #include <stddef.h> // offsetof
> #include <stdio.h> // printf
>
> int
> main(void)
> {
>     printf("%lu\n", (unsigned long) sizeof(struct kevent));
>     printf("ident %lu\n", (unsigned long) offsetof(struct kevent, ident));
>     printf("filter %lu\n", (unsigned long) offsetof(struct kevent, filter));
>     printf("flags %lu\n", (unsigned long) offsetof(struct kevent, flags));
>     printf("fflags %lu\n", (unsigned long) offsetof(struct kevent, fflags));
>     printf("data %lu\n", (unsigned long) offsetof(struct kevent, data));
>     printf("udata %lu\n", (unsigned long) offsetof(struct kevent, udata));
>     printf("ext %lu\n", (unsigned long) offsetof(struct kevent, ext));
>     return 0;
> }
>
> (This code avoided warnings for type mismatches with the
> printf strings and such.)
>
> amd64 native [host of qemu use] (comments hand added):
>
> # ./a.out
> 64
> ident 0
> filter 8 // NOTE!
> flags 10 // NOTE!
> fflags 12 // NOTE!
> data 16
> udata 24
> ext 32
>
> (The above is not particularly important but I
> include it for completeness.)
>
> armv7 native [target in qemu use] (comments hand added):
>
> # ./a.out
> 64 // NOTE vs. below!
> ident 0
> filter 4 // NOTE vs. above!
> flags 6 // NOTE vs. above!
> fflags 8 // NOTE vs. above!
> data 16 // NOTE vs. below!
> udata 24 // NOTE vs. below!
> ext 32 // NOTE vs. below!
>
> /usr/include/sys/event.h lacks __packed in both cases.
>
> With __packed in qemu-arm-static's source code
> for target_freebsd_kevent, I checked the layout
> via gdb on qemu-arm-static:
>
> p/d sizeof(struct target_freebsd_kevent)
> p/d &((struct target_freebsd_kevent *)0)->ident
> p/d &((struct target_freebsd_kevent *)0)->filter
> p/d &((struct target_freebsd_kevent *)0)->flags
> p/d &((struct target_freebsd_kevent *)0)->fflags
> p/d &((struct target_freebsd_kevent *)0)->data
> p/d &((struct target_freebsd_kevent *)0)->udata
> p/d &((struct target_freebsd_kevent *)0)->ext
>
> The results match what the 2nd patch's problem-report
> material reports (56,0,4,6,8,12,20,24): not
> even the right size.
>
> I also confirm that after removing __packed in qemu's
> code and rebuilding, checking with gdb reported
> a match to the above armv7 native report
> (64,0,4,6,8,16,24,32).
>
> I have not verified __packed used vs. not for any
> other combination of host and target platforms.
Removing the 2 examples of __packed, including the
1 for target_freebsd_kevent, as in Mikaël Urankar's
2nd listed patch, was sufficient to avoid the hang
that I originally reported. (Technically FreeBSD 11
is not involved and so one of the __packed removals
is not relevant to my example.)
I have not applied Mikaël Urankar's first listed
patch at all. It did not prove necessary for my
context.
Again: the only tested context is amd64 -> armv7
(host -> target) under a head -r339076 based
build. (So still FreeBSD 12.)
I'm doing a larger amd64 -> armv7 rebuild (around
210 ports overall), the one that originally hit the
problematic hang. It includes a full-bootstrap build
of lang/gcc8 (so extensive emulation use after
the clang-based stages). Prior to the patch,
all smaller attempts also hung at the same
place for print/texinfo.
But I'll only report if this larger test has
a problem.
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)