Re: armv7-on-aarch64 stuck at urdlck

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 22 Jul 2024 16:26:14 UTC

On Jul 22, 2024, at 06:40, Michal Meloun <meloun.michal@gmail.com> wrote:

> On 22.07.2024 13:46, Mark Millard wrote:
>> On Jul 21, 2024, at 22:59, Michal Meloun <meloun.michal@gmail.com> wrote:
>>> I don't want to hijack the original thread, so I'm replying in a new one.
>>> 
>>> My Tegra tracks -current and has been running 24/7, building kernel/world and kde5 in a loop, for a few years now. But I have never encountered the aforementioned lockup on native armv7.
>>> 
>>> I have seen a usermode mutex lockup in an arm32 jail on aarch64, but only very rarely (once a month or so), and all my attempts to reproduce it in a more deterministic way have failed. Also, I don't think I've ever seen this with the debug version of libc.
>>> 
>>> Unfortunately, I also failed to reproduce the given lockup using dlopen_test.c, either on native armv7 or in an arm32 jail.
>>> 
>>> Michal Meloun
>> What is the output of:
>> # readelf -a /libexec/ld-elf.so.1 | grep -E "(^[^ 0-9]|.*_rtld_get_stack_prot)"
>> in your armv7 context(s)? Does it include the likes of:
>> QUOTE
>> Symbol table '.symtab' contains 911 entries:
>>  903: 000000000001b9ac    16 FUNC    GLOBAL DEFAULT   11 _rtld_get_stack_prot
>> END QUOTE
>> vs. not?
>> Note that the "debug version of libc" being involved likely means that
>> DEBUG_FLAGS was defined. That in turn likely means that strip is not
>> being used. In such a case, I expect that the .symtab entry for
>> _rtld_get_stack_prot (and more) exists for such a context.
> At this time, I have the standard (thus stripped, non-debug) version of the runtime linker library installed. Thus it has only a dynamic relocation record for _rtld_get_stack_prot:
> 
> root@tegra124:~/dlopen_test # readelf -a /libexec/ld-elf.so.1 | grep -E "(^[^ 0-9]|.*_rtld_get_stack_prot)"
> ELF Header:
> Elf file type is DYN (Shared object file)
> Entry point 0x1449c
> There are 10 program headers, starting at offset 52
> Program Headers:
> There are 23 section headers, starting at offset 0x1a448:
> Section Headers:
> Key to Flags:
> Dynamic section at offset 0x19fa4 contains 15 entries:
> Relocation section (.rel.dyn):
> r_offset r_info   r_type              st_value st_name
> Symbol table '.dynsym' contains 27 entries:
>     5: 000000000001ba0c    16 FUNC    GLOBAL DEFAULT   12 _rtld_get_stack_prot@@FBSDprivate_1.0 (11)
> Notes at offset 0x00000174 with length 0x00000018:
> Histogram for bucket list length (total of 6 buckets):
> Histogram for bucket list length (total of 27 buckets):
> Version symbol section (.gnu.version):
> Version definition section (.gnu.version_d):
> Attribute Section: aeabi
> 
> ------
> 
> root@tegra124:~/dlopen_test # ./dlopen_test
> root@tegra124:~/dlopen_test #
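
For comparing such readelf outputs across contexts, the following is
a rough sketch only. The helper name symtabs_with is hypothetical (it
is not part of any FreeBSD tool). It reads readelf output on stdin and
prints the name of each symbol table that lists the given symbol,
assuming the "Symbol table '...' contains N entries:" header format
shown in the outputs quoted above.

```shell
# Hypothetical helper: print the name of every symbol table in
# `readelf -a` (or `readelf -s`) output on stdin that lists symbol $1.
symtabs_with() {
    awk -v sym="$1" '
        # Header of a symbol table: remember the quoted table name.
        /^Symbol table / { split($0, q, "\047"); table = q[2]; next }
        # Any other unindented line starts a different readelf section.
        /^[^ 0-9]/ { table = ""; next }
        # Indented entry inside a symbol table that mentions the symbol.
        table != "" && index($0, sym) { print table }
    '
}
```

Example use:

# readelf -a /libexec/ld-elf.so.1 | symtabs_with _rtld_get_stack_prot

For a stripped ld-elf.so.1 like the one quoted above, this would be
expected to report only .dynsym. An unstripped one would be expected
to report .symtab as well.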

Just to be sure . . .

Did you at some point run "pkg install cairo" (or something analogous)
so that the following (or some vintage of it) was in place?

# ls -lodT /usr/local/lib/libcairo.so*
lrwxr-xr-x  1 root wheel -      21 Apr 29 19:45:15 2024 /usr/local/lib/libcairo.so -> libcairo.so.2.11704.0
lrwxr-xr-x  1 root wheel -      21 Apr 29 19:45:15 2024 /usr/local/lib/libcairo.so.2 -> libcairo.so.2.11704.0
-rwxr-xr-x  1 root wheel - 1118272 Apr 29 19:45:15 2024 /usr/local/lib/libcairo.so.2.11704.0

# file /usr/local/lib/libcairo.so.2.11704.0 
/usr/local/lib/libcairo.so.2.11704.0: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (FreeBSD), dynamically linked, for FreeBSD 15.0 (1500018), stripped

(Installing cairo would also install other things it needs.)

For the failing contexts, the a.out from dlopen_test.c will only
hang if the library (and what it requires) is actually there to
load.
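
A pre-flight check along the following lines could rule out the
"library not there" case before concluding that the hang does not
reproduce. This is only a sketch: the function name libs_present is
hypothetical, and the path in the example assumes that the test loads
libcairo.so.2 from /usr/local/lib.

```shell
# Hypothetical pre-flight check: succeed only if every listed library
# file exists and is readable; report any missing ones on stderr.
libs_present() {
    missing=0
    for lib in "$@"; do
        if [ ! -r "$lib" ]; then
            echo "missing: $lib" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Example use:

# libs_present /usr/local/lib/libcairo.so.2 && ./a.out

This only checks the files named on the command line. It does not
check what the library itself requires, so those dependencies would
have to be listed separately.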

===
Mark Millard
marklmi at yahoo.com