Re: armv7-on-aarch64 stuck at urdlck

From: Michal Meloun <meloun.michal_at_gmail.com>
Date: Tue, 23 Jul 2024 17:46:51 UTC

On 23.07.2024 11:36, Konstantin Belousov wrote:
> On Tue, Jul 23, 2024 at 09:53:41AM +0200, Michal Meloun wrote:
>> The good news is that I'm finally able to generate a working/locking
>> test case.  The culprit (at least for me) is if "-mcpu" is used when
>> compiling libthr (e.g. indirectly injected via CPUTYPE in /etc/make.conf).
>> If it is not used, libthr is broken (regardless of -O level or debug/normal
>> build), but -mcpu=cortex-a15 will always produce a working libthr.
> 
> I think this is very significant progress.
> 
> Do you plan to drill down more to see what is going on?

So the problem is now clear, and I fear it may apply to other 
architectures as well.
dlopen_object() (from rtld_elf),
https://cgit.freebsd.org/src/tree/libexec/rtld-elf/rtld.c#n3766,
holds the rtld_bind_lock write lock for almost the entire time a new 
library is loaded.
If the code that runs while the library is being loaded calls a symbol 
that is not yet resolved, the _rtld_bind() function attempts to take a 
read lock on rtld_bind_lock and a deadlock occurs.

In this case, it is round_up() in _thr_stack_fix_protection,
https://cgit.freebsd.org/src/tree/lib/libthr/thread/thr_stack.c#n136.
The division there is compiled into a call to __aeabi_uidiv (since not 
all armv7 processors support HW divide).
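
For illustration, a rough sketch of how such a helper call can appear. 
This is not the libthr source; the function names and the explicit page 
size parameter are assumptions made for the example.  On an armv7 target 
built without hardware divide, a division by a value that is not a 
compile-time constant is typically lowered to a runtime helper such as 
__aeabi_uidiv, while a mask-based form needs no helper but is only valid 
for power-of-two sizes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical rounding helper using division.  With a run-time divisor
 * and no hardware divide instruction, the compiler emits a call to a
 * library routine for the '/' below.
 */
size_t
round_up_div(size_t size, size_t page)
{
	if (size % page != 0)
		size = ((size / page) + 1) * page;
	return (size);
}

/*
 * Hypothetical variant using a mask.  No division is involved, so no
 * helper call is needed, but 'page' must be a power of two.
 */
size_t
round_up_mask(size_t size, size_t page)
{
	assert(page != 0 && (page & (page - 1)) == 0);
	return ((size + page - 1) & ~(page - 1));
}
```

Whether avoiding the helper in this one place is a suitable fix is a 
separate question, since the compiler may emit such calls elsewhere.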

Unfortunately, I'm not sure how to fix it.  The compiler can emit 
__aeabi_<> calls anywhere, and I'm not sure whether all the symbols 
used by rtld_elf and libthr can be resolved beforehand.


Michal