Date: Wed, 24 Jul 2024 13:50:18 +0300
From: Konstantin Belousov
To: mmel@freebsd.org
Cc: John F Carr, Mark Millard, FreeBSD Current, "freebsd-arm@freebsd.org"
Subject: Re: armv7-on-aarch64 stuck at urdlck
References: <33251aa3-681f-4d17-afe9-953490afeaf0@gmail.com> <0DD19771-3AAB-469E-981B-1203F1C28233@yahoo.com> <6a969609-fa0e-419d-83d5-e4fcf0f6ec35@freebsd.org>
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm
Sender: owner-freebsd-arm@FreeBSD.org

On Wed, Jul 24, 2024 at 12:34:57PM +0200, mmel@freebsd.org wrote:
>
>
> On 24.07.2024 12:24, Konstantin Belousov wrote:
> > On Tue, Jul 23, 2024 at 08:11:13PM +0000, John F Carr wrote:
> > > On Jul 23, 2024, at 13:46, Michal Meloun wrote:
> > > >
> > > > On 23.07.2024 11:36, Konstantin Belousov wrote:
> > > > > On Tue, Jul 23, 2024 at 09:53:41AM +0200, Michal Meloun wrote:
> > > > > > The good news is that I'm finally able to generate a working/locking
> > > > > > test case.  The culprit (at least for me) is if "-mcpu" is used when
> > > > > > compiling libthr (e.g. indirectly injected via CPUTYPE in /etc/make.conf).
> > > > > > If it is not used, libthr is broken (regardless of -O level or debug/normal
> > > > > > build), but -mcpu=cortex-a15 will always produce a working libthr.
> > > > > I think this is very significant progress.
> > > > > Do you plan to drill down more to see what is going on?
> > > >
> > > > So the problem is now clear, and I fear it may apply to other architectures as well.
> > > > dlopen_object() (from rtld_elf),
> > > > https://cgit.freebsd.org/src/tree/libexec/rtld-elf/rtld.c#n3766,
> > > > holds the rtld_bind_lock write lock for almost the entire time a new library is loaded.
> > > > If the code uses a yet unresolved symbol to load the library, the rtl_bind() function attempts to get read lock of rtld_bind_lock and a deadlock occurs.
> > > >
> > > > In this case, it round_up() in _thr_stack_fix_protection,
> > > > https://cgit.freebsd.org/src/tree/lib/libthr/thread/thr_stack.c#n136.
> > > > Issued by __aeabi_uidiv (since not all armv7 processors support HW divide).
> > > >
> > > > Unfortunately, I'm not sure how to fix it.  The compiler can emit __aeabi_<> in any place, and I'm not sure if it can resolve all the symbols used by rtld_elf and libthr beforehand.
> > > >
> > > >
> > > > Michal
> > > >
> > >
> > > In this case (but not for all _aeabi_ functions) we can avoid division
> > > as long as page size is a power of 2.
> > >
> > > The function is
> > >
> > > static inline size_t
> > > round_up(size_t size)
> > > {
> > > 	if (size % _thr_page_size != 0)
> > > 		size = ((size / _thr_page_size) + 1) *
> > > 		    _thr_page_size;
> > > 	return size;
> > > }
> > >
> > > The body can be condensed to
> > >
> > > 	return (size + _thr_page_size - 1) & ~(_thr_page_size - 1);
> > >
> > > This is shorter in both lines of code and instruction bytes.
> >
> > Let's not allow this to be lost.  Could anybody confirm that the patch
> > below fixes the issue?
> >
> > commit d560f4f6690a48476565278fd07ca131bf4eeb3c
> > Author: Konstantin Belousov
> > Date:   Wed Jul 24 13:17:55 2024 +0300
> >
> >     rtld: avoid division in __thr_map_stacks_exec()
> >     The function is called by rtld with the rtld bind lock write-locked,
> >     when fixing the stack permission during dso load.  Not every ARMv7 CPU
> >     supports the div, which causes the recursive entry into rtld to resolve
> >     the __aeabi_uidiv symbol, causing self-lock.
> >     Workaround the problem by using roundup2() instead of open-coding less
> >     efficient formula.
> >     Diagnosed by:	mmel
> >     Based on submission by:	John F Carr
> >     Sponsored by:	The FreeBSD Foundation
> >     MFC after:	1 week
>
> Just realized that it is wrong.  Stack size is user-controlled and it does not need to be power of two.
> For final resolving of deadlocks, after a full day of digging, I'm very much
> inclined to add -znow to the linker flags for libthr.so (and maybe also
> for ld-elf.so).  The runtime cost of resolving all symbols at startup is very
> low.
> Direct pre-solving in _thr_rtld_init() is problematic for the _aeabi_*
> symbols, since they don't have official C prototypes, and some are not
> compatible with C calling conventions.

I do not like it.  `-z now' changes (breaks) the ABI and makes some symbols
not preemptible.  In the worst case, we would need a call to the asm routine
which causes the resolution of the _aeabi_* symbols on arm.

>
> Warner, Konstantin, could you please comment on this?
>
>
> Michal
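
For reference, a minimal sketch of the two rounding forms quoted above, compared side by side.  It is not the committed patch.  The page size is passed as a parameter instead of using the libthr global _thr_page_size, and the names round_up_div(), round_up_mask() and check_round_up_equivalence() are hypothetical, chosen only for this illustration.  The mask form assumes only that the page size is a non-zero power of two.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Division-based form, following the round_up() quoted in the thread.
 * Assumption: page_size is a parameter here, not the libthr global.
 */
static size_t
round_up_div(size_t size, size_t page_size)
{
	if (size % page_size != 0)
		size = ((size / page_size) + 1) * page_size;
	return (size);
}

/*
 * Mask-based form suggested in the thread.  It needs no division, but
 * it is only correct when page_size is a non-zero power of two.
 */
static size_t
round_up_mask(size_t size, size_t page_size)
{
	return ((size + page_size - 1) & ~(page_size - 1));
}

/*
 * Hypothetical helper: assert that both forms agree for every size in
 * [0, max_size] for the given power-of-two page size.
 */
static void
check_round_up_equivalence(size_t page_size, size_t max_size)
{
	size_t size;

	/* The mask trick requires a non-zero power-of-two page size. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	for (size = 0; size <= max_size; size++)
		assert(round_up_div(size, page_size) ==
		    round_up_mask(size, page_size));
}
```

Calling check_round_up_equivalence(4096, 3 * 4096) walks every size up to three pages and aborts on the first mismatch.  Whether a given compiler emits a call to __aeabi_uidiv for the division form depends on the target CPU flags, as discussed earlier in the thread.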