Re: git: 9a2ae72421cd - main - libthr: switch thread and sleepq memory allocator to crt from libc malloc

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Tue, 14 Jan 2025 22:19:08 UTC

On Tue, Jan 14, 2025 at 03:42:52PM -0500, Mark Johnston wrote:
> On Tue, Jan 14, 2025 at 05:55:15PM +0000, Konstantin Belousov wrote:
> > The branch main has been updated by kib:
> >
> > URL: https://cgit.FreeBSD.org/src/commit/?id=9a2ae72421cd75c741984f63b8c9ee89346a188d
> >
> > commit 9a2ae72421cd75c741984f63b8c9ee89346a188d
> > Author:     Konstantin Belousov <kib@FreeBSD.org>
> > AuthorDate: 2025-01-14 09:06:58 +0000
> > Commit:     Konstantin Belousov <kib@FreeBSD.org>
> > CommitDate: 2025-01-14 17:55:08 +0000
> >
> >     libthr: switch thread and sleepq memory allocator to crt from libc malloc
> >
> >     There are more complex interactions between malloc and libthr
> >     initialization that can happen if libthr functions are called from ELF
> >     objects' constructors, before libthr is initialized.  Break the
> >     dependency loop by using the private allocator with controlled init.
> >
> >     Reported by:    yuri
> >     Reviewed by:    markj, olce
> >     Sponsored by:   The FreeBSD Foundation
> >     MFC after:      1 week
> >     Differential revision:  https://reviews.freebsd.org/D48454
> 
> I see some startup deadlock when running the googletest regression tests
> (/usr/tests/lib/googletest/gmock_main) after this commit.  gdb (which
> itself also hangs due to this bug) shows:
> 
> (gdb) bt
> #0  _umtx_op_err () at /home/markj/sb/main/src/lib/libsys/amd64/_umtx_op_err.S:38
> #1  0x000015e1ba96fd2c in __thr_umutex_lock (mtx=0x15e1ba974468, id=100113) at /usr/src/lib/libthr/thread/thr_umtx.c:69
> #2  0x000015e1ba966a41 in __thr_calloc (num=1, size=17) at /usr/src/lib/libthr/thread/thr_malloc.c:92
> #3  0x000015e1ba969213 in mutex_init (mutex=warning: (Internal error: pc 0x15e1bd5c0240 in read in CU, but not in symtab.)
> warning: (Error: pc 0x15e1bd5c0240 in address map, but not in symtab.)

The following fixed the issue for me.  I am somewhat surprised that the
problem did not manifest itself before.

commit 783d95d0d6e6e508705cf16cfd9e4a5e2f8db8e4
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Wed Jan 15 00:11:48 2025 +0200

    libpthread_init(): ensure curthread == NULL until set explicitly
    
    Otherwise libthr::_get_curthread() returns garbage left over from
    allocate_initial_tls(), until libthr initialization proceeds far enough
    to set the initial pcb->pcb_thread.  The garbage pcb_thread was
    dereferenced as a struct pthread and some memory was read as the TID.
    Since the value was not consistent between reads, the thr_malloc_umtx
    unlock returned EPERM instead of clearing the lock word.
    
    Reported by:    markj
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week

diff --git a/lib/libthr/thread/thr_init.c b/lib/libthr/thread/thr_init.c
index 708c425d69c1..e5e438897dee 100644
--- a/lib/libthr/thread/thr_init.c
+++ b/lib/libthr/thread/thr_init.c
@@ -334,6 +334,7 @@ _libpthread_init(struct pthread *curthread)
 	/* Set the initial thread. */
 	if (curthread == NULL) {
 		first = 1;
+		_tcb_get()->tcb_thread = NULL;
 		/* Create and initialize the initial thread. */
 		curthread = _thr_alloc(NULL);
 		if (curthread == NULL)
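
To illustrate the failure mode described in the commit message, here is a
toy model of an owner-checked lock.  It is only a sketch, not libthr code:
struct toy_umutex, toy_trylock(), toy_unlock() and toy_demo() are
hypothetical names, and a non-blocking trylock stands in for the real
blocking lock so that the stuck state is reported as EBUSY instead of
hanging.  The point is that if the TID seen at unlock time differs from the
one stored at lock time, the unlock fails with EPERM and the lock word is
never cleared.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical toy lock: owner holds the owning TID, 0 means unlocked. */
struct toy_umutex {
	uint32_t owner;
};

/* Take the lock for tid, or report EBUSY if the lock word is non-zero. */
static int
toy_trylock(struct toy_umutex *m, uint32_t tid)
{
	if (m->owner != 0)
		return (EBUSY);
	m->owner = tid;
	return (0);
}

/* Clear the lock word only if the caller's TID matches the stored owner. */
static int
toy_unlock(struct toy_umutex *m, uint32_t tid)
{
	if (m->owner != tid)
		return (EPERM);
	m->owner = 0;
	return (0);
}

/*
 * Lock with one TID, unlock with another (as happens when the TID is read
 * through a garbage thread pointer), then try to lock again.  The unlock
 * result is stored in *unlock_error; the return value is the result of the
 * second lock attempt.
 */
static int
toy_demo(uint32_t tid_at_lock, uint32_t tid_at_unlock, int *unlock_error)
{
	struct toy_umutex m = { .owner = 0 };
	int error;

	error = toy_trylock(&m, tid_at_lock);
	assert(error == 0);
	*unlock_error = toy_unlock(&m, tid_at_unlock);
	/* This fails only if the lock word was left set above. */
	return (toy_trylock(&m, tid_at_lock));
}
```

With matching TIDs, e.g. toy_demo(100113, 100113, &e), both the unlock and
the second lock succeed.  With mismatched TIDs the unlock reports EPERM and
the second lock reports EBUSY, which in the real blocking case corresponds
to the hang in __thr_umutex_lock() seen in the backtrace.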