Re: Thread safety of getaddrinfo()/getnameinfo()

From: Paul Floyd <paulf2718_at_gmail.com>
Date: Wed, 26 Apr 2023 20:28:12 UTC

On 26-04-23 19:57, Felix Palmen wrote:

> So, is there a thread-safety issue with these functions, or a bug in
> valgrind, or maybe just some false positive?

A false positive is a bug.

No errors with ASan. DRD, like Helgrind, also generates errors.

I believe that getaddrinfo should be MT safe.

I just pushed a fix to the Valgrind repo. If you like, you can use it by
building Valgrind from source (see
https://valgrind.org/downloads/repository.html and then see
README.freebsd). Otherwise, Valgrind 3.21 is due for release this Friday,
28th April. I'll be bumping up the devel/valgrind version based on that
shortly after, so it should be available in a week or two.

Here are further details.

Using a debug build of libc I get (still on FreeBSD 13.1 amd64):


==7225== Possible data race during read of size 1 at 0x4A4F0D0 by thread #4
==7225== Locks held: none
==7225==    at 0x4974680: __sfp (lib/libc/stdio/findfp.c:135)
==7225==    by 0x4974D0B: fopen (lib/libc/stdio/fopen.c:62)
==7225==    by 0x4939363: _sethtent (lib/libc/net/getaddrinfo.c:2381)
==7225==    by 0x4939363: ??? (lib/libc/net/getaddrinfo.c:2502)
==7225==    by 0x494A14C: nsdispatch (lib/libc/net/nsdispatch.c:727)
==7225==    by 0x4937A68: explore_fqdn (lib/libc/net/getaddrinfo.c:1945)
==7225==    by 0x4937A68: getaddrinfo (lib/libc/net/getaddrinfo.c:576)
==7225==    by 0x201A2E: resolve (hak.c:23)
==7225==    by 0x485B756: mythread_wrapper (hg_intercepts.c:406)
==7225==    by 0x4C7F839: ??? (in /lib/libthr.so.3)
==7225==
==7225== This conflicts with a previous write of size 1 by thread #2
==7225== Locks held: none
==7225==    at 0x4974714: __sfp (lib/libc/stdio/findfp.c:146)
==7225==    by 0x4974D0B: fopen (lib/libc/stdio/fopen.c:62)
==7225==    by 0x4939363: _sethtent (lib/libc/net/getaddrinfo.c:2381)
==7225==    by 0x4939363: ??? (lib/libc/net/getaddrinfo.c:2502)
==7225==    by 0x494A14C: nsdispatch (lib/libc/net/nsdispatch.c:727)
==7225==    by 0x4937A68: explore_fqdn (lib/libc/net/getaddrinfo.c:1945)
==7225==    by 0x4937A68: getaddrinfo (lib/libc/net/getaddrinfo.c:576)
==7225==    by 0x201A2E: resolve (hak.c:23)
==7225==    by 0x485B756: mythread_wrapper (hg_intercepts.c:406)
==7225==    by 0x4C7F839: ??? (in /lib/libthr.so.3)
==7225==  Address 0x4a4f0d0 is in the BSS segment of /usr/home/paulf/build/src/obj/usr/home/paulf/build/src/amd64.amd64/lib/libc/libc.so.7.full

The code in question is

	STDIO_THREAD_LOCK();
	for (g = &__sglue; g != NULL; g = g->next) {
		for (fp = g->iobs, n = g->niobs; --n >= 0; fp++)
HERE=>			if (fp->_flags == 0)
				goto found;
	}
	STDIO_THREAD_UNLOCK();	/* don't hold lock while malloc()ing. */

and

	STDIO_THREAD_LOCK();	/* reacquire the lock */
	SET_GLUE_PTR(lastglue->next, g); /* atomically append glue to list */
	lastglue = g;	/* not atomic; only accessed when locked */
	fp = g->iobs;
found:
HERE=>	fp->_flags = 1;	/* reserve this slot; caller sets real flags */
	STDIO_THREAD_UNLOCK();


These lock macros use spinlocks. The problem is that Valgrind (both 
Helgrind and DRD) doesn't recognize any locking mechanisms other than 
pthreads and Qt threads.

That means that the Valgrind tools fall back on the suppression 
mechanism for all libthr and libc internal locks like this.
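
To illustrate the pattern, here is a minimal sketch. It is not the libc
code: the names table_lock, slot_flags, reserve_slot and run_demo are made
up for this example, and the C11 atomic_flag spinlock is only an assumed
stand-in for what STDIO_THREAD_LOCK() does. The slot table is accessed
only while the spinlock is held, so the code is correct, but since the
lock is not a pthread primitive I would expect Helgrind and DRD to
report the accesses to slot_flags as a possible race.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NSLOTS     64
#define NTHREADS   4
#define PER_THREAD 8

/* Hypothetical stand-in for the stdio lock: a C11 test-and-set spinlock. */
static atomic_flag table_lock = ATOMIC_FLAG_INIT;

/* Shared table, 0 = free, 1 = reserved (plays the role of fp->_flags). */
static unsigned char slot_flags[NSLOTS];

void table_lock_acquire(void)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&table_lock,
	    memory_order_acquire))
		;
}

void table_lock_release(void)
{
	atomic_flag_clear_explicit(&table_lock, memory_order_release);
}

/* Same shape as __sfp(): scan for a free slot and reserve it under the lock. */
int reserve_slot(void)
{
	int found = -1;

	table_lock_acquire();
	for (int i = 0; i < NSLOTS; i++) {
		if (slot_flags[i] == 0) {	/* read that a tool may flag */
			slot_flags[i] = 1;	/* write that a tool may flag */
			found = i;
			break;
		}
	}
	table_lock_release();
	return found;
}

void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < PER_THREAD; i++) {
		int slot = reserve_slot();
		assert(slot >= 0);	/* the table is large enough here */
	}
	return NULL;
}

/* Starts the workers and returns how many slots ended up reserved. */
int run_demo(void)
{
	pthread_t tid[NTHREADS];
	int reserved = 0;

	for (int i = 0; i < NSLOTS; i++)
		slot_flags[i] = 0;
	for (int i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);
	for (int i = 0; i < NSLOTS; i++)
		reserved += slot_flags[i];
	return reserved;
}
```

Calling run_demo() from a main() and building with -pthread should
reserve NTHREADS * PER_THREAD distinct slots. Replacing the spinlock with
a pthread_mutex_t is the kind of change that lets the tools see the
ordering.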

A+
Paul