Undefined reference to __atomic_store_8
Tijl Coosemans
tijl at FreeBSD.org
Wed Aug 12 11:42:04 UTC 2020
On Wed, 12 Aug 2020 09:44:25 +0400 Gleb Popov <arrowd at freebsd.org> wrote:
> On Wed, Aug 12, 2020 at 9:21 AM Gleb Popov <arrowd at freebsd.org> wrote:
>> Indeed, this looks like a culprit! When compiling using first command line
>> (the long one) I get following warnings:
>>
>> /wrkdirs/usr/ports/lang/ghc/work/ghc-8.10.1/libraries/ghc-prim/cbits/atomic.c:369:10:
>> warning: misaligned atomic operation may incur significant performance
>> penalty [-Watomic-alignment]
>> return __atomic_load_n((StgWord64 *) x, __ATOMIC_SEQ_CST);
>> ^
>> /wrkdirs/usr/ports/lang/ghc/work/ghc-8.10.1/libraries/ghc-prim/cbits/atomic.c:417:3:
>> warning: misaligned atomic operation may incur significant performance
>> penalty [-Watomic-alignment]
>> __atomic_store_n((StgWord64 *) x, (StgWord64) val, __ATOMIC_SEQ_CST);
>> ^
>> 2 warnings generated.
>>
>> I guess this basically means "I'm emitting a call there". So, what's the
>> correct fix in this case?
>
> I just noticed that Clang emits these warnings (and the call instruction)
> only for functions handling StgWord64 type. For the same code with
> StgWord32, like
>
> StgWord
> hs_atomicread32(StgWord x)
> {
> #if HAVE_C11_ATOMICS
> return __atomic_load_n((StgWord32 *) x, __ATOMIC_SEQ_CST);
> #else
> return __sync_add_and_fetch((StgWord32 *) x, 0);
> #endif
> }
>
> no warning is emitted as well as no call.
>
> How does clang infer alignment in these cases? What's so special about
> StgWord64?
StgWord64 is uint64_t, which is unsigned long long, which is only
4-byte aligned on i386. Clang wants 8-byte alignment before it will
use the fildll instruction; without that guarantee it emits the
warning and a call instead.

You could change the definition of the StgWord64 type to look like:

    typedef uint64_t StgWord64 __attribute__((aligned(8)));

But this only works if all calls to hs_atomicread64 pass a StgWord64
as argument and not some other 64-bit value.
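
As a rough sketch of the typedef approach (StgWord64Aligned and
atomicread64_aligned are made-up names for illustration, not GHC's
own), the attribute raises the alignment the compiler may assume for
any pointer to that type:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GHC's StgWord64 with the attribute added. */
typedef uint64_t StgWord64Aligned __attribute__((aligned(8)));

/* The typedef reports 8-byte alignment on every target, including i386,
   where plain uint64_t is only 4-byte aligned. */
static_assert(_Alignof(StgWord64Aligned) == 8,
              "over-aligned typedef must be 8-byte aligned");

/* Hypothetical variant of hs_atomicread64 taking a typed pointer, so the
   compiler can see the promised alignment at the atomic load. */
uint64_t atomicread64_aligned(StgWord64Aligned *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}
```

The promise is only as good as the callers: casting the address of an
ordinary 4-byte-aligned 64-bit value to this type would still compile,
but the alignment assumption would then be false.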

Another solution I already mentioned in a previous message: replace
HAVE_C11_ATOMICS with 0 in hs_atomicread64 so it uses
__sync_add_and_fetch instead of __atomic_load_n. That uses the
cmpxchg8b instruction, which doesn't care about alignment. It's much
slower, but I guess 64-bit atomic loads are rare enough that this
doesn't matter much.
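
A minimal sketch of that fallback, assuming a GCC-compatible compiler
with the legacy __sync builtins (atomicread64_sync and its self-check
are hypothetical names, not the GHC functions): adding zero returns the
current value atomically and leaves it unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: atomic 64-bit read done as a read-modify-write
   that adds zero, mirroring the #else branch of hs_atomicread32. */
uint64_t atomicread64_sync(uint64_t *p)
{
    return __sync_add_and_fetch(p, 0);
}

/* Small self-check: the read returns the stored value and does not
   modify it. */
void check_atomicread64_sync(void)
{
    uint64_t v = 0x0123456789abcdefULL;
    assert(atomicread64_sync(&v) == 0x0123456789abcdefULL);
    assert(v == 0x0123456789abcdefULL);
}
```

Because this is a read-modify-write rather than a plain load, it needs
the memory to be writable and costs more than an inlined load, which is
the trade-off described above.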