Re: Is powerpc64 atomic_load_acq_##TYPE omitting isync believed correct?

From: Mark Millard via freebsd-hackers <freebsd-hackers_at_freebsd.org>
Date: Mon, 31 May 2021 11:23:04 UTC

On 2021-May-29, at 23:04, Mark Millard <marklmi at yahoo.com> wrote:

> In the code from /usr/include/machine/atomic.h for powerpc64
> and powerpc there is:
> 
> #define ATOMIC_STORE_LOAD(TYPE)                                 \
> static __inline u_##TYPE                                        \
> atomic_load_acq_##TYPE(volatile u_##TYPE *p)                    \
> {                                                               \
>        u_##TYPE v;                                             \
>                                                                \
>        v = *p;                                                 \
>        powerpc_lwsync();                                       \
>        return (v);                                             \
> }                                                               \
>                                                                \
> static __inline void                                            \
> atomic_store_rel_##TYPE(volatile u_##TYPE *p, u_##TYPE v)       \
> {                                                               \
>                                                                \
>        powerpc_lwsync();                                       \
>        *p = v;                                                 \
> }
> 
> This code sequence does not involve isync, unlike:
> 
> #define __ATOMIC_ACQ()  __asm __volatile("isync" : : : "memory")
> 
> What justifies this? All the reference material I've
> found for C++/C11 semantics agrees with:
> 
> https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
> 
> that shows (organized here to compare Relaxed vs.
> Acquire and Release):
> 
> powerpc Load  Relaxed vs. Acquire: ld vs. ld;cmp;bc;isync
> powerpc Fence:            Acquire: lwsync
> powerpc Store Relaxed vs. Release: st vs. "Fence: Release";st
> powerpc Fence:            Release: lwsync
> 
> lwsync does not order prior stores vs. later loads, isync does
> (and more in some respects). That likely (partially) explains
> why load-acquire does not use just an acquire-fence in such
> materials.
> 
> Is this a problem for being correct for "synchronizes with" in
> "man atomic"? For the acquire operation reading the value
> written by the release operation:
> 
> QUOTE
>     . . . the effects of all
>     prior stores by the releasing thread must become visible to subsequent
>     loads by the acquiring thread
> END QUOTE
> 
> It seems that some later loads could be moved by the hardware
> to be too early relative to various such prior stores (as seen
> in the load-acquire thread): no constraint is placed for such
> relationships by the atomic_load_acq_##TYPE as far as I can see.
> 
> 
> (I got into this by finding some code that uses an
> atomic_store_rel_##TYPE without, so far as I found, any matching
> use of atomic_load_acq_##TYPE, atomic_thread_fence_acq, or the
> like. But looking around for a justification for such code
> generated more questions, such as the ones in this note.)

Never mind. I figured out my significant confusion in
interpretation. (Net result: lwsync is more than
sufficient.)
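
One way to read the lwsync-based sequence in C11 terms, as a rough
sketch only: in the mapping table cited above, the acquire fence and
the release fence both map to lwsync on powerpc, so "load; lwsync"
looks like a relaxed load followed by an acquire fence, and
"lwsync; store" looks like a release fence followed by a relaxed
store. C11 lets such a fence pair establish "synchronizes with" when
the load reads the stored value. The helper names below are
hypothetical and use <stdatomic.h>; they are not the FreeBSD macros.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not from atomic.h):
 * relaxed load followed by an acquire fence, the C11 shape
 * that corresponds to "v = *p; lwsync" on powerpc.
 */
static inline uint32_t
sketch_load_acq_u32(_Atomic uint32_t *p)
{
	uint32_t v;

	v = atomic_load_explicit(p, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	return (v);
}

/*
 * Hypothetical helper (assumption, not from atomic.h):
 * release fence followed by a relaxed store, the C11 shape
 * that corresponds to "lwsync; *p = v" on powerpc.
 */
static inline void
sketch_store_rel_u32(_Atomic uint32_t *p, uint32_t v)
{
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(p, v, memory_order_relaxed);
}
```

In the usual message-passing pattern, the writer fills in ordinary
data and then calls sketch_store_rel_u32() on a flag; a reader that
sees the flag via sketch_load_acq_u32() may then read that data.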


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)