atomic ops

Wed Oct 29 18:03:54 UTC 2014

On Wed, 2014-10-29 at 13:35 -0400, John Baldwin wrote:
> On Wednesday, October 29, 2014 12:58:15 pm Ian Lepore wrote:
> > On Wed, 2014-10-29 at 10:59 -0400, John Baldwin wrote:
> > > Eh, that isn't broken.  It is subtle however.  The reason it isn't broken
> > > is that if any access to P occurs afer the 'load P', then the store will
> > > fail and the load-acquire will be retried, if A was accessed during the
> > > atomi op, the load-acquire during the try will discard that and force A
> > > to be re-accessed.  If P is not accessed during the atomic op, then it is
> > > safe to access A during the atomic op itself.
> > > 
> > 
> > I'm not sure I completely agree with all of this. 
> > 
> > First, for 
> > 
> >         if any access to P occurs afer the 'load P', then the store will
> >         fail and the load-acquire will be retried
> > 
> > The term 'access' needs to be changed to 'store'.  Other read accesses
> > to P will not cause the store-exclusive to fail.
> 
> Correct, though for the places where acquire is used I believe that is ok.
> Certainly for lock cookies it is ok.  It's writes to the lock cookie that
> would invalidate 'A'.
> 
> > Next, when we consider 'Access A' I'm not sure it's true that the access
> > will replay if the store-exclusive fails and the operation loops.  The
> > access to A may have been a prefetch, even a prefetch for data on a
> > predicted upcoming execution branch which may or may not end up being
> > taken.
> > 
> > I think the only think that makes an ldrex/strex sequence safe for use
> > in implementing synchronization primitives is to insert a 'dmb' after
> > the acquire loop (after the strex succeeds), and 'dsb' before the
> > release loop (dsb is required for SMP, dmb might be good enough on UP).
> > 
> > Looking into this has made me realize our current armv6/7 atomics are
> > incorrect in this regard.  Guess I'll see about fixing them up Real Soon
> > Now.  :)
> 
> I'm not actually sure either, but it would be surprising to me otherwise.
> Presumably there is nothing magic about a branch.  Either the load-acquire
> is an acquire barrier or it isn't.  Namely, suppose you had this sequence:
> 
> 	load-acquire P
> 	access A (prefetch)
> 	load-acquire Q
> 	load A
> 
> Would you expect the prefetch to satisfy the load or should the load-acquire
> on Q discard that?  Having a branch after a failing conditional store back
> to the load acquire should work similarly.  It has to discard anything that
> was prefetched or it isn't an actual load-acquire.
> 
> That is consider:
> 
> 1:
> 	load-acquire P
> 	access A (prefetch)
> 	conditonal-store P
> 	branch-if-fail 1b
> 	load A
> 
> In the case that the branch fails, the sequence of operations is:
> 
> 	load-acquire P
> 	access A (prefetch)
> 	conditional-store P
> 	branch
> 	load-acquire P
> 
> That should be equivalent to the first sequence above unless the branch
> instruction has the magical property of disabling memory barriers on the
> instruction after a branch (which would be insane).
> 

I hadn't realized it when I wrote that, but Andy was speaking in the
context of armv8, which has a true load-acquire instruction.  In our
current code (armv6 and 7) we need the explicit dmb/dsb barriers to get
the same effect.  (It turns out we do have barriers, I misspoke earlier,
but some of our dmb need to be dsb.)

-- Ian