atomic ops
John Baldwin
jhb at freebsd.org
Thu Oct 30 19:05:47 UTC 2014
On Thursday, October 30, 2014 2:10:48 pm Andrew Turner wrote:
> On Wed, 29 Oct 2014 13:35:57 -0400
> John Baldwin <jhb at freebsd.org> wrote:
> > On Wednesday, October 29, 2014 12:58:15 pm Ian Lepore wrote:
> > > Next, when we consider 'Access A' I'm not sure it's true that the
> > > access will replay if the store-exclusive fails and the operation
> > > loops. The access to A may have been a prefetch, even a prefetch
> > > for data on a predicted upcoming execution branch which may or may
> > > not end up being taken.
> > >
> > > I think the only thing that makes an ldrex/strex sequence safe for
> > > use in implementing synchronization primitives is to insert a 'dmb'
> > > after the acquire loop (after the strex succeeds), and 'dsb' before
> > > the release loop (dsb is required for SMP, dmb might be good enough
> > > on UP).
> > >
> > > Looking into this has made me realize our current armv6/7 atomics
> > > are incorrect in this regard. Guess I'll see about fixing them up
> > > Real Soon Now. :)
> >
> > I'm not actually sure either, but it would be surprising to me
> > otherwise. Presumably there is nothing magic about a branch. Either
> > the load-acquire is an acquire barrier or it isn't. Namely, suppose
> > you had this sequence:
> >
> > load-acquire P
> > access A (prefetch)
> > load-acquire Q
> > load A
> >
> > Would you expect the prefetch to satisfy the load or should the
> > load-acquire on Q discard that? Having a branch after a failing
> > conditional store back to the load acquire should work similarly. It
> > has to discard anything that was prefetched or it isn't an actual
> > load-acquire.
>
> I have checked with someone in ARM. The prefetch should not be
> considered an access with regard to the barrier and it could be moved
> before it as it will only load data into the cache. The barrier only
> deals with loading data into the core, i.e. if it was part of the
> prefetch it will be loaded from the cache no earlier than the
> load-acquire. The cache coherency protocol ensures the data will be up
> to date while the barrier will ensure the ordering of the load of A.
>
> In the above example the prefetch of A will not be thrown away but the
> data in the cache may change between the prefetch and load A if another
> core has written to A. If this is the case the load will be of the new
> data.
That is sufficient for what atomic(9)'s _acq wants, yes.
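
For illustration, the four-step sequence quoted above can be sketched in
portable C11. This is only a rough sketch under stated assumptions, not the
armv6/7 implementation: C11 <stdatomic.h> stands in for atomic(9) (an acquire
load is roughly atomic_load_acq_int(), a release store roughly
atomic_store_rel_int()), __builtin_prefetch() is a GCC/Clang extension used
here as the "access A (prefetch)" step, and the names p_flag, q_flag, a_data,
publish_a() and read_a() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names; C11 atomics stand in for atomic(9) here. */
static atomic_int p_flag;   /* "P" */
static atomic_int q_flag;   /* "Q": set once A has been published */
static int a_data;          /* "A": plain, non-atomic data */

/* Writer: store A, then release-store Q so the store to A is visible first. */
static void
publish_a(int value)
{
    a_data = value;
    atomic_store_explicit(&q_flag, 1, memory_order_release);
}

/* Reader: returns 1 and fills *out if Q was observed set, otherwise 0. */
static int
read_a(int *out)
{
    /* load-acquire P */
    (void)atomic_load_explicit(&p_flag, memory_order_acquire);

    /* access A (prefetch): a cache hint only, not a load into the core */
    __builtin_prefetch(&a_data);

    /* load-acquire Q */
    if (atomic_load_explicit(&q_flag, memory_order_acquire) == 0)
        return (0);

    /* load A: ordered after the load-acquire of Q */
    *out = a_data;
    return (1);
}
```

A reader that sees Q set is guaranteed to load the published value of A, even
though the prefetch was issued before the load-acquire of Q. The prefetch only
warms the cache; the final load still happens after the acquire, and cache
coherency keeps the line current.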
--
John Baldwin
More information about the freebsd-arch
mailing list