what goes wrong with barrier free atomic_load/store?
Stephan Uphoff
ups at tree.com
Wed Apr 20 15:13:16 PDT 2005
On Wed, 2005-04-20 at 16:39, John Giacomoni wrote:
> in reading /src/sys/i386/include/atomic.h
>
> I found this comment and I'm having trouble understanding what the
> problem being
> referred to below is.
>
> /*
> * We assume that a = b will do atomic loads and stores. However, on a
> * PentiumPro or higher, reads may pass writes, so for that case we have
> * to use a serializing instruction (i.e. with LOCK) to do the load in
> * SMP kernels. For UP kernels, however, the cache of the single
> processor
> * is always consistent, so we don't need any memory barriers.
> */
>
> can someone give me an example of a situation where one needs to use
> memory barriers to ensure "correctness" when doing writes as above?
volatile int status = NOT_READY;
volatile int data = -1;

Thread 1: (CPU 0)
----------
data = 123;
status = READY;

Thread 2: (CPU 1)
---------
if (status == READY) {
        my_data = data;
}
Read reordering by the CPUs may cause the following interleaving:
Thread 2: out_of_order_read = data;
Thread 1: data = 123;
Thread 1: status = READY;
Thread 2: if (status == READY) {
Thread 2: my_data = out_of_order_read; /* XXXX Unexpected VALUE */
Basically volatile does not work as expected.
> the examples I can come up with seem to boil down to requiring locks
> or accepting stale values, given that without a synchronization
> mechanism
> one shouldn't expect two processes to act in any specific order.
The problem is that writes from another CPU (or DMA device) can be
observed out of order.
> In my case I can accept reading a stale value so I'm not understanding
> the
> purpose of only having atomic_load/atomic_store wrappers with memory
> barriers.
>
> I saw a brief discussion where someone proposed barrier free load/store
> but
> don't think I saw any resolution.
Do you mean load/store fences?
A load fence could solve the problem above by preventing the out of
order read of the data by thread 2.
I actually found a race condition close to the one mentioned above in
the kernel yesterday. So we may need to add fences real soon or rewrite
the code to use a spin mutex.
Stephan