svn commit: r221614 - projects/largeSMP/sys/powerpc/include
Attilio Rao
attilio at freebsd.org
Sun May 8 14:36:24 UTC 2011
2011/5/8 Attilio Rao <attilio at freebsd.org>:
> 2011/5/8 Bruce Evans <brde at optusnet.com.au>:
>> On Sun, 8 May 2011, Attilio Rao wrote:
>>
>>> Log:
>>> All architectures define the size-bounded types (uint32_t, uint64_t,
>>> etc.)
>>> starting from base C types (int, long, etc).
>>> That is also reflected when building atomic operations, as the
>>> size-bounded types are built from the base C types.
>>>
>>> However, powerpc does the inverse thing, leading to a serie of nasty
>>
>> I think you mean that it does the inverse thing for atomic ops only.
>>
>>> bugs.
>>> Cleanup the atomic implementation by defining as base the base C type
>>> version and depending on them, appropriately.
>>
>> I think that it is mostly the other arches that are backwards here,
>> except for plain ints. MI code cannot use atomic ops for anything
>> except plain ints, since no other type is guaranteed to be supported
>> by the MD code. For example, sparc64 doesn't support any 8-bit or
>> 16-bit types, and i386 doesn't support any 64-bit types (until recently;
>> it now uses cmpxchg8b on some CPUs and a hack on other CPUs to support
>> 64, bit types), and my version of i386 doesn't support any 8-bit or
>> 16-bit types or long since these types are unusable in MI code and
>> unused in MD code (long is misused in some MI code but I don't use
>> such broken code).
>>
>> At least one type must be supported, and to keep things sane, int must
>> be supported. The type could be int32_t instead, but then you would have
>> a hard-coded assumption that int is int32_t instead of a hard-coded
>> assumption that int can be an atomic type. The latter is a better
>> assumption for MI code. But MD code can make the former assumption
>> except of course on arches where it is false, but there are none of
>> those yet. This gives the following structure in atomic.h:
>>
>> o basic definitions in terms of uint8/16/32/64_t. But don't define the
>> 8/16/64 bit ones unless they are actually used in MD code, which is
>> rare. MI code cannot use these.
>> o assume that u_int is uint32_t and #define all the atomic_foo_int()
>> interfaces as atomic_foo_32. Similarly for pointers, except use
>> atomic_foo_64 in some cases.
>> o do something for long. MI code cannot use atomic ops on longs, but does.
>> Correctly-sized longs are usually fundamentally non-atomic since they
>> are twice as long as the register width and the register width is usually
>> the same as the bus width and atomicness for widths wider than the bus is
>> unnatural and/or inefficient. But no supported arch has correctly-sized
>> longs.
>
> So the thing is that we define the uintXX_t type from int,long,etc and
> then we should really expect to follow the same pattern for atomic
> instructions.
>
>>> Modified: projects/largeSMP/sys/powerpc/include/atomic.h
>>>
>>> ==============================================================================
>>> --- projects/largeSMP/sys/powerpc/include/atomic.h Sat May 7
>>> 23:34:14 2011 (r221613)
>>> +++ projects/largeSMP/sys/powerpc/include/atomic.h Sun May 8
>>> 00:39:49 2011 (r221614)
>>> @@ -48,13 +48,7 @@
>>> * { *p += v; }
>>> */
>>>
>>> -#define __ATOMIC_ADD_8(p, v, t) \
>>> - 8-bit atomic_add not implemented
>>> -
>>> -#define __ATOMIC_ADD_16(p, v, t) \
>>> - 16-bit atomic_add not implemented
>>> -
>>
>> Already correctly left out except for bogusly #defining it and #undefining
>> it. Now not bogusly #defined either.
>
> I don't see the point to have something like that, also because is
> never used in the kernel.
>
>>> -#define __ATOMIC_ADD_32(p, v, t) \
>>> +#define __atomic_add_int(p, v, t) \
>>> __asm __volatile( \
>>> "1: lwarx %0, 0, %2\n" \
>>> " add %0, %3, %0\n" \
>>> @@ -63,10 +57,10 @@
>>> : "=&r" (t), "=m" (*p) \
>>> : "r" (p), "r" (v), "m" (*p) \
>>> : "cc", "memory") \
>>> - /* __ATOMIC_ADD_32 */
>>> + /* __atomic_add_int */
>>
>> No type problems at this level.
>
> Yes, but I also want uniformity in the implementation, thus try to
> give a common pattern.
>
>>> #ifdef __powerpc64__
>>> -#define __ATOMIC_ADD_64(p, v, t) \
>>> +#define __atomic_add_long(p, v, t) \
>>> __asm __volatile( \
>>> "1: ldarx %0, 0, %2\n" \
>>> " add %0, %3, %0\n" \
>>> @@ -75,69 +69,78 @@
>>> : "=&r" (t), "=m" (*p) \
>>> : "r" (p), "r" (v), "m" (*p) \
>>> : "cc", "memory") \
>>> - /* __ATOMIC_ADD_64 */
>>> + /* __atomic_add_long */
>>> #else
>>> -#define __ATOMIC_ADD_64(p, v, t) \
>>> - 64-bit atomic_add not implemented
>>
>> Now correctly left out when it is not supported.
>
> See below, for non powerpc64 case it must not be used.
>
>>> +#define __atomic_add_long(p, v, t) \
>>> + long atomic_add not implemented
>>> #endif
>>
>> atomic_add_long() should never be supported, but it is (mis)used in
>> standard kernel files so not having it will break more than just LINT.
>> But the above is not atomic_add_long()...hacks are used below to make
>> that sort of work.
>
> In which sense "should never be supported"?
> If you are on a 64 bit arch and you want and atomic operation for long
> you are supposed to use 64-bit version directly? I don't get the
> point.
>
>>> +#define atomic_add_acq_64 atomic_add_acq_long
>>> +#define atomic_add_rel_64 atomic_add_rel_long
>>> +
>>> +#define atomic_add_ptr(p, v) \
>>> + atomic_add_long((volatile u_long *)(p), (u_long)(v))
>>> +#define atomic_add_acq_ptr(p, v) \
>>> + atomic_add_acq_long((volatile u_long *)(p), (u_long)(v))
>>> +#define atomic_add_rel_ptr(p, v) \
>>> + atomic_add_rel_long((volatile u_long *)(p), (u_long)(v))
>>
>> This uses bogus casts to break warnings. amd64 is still missing this bug.
>> Someone added this bug to i386 after reviewers objected to it. Even i386
>> only has this bug for accesses to pointers. For char/short/long, it uses
>> type-safe code with separate functions/#defines like the ones for int.
>
> I'll probabilly may have get from the i386 version, I'll verify, thanks.
>
>> The casts for pointers break jhb's intentions that:
>> - atomic accesses to pointers should be rare
>> - when they are needed, the caller must cast
>>
>> I don't know why we don't have separate functions/#defines for pointers,
>> but like not having the bloat for it.
>
> Nice history, I didn't know that explicitly, but I'll check that later
> along with the cast thing.
>
>>> +#else
>>> +#define atomic_add_long(p, v) \
>>> + atomic_add_int((volatile u_int *)(p), (u_int)(v))
>>> +#define atomic_add_acq_long(p, v) \
>>> + atomic_add_acq_int((volatile u_int *)(p), (u_int)(v))
>>> +#define atomic_add_rel_long(p, v) \
>>> + atomic_add_rel_int((volatile u_int *)(p), (u_int)(v))
>>> +
>>> +#define atomic_add_ptr(p, v) \
>>> + atomic_add_int((volatile u_int *)(p), (u_int)(v))
>>> +#define atomic_add_acq_ptr(p, v) \
>>> + atomic_add_acq_int((volatile u_int *)(p), (u_int)(v))
>>> +#define atomic_add_rel_ptr(p, v) \
>>> + atomic_add_rel_int((volatile u_int *)(p), (u_int)(v))
>>> +#endif
>>
>> Now you need the bogus casts for the longs, since although atomic
>> accesses to longs should be rarer than ones to pointers (never on MI code),
>> it is not intended that the caller must bogusly cast for them, especially
>> since the bogus casts are MD.
>>
>> The bogus casts for the pointers should have no effect except to break
>> warnings in new code, since at least old MI code must already have the
>> casts in callers else it wouldn't compile on amd64.
>>
>>> [... often I snip bits without noting this as here]
>>> -#if 0
>>> -_ATOMIC_CLEAR(8, 8, uint8_t)
>>> -_ATOMIC_CLEAR(8, char, u_char)
>>> -_ATOMIC_CLEAR(16, 16, uint16_t)
>>> -_ATOMIC_CLEAR(16, short, u_short)
>>> -#endif
>>> -_ATOMIC_CLEAR(32, 32, uint32_t)
>>> -_ATOMIC_CLEAR(32, int, u_int)
>>> -#ifdef __powerpc64__
>>> -_ATOMIC_CLEAR(64, 64, uint64_t)
>>> -_ATOMIC_CLEAR(64, long, u_long)
>>> -_ATOMIC_CLEAR(64, ptr, uintptr_t)
>>> -#else
>>> -_ATOMIC_CLEAR(32, long, u_long)
>>> -_ATOMIC_CLEAR(32, ptr, uintptr_t)
>>> -#endif
>>
>> Seems a better way, and still used on amd64 and i386, to generate
>> separate type-safe code for each type supported. Here the only error
>> relative to amd64 and i386 seems to have been to generate from long
>> to a 32/64 suffix and #define from that back to a long suffix, instead
>> of to generate from long to a long suffix and #define from that to a
>> 32/64 suffix.
>
> I'm a bit confused on what you are stating here exactly... are in
> favor of this change or not?
>
>>> ...
>>> @@ -622,39 +638,59 @@ atomic_cmpset_rel_long(volatile u_long *
>>> return (atomic_cmpset_long(p, cmpval, newval));
>>> }
>>>
>>> -#define atomic_cmpset_acq_int atomic_cmpset_acq_32
>>> -#define atomic_cmpset_rel_int atomic_cmpset_rel_32
>>> -
>>> -#ifdef __powerpc64__
>>> -#define atomic_cmpset_acq_ptr(dst, old, new) \
>>> - atomic_cmpset_acq_long((volatile u_long *)(dst), (u_long)(old),
>>> (u_long)(new))
>>> -#define atomic_cmpset_rel_ptr(dst, old, new) \
>>> - atomic_cmpset_rel_long((volatile u_long *)(dst), (u_long)(old),
>>> (u_long)(new))
>>> -#else
>>> -#define atomic_cmpset_acq_ptr(dst, old, new) \
>>> - atomic_cmpset_acq_32((volatile u_int *)(dst), (u_int)(old),
>>> (u_int)(new))
>>> -#define atomic_cmpset_rel_ptr(dst, old, new) \
>>> - atomic_cmpset_rel_32((volatile u_int *)(dst), (u_int)(old),
>>> (u_int)(new))
>>> +#define atomic_cmpset_64 atomic_cmpset_long
>>> +#define atomic_cmpset_acq_64 atomic_cmpset_acq_long
>>> +#define atomic_cmpset_rel_64 atomic_cmpset_rel_long
>>> +
>>> +#define atomic_cmpset_ptr(dst, old, new)
>>> \
>>
>> No bogus casts for these now.
>>
>>> + atomic_cmpset_long((volatile u_long *)(dst), (u_long)(old), \
>>> + (u_long)(new))
>>> +#define atomic_cmpset_acq_ptr(dst, old, new)
>>> \
>>> + atomic_cmpset_acq_long((volatile u_long *)(dst), (u_long)(old), \
>>> + (u_long)(new))
>>> +#define atomic_cmpset_rel_ptr(dst, old, new)
>>> \
>>> + atomic_cmpset_rel_long((volatile u_long *)(dst), (u_long)(old), \
>>> + (u_long)(new))
>>
>> These bogus casts seem more bogus than before -- they seem to do nothing
>> except break type safety, since you already have functions with the right
>> suffixes and types (except the suffix says `long' but only u_longs are
>> supported).
>>
>>> +#else
>>> +#define atomic_cmpset_long(dst, old, new)
>>> \
>>> + atomic_cmpset_int((volatile u_int *)(dst), (u_int)(old), \
>>> + (u_int)(new))
>>> +#define atomic_cmpset_acq_long(dst, old, new)
>>> \
>>> + atomic_cmpset_acq_int((volatile u_int *)(dst), (u_int)(old), \
>>> + (u_int)(new))
>>> +#define atomic_cmpset_rel_long(dst, old, new)
>>> \
>>> + atomic_cmpset_rel_int((volatile u_int *)(dst), (u_int)(old), \
>>> + (u_int)(new))
>>
>> Usual bogus casts to type-pun from long to int in the 32-bit case. Even
>> here, the ones on the non-pointer args have no useful effect.
>>
>>> +
>>> +#define atomic_cmpset_ptr(dst, old, new)
>>> \
>>> + atomic_cmpset_int((volatile u_int *)(dst), (u_int)(old), \
>>> + (u_int)(new))
>>> +#define atomic_cmpset_acq_ptr(dst, old, new)
>>> \
>>> + atomic_cmpset_acq_int((volatile u_int *)(dst), (u_int)(old), \
>>> + (u_int)(new))
>>> +#define atomic_cmpset_rel_ptr(dst, old, new)
>>> \
>>> + atomic_cmpset_rel_int((volatile u_int *)(dst), (u_int)(old), \
>>> + (u_int)(new))
>>> #endif
>>
>> Usual bogus casts for the pointer case.
>>
>>> #ifdef __powerpc64__
>>> -static __inline uint64_t
>>> -atomic_fetchadd_64(volatile uint64_t *p, uint64_t v)
>>> +static __inline u_long
>>> +atomic_fetchadd_long(volatile u_long *p, u_long v)
>>> {
>>> - uint64_t value;
>>> + u_long value;
>>>
>>> do {
>>> value = *p;
>>> @@ -662,10 +698,10 @@ atomic_fetchadd_64(volatile uint64_t *p,
>>> return (value);
>>> }
>>>
>>> -#define atomic_fetchadd_long atomic_fetchadd_64
>>> +#define atomic_fetchadd_64 atomic_fetchadd_long
>>> #else
>>> -#define atomic_fetchadd_long(p, v) \
>>> - (u_long)atomic_fetchadd_32((volatile u_int *)(p), (u_int)(v))
>>> +#define atomic_fetchadd_long(p, v)
>>> \
>>> + (u_long)atomic_fetchadd_int((volatile u_int *)(p), (u_int)(v))
>>
>> i386 uses a function to type-pun from fetchadd_long to fetchadd_int().
>> This has minor technical advantages. Why be different?
>
> I'm not entirely sure what you mean with 'function to type-pun' but
> will look at that as well, thanks.
>
I think I got what you mean with "type-pun" version of long and I'll
fix as expected, thanks.
On a related note, in previous e-mail you just noticed the lenghty of
atomic in powerpc.
I just wanted to let you note that atomic.h needs to actually cater 2
different type of architectures (keeping the support of both powerpc
and powerpc64).
sparc64, amd64 and i386 don't have these problem.
Attilio
--
Peace can only be achieved by understanding - A. Einstein
More information about the svn-src-projects
mailing list