Kernelspace C11 atomics for MIPS
Warner Losh
imp at bsdimp.com
Mon Jun 3 17:53:19 UTC 2013
On Jun 3, 2013, at 8:04 AM, Ed Schouten wrote:
> Hi,
>
> As of r251230, it should be possible to use C11 atomics in
> kernelspace, by including <sys/stdatomic.h>! Even when not using Clang
> (but GCC 4.2), it is possible to use quite a large portion of the API.
> A couple of limitations:
>
> - The memory order argument is simply ignored, making all the calls do
> a full memory barrier.
> - At least Clang allows you to do arithmetic on C11 atomics directly
> (e.g. "a += 5" == "atomic_fetch_add(&a, 5)"), which is of course not
> possible to mimic.
> - The atomic functions only work on 1,2,4,8-byte types, which is
> probably a good thing.
>
> Amazingly, it turns out that it works on most of the architectures, with the
> exception of ARM and MIPS. To make MIPS work, we need to implement
> some of the __sync_* functions that are described here:
>
> http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
>
> Some time ago I already added some of these functions to our
> libcompiler-rt in userspace, to make atomics work there.
> Unfortunately, these functions were quite horribly implemented, as I
> tried to build them on top of <machine/atomic.h>, which is far from
> trivial/efficient. It is also restricted to 4 and 8-byte types. That's
> why I thought: why not spend some time learning MIPS assembly and
> write some decent implementations for these functions?
>
> The result:
>
> http://80386.nl/pub/mips-stdatomic.txt
The number of syncs required varies by processor type. There are also newer synchronization instructions that make this as efficient as possible on all mips32r2- and mips64r2-based machines. At least the older Caviums, and maybe the newer ones, also have their own variants. What you have will mostly work for the processors we have to support, but given the above, mips_sync could be better. Doing a sync both before AND after the operation seems like overkill as well. Since sync is a fairly performance-killing instruction, how would you feel about allowing optimizations?
This is my biggest single concern about the patch, but it is also my current biggest concern about the MIPS atomic operators in general.
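To illustrate the kind of optimization being asked about, here is a minimal C11 sketch in which the caller states how much ordering it actually needs. It assumes a compiler that honors the memory order argument, which the GCC 4.2 fallback described in the quoted message does not. The names event_count, data_ready, payload, count_event, publish and try_consume are hypothetical and do not come from the patch. Which barrier instructions a given MIPS CPU would use for each order is not shown here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example state; none of these names come from the patch. */
static atomic_uint_fast32_t event_count;
static atomic_bool data_ready;
static uint32_t payload;

/*
 * Statistics counter: the increment must be atomic, but it does not
 * need to order any other memory access, so relaxed is sufficient.
 * Returns the value the counter held before the increment.
 */
static inline uint_fast32_t
count_event(void)
{
	return (atomic_fetch_add_explicit(&event_count, 1,
	    memory_order_relaxed));
}

/*
 * Producer: the release store makes the earlier plain write to
 * payload visible to any thread that later observes data_ready
 * with an acquire load.
 */
static inline void
publish(uint32_t value)
{
	payload = value;
	atomic_store_explicit(&data_ready, true, memory_order_release);
}

/*
 * Consumer: the acquire load pairs with the release store above.
 * Returns false if nothing has been published yet.
 */
static inline bool
try_consume(uint32_t *out)
{
	if (!atomic_load_explicit(&data_ready, memory_order_acquire))
		return (false);
	*out = payload;
	return (true);
}
```

With an implementation that treats every order as sequentially consistent, all three functions would pay for full barriers. An implementation that honors the argument is free to emit fewer or cheaper barriers for the relaxed, release and acquire cases.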
> For now, please focus on sys/mips/mips/stdatomic.c. It implements all
> the __sync_* functions called by <stdatomic.h> for 1, 2, 4 and 8 byte
> types. There is some testing code in there as well, which can be
> ignored. This code disassembles to the following:
>
> http://80386.nl/pub/mips-stdatomic-disasm.txt
>
> As I don't own a MIPS system myself, I was thinking about tinkering a
> bit with qemu to see whether these functions work properly. My
> questions are:
>
> - Does anyone have any comments on the C code and/or the machine code
> generated? Are there some nifty tricks I can apply to make the machine
> code more efficient that I am unaware of?
> - Is there anyone interested in testing this code a bit more
> thoroughly on physical hardware?
> - Would anyone mind if I committed this to HEAD?
I have some Cavium gear I can easily test on, and some other stuff I can less easily test on.
It wouldn't be horrible to commit to HEAD, but it would affect performance in many places.
Don't commit the kern/bla.c standard change to conf/files, it looks to be bogus :)
Warner