using SSE2 in kernel C code (improving AES-NI module)

John-Mark Gurney jmg at funkthat.com
Fri Oct 19 23:38:34 UTC 2012


So, the AES-NI module already uses SSE2 instructions, but it does so
only in assembly.  I have improved the performance of the AES-NI
module's implementation, but doing so requires additional SSE2
instructions.

In order to keep my sanity, I did part of the new code in C using
gcc native types and xmmintrin.h, but we do not support this header in
the kernel.  This means we cannot simply add the new code to the
kernel.

Any good ideas on how to integrate this code into the kernel build?

I have used the trick of producing assembly from the C file with gcc -S,
and then compiling the assembly into the kernel, but I'm not sure if
that's the best way, and even if it is, I don't know how I'd do the
generation as part of the kernel build.  Or would it be ok to commit
both, and require a regeneration each time the C file is updated?

In my testing in userland w/o the opencrypto framework overhead, the old
code would only get about 250MB/sec.  With the new code I get about
2200MB/sec.

Sample code:
static inline __m128i
xts_crank_lfsr(__m128i inp)
{
	const __m128i alphamask = _mm_set_epi32(1, 1, 1, AES_XTS_ALPHA);
	__m128i xtweak, ret;

	/* set up xor mask */
	xtweak = _mm_shuffle_epi32(inp, 0x93);
	xtweak = _mm_srai_epi32(xtweak, 31);
	xtweak &= alphamask;

	/* next term */
	ret = _mm_slli_epi32(inp, 1);
	ret ^= xtweak;

	return ret;
}
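
For comparison, here is a minimal scalar sketch of what that routine
computes: multiplying the 128-bit XTS tweak by alpha, i.e. a one-bit left
shift with the carry out of the top bit folded back into the low byte.
It assumes AES_XTS_ALPHA is 0x87 (the usual XTS reduction constant) and
that the tweak is held as a little-endian 128-bit value.  The names
xts_crank_lfsr_ref and xts_crank_lfsr_ref_selftest are made up for
illustration and are not part of the module.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed value: reduction for x^128 + x^7 + x^2 + x + 1. */
#define AES_XTS_ALPHA	0x87

/*
 * Scalar reference (hypothetical helper): shift the 16-byte tweak left
 * by one bit, byte 0 being least significant, and xor AES_XTS_ALPHA
 * into byte 0 if a bit was shifted out of byte 15.
 */
static void
xts_crank_lfsr_ref(uint8_t tweak[16])
{
	unsigned int carry, next, i;

	carry = 0;
	for (i = 0; i < 16; i++) {
		next = tweak[i] >> 7;		/* bit leaving this byte */
		tweak[i] = (uint8_t)((tweak[i] << 1) | carry);
		carry = next;
	}
	if (carry)
		tweak[0] ^= AES_XTS_ALPHA;
}

/* Checks a plain shift, a carry across a 32-bit lane, and the wrap. */
static void
xts_crank_lfsr_ref_selftest(void)
{
	uint8_t t[16];

	memset(t, 0, sizeof(t));
	t[0] = 0x01;
	xts_crank_lfsr_ref(t);
	assert(t[0] == 0x02);

	memset(t, 0, sizeof(t));
	t[3] = 0x80;				/* top bit of lane 0 */
	xts_crank_lfsr_ref(t);
	assert(t[3] == 0x00 && t[4] == 0x01);

	memset(t, 0, sizeof(t));
	t[15] = 0x80;				/* top bit of the tweak */
	xts_crank_lfsr_ref(t);
	assert(t[15] == 0x00 && t[0] == AES_XTS_ALPHA);
}
```

The three cases in the self-check line up with the three parts of the
SSE2 version: the per-lane shift, the carry that the shuffle and
arithmetic shift move into the next lane up, and the alpha term that
lands in lane 0.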

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."


More information about the freebsd-arch mailing list