Packet steering/SMP

Robert Watson rwatson at FreeBSD.org
Tue Aug 3 14:22:09 UTC 2010


On Mon, 2 Aug 2010, Brett Glass wrote:

> http://www.computerworld.com/s/article/9180022/Latest_Linux_kernel_uses_Google_made_protocols
>
> describes SMP optimizations to the Linux kernel (the article mistakenly 
> calls them "protocols," but they're not) which steer the processing of 
> incoming network packets to the CPU core that is running the process for 
> which they're destined. (Doing this requires code which straddles network 
> layers in interesting ways.) The article claims that these optimizations are 
> Google's invention, though they simply seem like a common sense way to make 
> the best use of CPU cache.
>
> The article claims dramatic performance improvements due to this 
> optimization. Anything like this in the works for FreeBSD?

Quite a few systems do things like this, although perhaps not the exact 
formula that Google has.  For example, Solarflare's TCP onload engine 
programs its hardware to direct 5-tuples to specific queues for use by 
specific processes.  Likewise, Chelsio's recently committed TCAM programming 
code allows work to be similarly directed to specific queues (and generally 
CPUs), although not in a way tightly integrated with the network stack.
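
To make that concrete, here is a minimal sketch in C of what "direct 
5-tuples to specific queues" means -- hash the flow's 5-tuple and use the 
result to pick a receive queue, so every packet of a flow lands on the same 
queue and hence the same CPU.  This is purely illustrative; real hardware 
uses a Toeplitz hash or exact-match TCAM entries rather than a toy hash 
like this one:

#include <stdint.h>

struct flow_5tuple {
        uint32_t src_ip;
        uint32_t dst_ip;
        uint16_t src_port;
        uint16_t dst_port;
        uint8_t  proto;
};

/* Map a 5-tuple onto one of 'nqueues' receive queues (nqueues > 0). */
static unsigned int
flow_to_queue(const struct flow_5tuple *f, unsigned int nqueues)
{
        uint32_t h;

        /* Cheap mixing hash; illustrative only. */
        h = f->src_ip ^ (f->dst_ip << 7) ^ (f->dst_ip >> 25);
        h ^= ((uint32_t)f->src_port << 16) | f->dst_port;
        h ^= f->proto;
        h ^= h >> 16;

        return (h % nqueues);
}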

I'm currently doing some work for Juniper to add affinity features up and down 
the stack.  Right now my prototype does this with RSS but doesn't attempt to 
expose specific flow affinity to userspace, or allow userspace to direct 
affinity.  I have some early hacks at socket options to do that, although my 
goal was to perform flow direction in hardware (i.e., have the network stack 
program the TCAM on the T3 cards) rather than do the redirection in software. 
However, some recent experiments I ran that did work distribution to the 
per-CPU netisr workers I added in FreeBSD 8 were surprisingly effective -- not 
as good as distribution in hardware, but they still delivered significantly 
more throughput on an 8-core system (in this case I used RSS hashes generated 
by the hardware).
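
For reference, the software distribution step amounts to folding the 
hardware-supplied RSS hash into a CPU index and enqueueing the packet to 
that CPU's netisr worker.  A minimal sketch, using an indirection table of 
the kind RSS hardware itself uses (the names here are illustrative, not the 
actual netisr or driver interfaces):

#include <stdint.h>

#define RSS_TABLE_SIZE  128             /* power of two, typical for NICs */

static uint8_t rss_indirection[RSS_TABLE_SIZE];

/* Fill the indirection table round-robin across the available CPUs. */
static void
rss_table_init(unsigned int ncpus)
{
        unsigned int i;

        for (i = 0; i < RSS_TABLE_SIZE; i++)
                rss_indirection[i] = i % ncpus;
}

/* Pick the CPU (and hence per-CPU netisr worker) for a packet's RSS hash. */
static unsigned int
rss_hash_to_cpu(uint32_t rss_hash)
{
        return (rss_indirection[rss_hash & (RSS_TABLE_SIZE - 1)]);
}

The indirection table is what lets the hash-to-CPU mapping be rebalanced 
later without rehashing flows.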

Adding some sort of software redirection affinity table wouldn't be all that 
difficult, but I'll continue to focus on hardware distribution for the time 
being -- several cards out there will support the model pretty nicely.  The 
only real limitations there are (a) which cards support it -- not Intel NICs, 
I'm afraid -- and (b) the sizes of the hardware flow direction tables -- 
usually in the thousands to tens of thousands range.
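
For what it's worth, such a software redirection table could be as simple as 
a hash-indexed array that overrides the default RSS placement for particular 
flows, e.g. to pin a flow to the CPU where its consuming process runs.  A 
rough sketch -- the size, names, and missing locking/eviction policy are all 
hypothetical, and hardware tables of this kind are what run to the thousands 
or tens of thousands of entries mentioned above:

#include <stdint.h>

#define AFFINITY_TABLE_SIZE     4096    /* illustrative size */
#define AFFINITY_NONE           (-1)

struct affinity_entry {
        uint32_t        flow_hash;      /* full hash, to detect collisions */
        int             cpu;            /* preferred CPU for this flow */
        int             valid;
};

static struct affinity_entry affinity_table[AFFINITY_TABLE_SIZE];

/* Record that packets whose hash is 'flow_hash' should go to 'cpu'. */
static void
affinity_set(uint32_t flow_hash, int cpu)
{
        struct affinity_entry *e;

        e = &affinity_table[flow_hash & (AFFINITY_TABLE_SIZE - 1)];
        e->flow_hash = flow_hash;
        e->cpu = cpu;
        e->valid = 1;
}

/* Look up an override; AFFINITY_NONE means fall back to plain RSS placement. */
static int
affinity_lookup(uint32_t flow_hash)
{
        struct affinity_entry *e;

        e = &affinity_table[flow_hash & (AFFINITY_TABLE_SIZE - 1)];
        if (e->valid && e->flow_hash == flow_hash)
                return (e->cpu);
        return (AFFINITY_NONE);
}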

Robert

