Packet steering/SMP
Robert Watson
rwatson at FreeBSD.org
Tue Aug 3 14:22:09 UTC 2010
On Mon, 2 Aug 2010, Brett Glass wrote:
> http://www.computerworld.com/s/article/9180022/Latest_Linux_kernel_uses_Google_made_protocols
>
> describes SMP optimizations to the Linux kernel (the article mistakenly
> calls them "protocols," but they're not) which steer the processing of
> incoming network packets to the CPU core that is running the process for
> which they're destined. (Doing this requires code which straddles network
> layers in interesting ways.) The article claims that these optimizations are
> Google's invention, though they simply seem like a common sense way to make
> the best use of CPU cache.
>
> The article claims dramatic performance improvements due to this
> optimization. Anything like this in the works for FreeBSD?
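To make concrete what the article is describing, the core of such steering is a mapping from a flow to the CPU its consumer last ran on: the socket layer records where the application reads, and the receive path consults that record when choosing where to process the packet. Here is a rough, purely illustrative C sketch -- the names and table size are invented, and it is not the Linux implementation:

#include <stdint.h>

#define FLOW_TABLE_SIZE 4096            /* invented size, for illustration */

static int flow_cpu[FLOW_TABLE_SIZE];   /* flow hash -> CPU of last consumer */

/* Socket layer: note which CPU the application read the socket from. */
void
flow_note_consumer(uint32_t flow_hash, int curcpu)
{
        flow_cpu[flow_hash % FLOW_TABLE_SIZE] = curcpu;
}

/* Receive path: pick the CPU whose cache likely holds the flow's state. */
int
flow_steer_cpu(uint32_t flow_hash)
{
        return (flow_cpu[flow_hash % FLOW_TABLE_SIZE]);
}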
Quite a few systems do things like this, although perhaps not the exact
formula that Google has. For example, Solarflare's TCP onload engine
programs their hardware to direct 5-tuples to specific queues for use by
specific processes. Likewise, Chelsio's recently committed TCAM programming
code allows work to be similarly directed to specific queues (and generally
CPUs), although not in a way tightly integrated with the network stack.
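Stripped of vendor specifics, that kind of hardware distribution has a common shape: hash the 5-tuple, then index a small indirection table that names the receive queue (and hence, usually, the CPU). Real cards use a Toeplitz hash programmed through driver-specific interfaces; the mixing function and table below are invented purely to show the shape of the idea:

#include <stdint.h>

static int indirection_table[128];      /* queue index per hash bucket */

static uint32_t
five_tuple_hash(uint32_t saddr, uint32_t daddr, uint16_t sport,
    uint16_t dport, uint8_t proto)
{
        /* Toy mixing function standing in for the Toeplitz hash. */
        uint32_t h = saddr ^ daddr ^ ((uint32_t)sport << 16 | dport) ^ proto;

        h ^= h >> 16;
        h *= 0x45d9f3b;
        h ^= h >> 16;
        return (h);
}

/* Map a flow's 5-tuple to one of the card's receive queues. */
int
select_rx_queue(uint32_t saddr, uint32_t daddr, uint16_t sport,
    uint16_t dport, uint8_t proto)
{
        uint32_t h = five_tuple_hash(saddr, daddr, sport, dport, proto);

        return (indirection_table[h & 127]);
}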
I'm currently doing some work for Juniper to add affinity features up and down
the stack. Right now my prototype does this with RSS but doesn't attempt to
expose specific flow affinity to userspace, or allow userspace to direct
affinity. I have some early hacks at socket options to do that, although my
goal was to perform flow direction in hardware (i.e., have the network stack
program the TCAM on the T3 cards) rather than do the redirection in software.
However, some recent experiments I ran that did work distribution to the
per-CPU netisr workers I added in FreeBSD 8 were surprisingly effective -- not
as good as distribution in hardware, but still significantly more throughput
on an 8-core system (in this case I used RSS hashes generated by the
hardware).
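The software-distribution experiment boils down to something like the sketch below: take the RSS hash the card already computed for the packet and use it to pick a per-CPU worker, so all packets of a flow land on the same CPU without any per-flow hardware state. The structures and function names here are invented for illustration; this is not the netisr code itself:

#include <stdint.h>
#include <stdio.h>

#define NWORKERS 8      /* one worker per core on an 8-core machine */

struct pkt {
        uint32_t rss_hash;      /* hash computed by the NIC per packet */
};

/* Stand-in for handing the packet to a per-CPU worker thread's queue. */
static void
worker_enqueue(int cpu, struct pkt *p)
{
        printf("packet with hash 0x%08x -> worker %d\n",
            (unsigned)p->rss_hash, cpu);
}

/* Software distribution: the same flow hash always lands on the same worker. */
static void
distribute_pkt(struct pkt *p)
{
        worker_enqueue(p->rss_hash % NWORKERS, p);
}

int
main(void)
{
        struct pkt p = { .rss_hash = 0xdeadbeef };

        distribute_pkt(&p);
        return (0);
}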
Adding some sort of software redirection affinity table wouldn't be all that
difficult, but I'll continue to focus on hardware distribution for the time
being -- several cards out there will support the model pretty nicely. The
only real limitations there are (a) which cards support it -- not Intel NICs,
I'm afraid -- and (b) the sizes of the hardware flow direction tables --
usually in the thousands to tens of thousands range.
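Such a software redirection table would sit in front of the hash distribution: an exact-match lookup on the 5-tuple wins, and the RSS hash remains the fallback, much as a hardware flow direction table (e.g. a TCAM) overrides the default RSS spreading. A toy version -- linear scan, invented sizes, no locking; a real one would be hashed and synchronized -- might look like:

#include <stdint.h>

#define AFFINITY_TABLE_SIZE 1024        /* arbitrary, for illustration */
#define NCPUS 8

struct affinity_entry {
        uint32_t        saddr, daddr;
        uint16_t        sport, dport;
        uint8_t         proto;
        int             valid;
        int             cpu;            /* CPU this flow is pinned to */
};

static struct affinity_entry affinity_table[AFFINITY_TABLE_SIZE];

/* Exact-match 5-tuple lookup; fall back to hash distribution if absent. */
int
affinity_lookup(uint32_t saddr, uint32_t daddr, uint16_t sport,
    uint16_t dport, uint8_t proto, uint32_t rss_hash)
{
        int i;

        for (i = 0; i < AFFINITY_TABLE_SIZE; i++) {
                struct affinity_entry *e = &affinity_table[i];

                if (e->valid && e->saddr == saddr && e->daddr == daddr &&
                    e->sport == sport && e->dport == dport &&
                    e->proto == proto)
                        return (e->cpu);
        }
        return (rss_hash % NCPUS);      /* default: hash distribution */
}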
Robert