10gbps scalability (was: Re: FreeBSD problems and preliminary ways to solve)
Robert Watson
rwatson at FreeBSD.org
Sat Aug 20 11:38:27 UTC 2011

On Sat, 20 Aug 2011, Lev Serebryakov wrote:
>> Can you honestly say the same about handling line rate packet forwarding
>> for multiple 10G cards?
>
> I agree with you. I didn't say that 10G routing is very important for many
> users. My comment about 10G was an answer to the statement that "The niche
> for routers & traffic analysis is still ours." I wanted to say that this may
> be so now, but not for long.

Part of the key here will be reworking things like ipfw(4) and pf(4) to scale
better than they do currently. For pf(4), it's particularly important that we
align hardware work distribution via RSS with state management for TCP
connections. I've been working on this for the base system TCP implementation
over the last few years, and got most of it into 9.x (but not the actual RSS
driver interface, as I wasn't convinced it was a stable KPI in the form I
prototyped it in). Post-9.0, I'll try to get the RSS KPI cleaned up so that
we can merge it and get our device drivers updated.
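
As a rough illustration of the hardware-side hashing that RSS performs, here
is a minimal sketch of a Toeplitz hash over a TCP/IPv4 4-tuple. It is written
in Python purely for readability and is not taken from the FreeBSD sources.
The key, the function names, and the 128-bucket table size are all
assumptions made for the example.

```python
import socket
import struct

# Arbitrary 40-byte key, for illustration only. A real NIC is programmed
# with a key chosen by the driver or the network stack.
EXAMPLE_KEY = bytes(range(1, 41))


def toeplitz_hash(key, data):
    """Return the 32-bit Toeplitz hash of `data` under `key`."""
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least 4 bytes longer than the input")
    key_bits = int.from_bytes(key, "big")
    key_len = len(key) * 8
    result = 0
    bit_index = 0
    for byte in data:
        for shift in range(7, -1, -1):
            if (byte >> shift) & 1:
                # XOR in the 32-bit key window starting at this bit offset.
                window = (key_bits >> (key_len - 32 - bit_index)) & 0xFFFFFFFF
                result ^= window
            bit_index += 1
    return result


def tcp4_hash_input(src_ip, src_port, dst_ip, dst_port):
    """Pack a TCP/IPv4 4-tuple as addresses first, then ports, big-endian."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


def rss_bucket(rss_hash, buckets=128):
    """Pick an RSS bucket from the low bits of the hash (power-of-two size)."""
    return rss_hash & (buckets - 1)
```

If the software side computes (or is handed) the same hash for a connection,
`rss_bucket(toeplitz_hash(EXAMPLE_KEY, tcp4_hash_input(...)))` names the same
bucket the hardware used, which is the alignment described above.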

There's also a related work-in-progress I have that teaches the network stack
how to program NIC filters, usually implemented as TCAMs (Chelsio) or hardware
hash tables (Solarflare), with the network stack's connection affinity. My plan
is to work on making this substantially more real once the RSS patches are in.
(Those are, themselves, fairly minor: we have connection groups already in
9.0, and the RSS changes simply cause existing software-side hash tables to
align with hardware-side hashing. The tricky bit is a sustainable KPI for
device driver writers.)
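
To make the idea of aligning software-side hash tables with hardware-side
hashing concrete, here is a small Python sketch. It is a toy model and not the
9.0 connection group code: the class names, the one-group-per-CPU layout, and
the indirection table contents are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConnGroup:
    """Per-CPU set of connections (hypothetical stand-in for a group)."""
    cpu: int
    connections: dict = field(default_factory=dict)  # 4-tuple -> state


class AlignedTable:
    """Software connection table partitioned the same way the NIC
    partitions received packets."""

    def __init__(self, indirection):
        n = len(indirection)
        if n == 0 or n & (n - 1):
            raise ValueError("indirection table size must be a power of two")
        # indirection[i] is the RX queue, and in this model the CPU,
        # that the hardware uses for RSS bucket i.
        self.indirection = list(indirection)
        self.mask = n - 1
        self.groups = {cpu: ConnGroup(cpu) for cpu in sorted(set(indirection))}

    def group_for(self, rss_hash):
        # Same low-bits-then-indirection step the hardware performs.
        return self.groups[self.indirection[rss_hash & self.mask]]

    def insert(self, rss_hash, four_tuple, state):
        self.group_for(rss_hash).connections[four_tuple] = state

    def lookup(self, rss_hash, four_tuple):
        # Only the receiving CPU's group is consulted, so the lookup
        # touches no state owned by another CPU.
        return self.group_for(rss_hash).connections.get(four_tuple)
```

With `AlignedTable([0, 1, 2, 3] * 32)`, a packet whose hardware hash selects
bucket 120 arrives on queue 0, and the lookup for its connection goes straight
to the group owned by CPU 0.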

These are closely related to the issue of userspace networking, which Luigi is
starting to explore with netmap. Ideally, you could use the same NIC for both
the kernel network stack and userspace applications, using hardware filters
to decide whether individual packets go to a descriptor ring in the kernel or
in userspace. Solarflare's OpenOnload is an interesting potential model there,
although perhaps not the exact model we want (they rely on shared network
stacks between kernel and userspace, and for most of our purposes, less
sharing is not only sufficient, but perhaps better).
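
The filter-based split between kernel and userspace rings can be sketched as
follows. This is a toy Python model of the decision only; it does not reflect
the netmap or OpenOnload interfaces, and every name in it (`Flow`, `Ring`,
`SteeringNic`) is hypothetical.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Ring(NamedTuple):
    owner: str  # "kernel" or "user"
    index: int


class SteeringNic:
    """Exact-match filters take priority; everything else falls back to
    hash-based distribution across the kernel's rings."""

    def __init__(self, kernel_rings):
        self.kernel_rings = [Ring("kernel", i) for i in range(kernel_rings)]
        self.filters = {}  # Flow -> Ring

    def add_filter(self, flow, ring):
        self.filters[flow] = ring

    def remove_filter(self, flow):
        self.filters.pop(flow, None)

    def classify(self, flow, rss_hash):
        ring = self.filters.get(flow)
        if ring is not None:
            return ring  # e.g. a ring mapped into a userspace application
        return self.kernel_rings[rss_hash % len(self.kernel_rings)]
```

A userspace application would install a filter for its own flow and receive
those packets on its ring, while all other traffic on the same NIC continues
to reach the kernel stack.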

Robert
More information about the freebsd-arch
mailing list