FreeBSD I/OAT (QuickData now?) driver
Robert Watson
rwatson at FreeBSD.org
Sat Jun 11 15:49:18 UTC 2011
On Mon, 6 Jun 2011, grarpamp wrote:
> I know we've got polling. And probably MSI-X in a couple drivers. Pretty
> sure there is still one CPU doing the interrupt work? And none of the
> multiple queue thread spreading tech exists?
Actually, with most recent 10 Gbps cards, and even 1 Gbps cards, we process
inbound data with as many CPUs as the hardware has MSI-X-enabled input and
output queues. So "a couple" understates things significantly.
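As a rough illustration of how a multi-queue NIC spreads inbound traffic, here is a toy model in Python. It assumes the card hashes each flow's 4-tuple to pick a receive queue; CRC32 merely stands in for the hardware hash, and `NUM_QUEUES`, `queue_for_flow` and `distribute` are hypothetical names, not part of any driver.

```python
import zlib
from collections import defaultdict

NUM_QUEUES = 4  # assumption: one MSI-X receive queue per CPU


def queue_for_flow(src_ip, src_port, dst_ip, dst_port, num_queues=NUM_QUEUES):
    """Pick a receive queue from the flow 4-tuple.

    CRC32 is only a stand-in for whatever hash the hardware uses.
    """
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


def distribute(packets, num_queues=NUM_QUEUES):
    """Group packets (given as 4-tuples) by the queue they would land on."""
    queues = defaultdict(list)
    for pkt in packets:
        queues[queue_for_flow(*pkt, num_queues=num_queues)].append(pkt)
    return queues
```

Because the queue is a pure function of the flow, every packet of a given flow lands on the same queue, so the CPU servicing that queue sees the flow in order without coordinating with the others.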
> * Through PF_RING, expose the RX queues to the userland so that
> the application can spawn one thread per queue hence avoid using
> semaphores at all.
I'm probably a bit out of date, but last I checked, PF_RING still implied
copying, albeit into shared memory buffers. We support shared memory between
the kernel and userspace for BPF, and have done for quite a while. However,
right now a single shared memory buffer serves all receive queues on a
NIC. We have a Google Summer of Code student actively working on this right
now; my hope is that by the end of the summer we'll have a pretty functional
system that allows different shared memory buffers to be used for different
input queues. In particular, applications will be able to query the set of
queues available, detect CPU affinity for them, and bind particular shared
memory rings to particular queues. It's worth observing that for many types
of high-performance analysis, BPF's packet filtering and truncation support is
quite helpful, and if you're going to use multiple hardware threads per input
queue anyway, you actually get a nice split this way (as long as those threads
share L2 caches).
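The per-queue design described above can be sketched as a toy model. This is not the BPF interface: `QueueRing`, `bind_rings`, the queue-to-CPU map and the filter predicate are all hypothetical, and a plain Python deque stands in for a shared memory buffer. It only shows the shape of the idea: one ring per input queue, with filtering and truncation applied before a packet is stored.

```python
from collections import deque


class QueueRing:
    """Toy stand-in for one shared memory ring bound to one receive queue."""

    def __init__(self, queue_id, cpu, snaplen, accept, slots=1024):
        self.queue_id = queue_id
        self.cpu = cpu          # CPU this queue is assumed to be serviced on
        self.snaplen = snaplen  # keep at most this many bytes per packet
        self.accept = accept    # filter predicate: bytes -> bool
        self.slots = slots
        self.ring = deque()
        self.dropped = 0

    def deliver(self, packet):
        """Producer side: filter, truncate, then store the packet."""
        if not self.accept(packet):
            return False
        if len(self.ring) >= self.slots:
            self.dropped += 1   # ring full: count the loss, keep older data
            return False
        self.ring.append(packet[:self.snaplen])
        return True

    def drain(self):
        """Consumer side: take everything currently stored in the ring."""
        out = list(self.ring)
        self.ring.clear()
        return out


def bind_rings(queue_to_cpu, snaplen, accept):
    """Create one ring per receive queue, keyed by queue id."""
    return {q: QueueRing(q, cpu, snaplen, accept)
            for q, cpu in queue_to_cpu.items()}
```

An application in this model would start one consumer per ring, ideally on a CPU that shares a cache with the one recorded in `cpu`, and call `drain()` only on its own ring, so consumers never contend with each other.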
Luigi's work on mapping receive rings straight into userspace looks quite
interesting, but I'm pretty behind currently, so I haven't had a chance to read
his NetMap paper. The direct mapping of rings approach is what a number of
high-performance FreeBSD shops have been doing for a while, but none had
generalised it sufficiently to merge into our base stack. I hope to see this
happen in the next year.
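To make the "direct mapping of rings" idea concrete, here is a minimal sketch under stated assumptions. It is not the NetMap API or any driver's ring layout: `SlotRing`, its slot size and its head/tail counters are hypothetical, and a `bytearray` stands in for ring memory mapped into the application. The point is that the consumer reads each packet in place, through a view of the ring's own memory, instead of receiving a copy.

```python
class SlotRing:
    """Toy single-producer, single-consumer ring of fixed-size slots."""

    SLOT_SIZE = 2048  # assumption: one packet per fixed-size slot

    def __init__(self, num_slots):
        self.num_slots = num_slots
        # Stands in for the memory region shared with the application.
        self.mem = bytearray(num_slots * self.SLOT_SIZE)
        self.lengths = [0] * num_slots
        self.head = 0  # count of slots filled by the producer
        self.tail = 0  # count of slots released by the consumer

    def produce(self, payload):
        """Producer side: fill the next free slot; False if the ring is full."""
        if len(payload) > self.SLOT_SIZE:
            raise ValueError("payload larger than a slot")
        if self.head - self.tail == self.num_slots:
            return False
        idx = self.head % self.num_slots
        off = idx * self.SLOT_SIZE
        self.mem[off:off + len(payload)] = payload
        self.lengths[idx] = len(payload)
        self.head += 1
        return True

    def peek(self):
        """Consumer side: view the oldest packet in place, or None if empty."""
        if self.tail == self.head:
            return None
        idx = self.tail % self.num_slots
        off = idx * self.SLOT_SIZE
        return memoryview(self.mem)[off:off + self.lengths[idx]]

    def release(self):
        """Consumer side: hand the oldest slot back to the producer."""
        if self.tail != self.head:
            self.tail += 1
```

The consumer calls `peek()`, processes the bytes through the returned view, then calls `release()`; until then the producer cannot reuse that slot, which is what lets the data stay where it was written.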
Robert
More information about the freebsd-net mailing list