Much improved sosend_*() functions
Andrew Gallatin
gallatin at cs.duke.edu
Fri Sep 29 15:45:34 PDT 2006
Andre Oppermann writes:
> Andrew Gallatin wrote:
> > Andre,
> >
> > I meant to ask: Did you try 16KB jumbos? Did they perform
> > any better than page-sized jumbos?
>
> No, I didn't try 16K jumbos. The problem with anything larger than
> page size is that it may look contiguous in kernel memory but isn't
> in physical memory. Thus you need the same number of descriptors
> for the network card as with page sized (4K) clusters.
But it would allow you to do one copyin, rather than 4. I
don't know how much this would help, but it might be worth
looking at.
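
To put rough numbers on that trade-off, here is a small sketch. The
helper names are made up for illustration, and it assumes 4K pages and
page-aligned clusters that are not physically contiguous, as Andre
describes. It only counts copies and descriptors; it says nothing about
how much time a single larger copyin would actually save.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096u	/* assumption: 4K pages, as in the thread */

/*
 * Hypothetical helper: number of clusters, and therefore copyin calls,
 * needed to move len bytes of user data into clusters of cluster_size.
 */
static size_t
copyin_calls(size_t len, size_t cluster_size)
{
	assert(cluster_size > 0);
	return ((len + cluster_size - 1) / cluster_size);
}

/*
 * Hypothetical helper: DMA descriptors needed when every physical page
 * needs its own descriptor, regardless of the cluster size used.
 */
static size_t
dma_descriptors(size_t len)
{
	return ((len + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES);
}
```

For a 16KB send, copyin_calls() gives 1 with 16K clusters and 4 with
4K clusters, while dma_descriptors() gives 4 either way.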
> > Also, if we're going to change how mbufs work, let's add something
> > like Linux's skb_frag_t frags[MAX_SKB_FRAGS]; In FreeBSD parlance,
> > this embeds something like an array of sf_bufs pointers in mbuf. The
> > big difference to a chain of M_EXT mbufs is that you need to allocate
> > only one mbuf wrapper, rather than one for each item in the list.
> > Also, the reference is kept in the page (or sf_buf) itself, and the
> > data offset is kept in the skbbuf (or mbuf).
>
> We are not going to change how mbufs work.
>
> > This allows us to do cool things like allocate a single page, and use
> > both halves of it for 2 separate 1500 byte frames. This allows us to
> > achieve *amazing* results in combination with LRO, because it allows
> > us to do, on average, many fewer allocations per byte. Especially in
> > combination with Linux's "high order" page allocations. Using order-2
> > allocations and LRO, I've actually seen 10GbE line rate receives on a
> > wimpy 2.0GHz Athlon64.
>
> I have just started tackling the receive path. Let's see what comes out
> of it first before we jump to conclusions.
It could be that mbufs are cheaper to get than skbs and pages on Linux,
but I doubt it. FWIW, Linux has an skb chaining mechanism
(frag_list). My first LRO experiment was based on allocating "normal"
skbs and chaining them. That maxed out at around 5.2Gb/s (on the same
hardware I see line rate on).
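
To make the frags idea quoted above a bit more concrete, here is a
minimal userland sketch. All of the names (page_ref, frag, pkt,
pkt_add_frag, MAX_FRAGS) are invented for illustration; this is neither
the Linux skb layout nor a proposed mbuf change. It only shows the two
properties I care about: one wrapper holds several fragments, and the
reference count lives with the page while the offset lives in the
wrapper.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAGS 4	/* hypothetical limit, stands in for MAX_SKB_FRAGS */

/* A page that carries its own reference count. */
struct page_ref {
	int		refcount;
	unsigned char	data[4096];
};

/* One fragment: which page, where in it, and how many bytes. */
struct frag {
	struct page_ref	*page;
	size_t		 offset;
	size_t		 len;
};

/* A single wrapper holding up to MAX_FRAGS fragments. */
struct pkt {
	size_t		nfrags;
	struct frag	frags[MAX_FRAGS];
};

/* Attach a slice of a page to a packet; returns 0 on success, -1 on error. */
static int
pkt_add_frag(struct pkt *p, struct page_ref *pg, size_t off, size_t len)
{
	assert(p != NULL && pg != NULL);
	if (p->nfrags == MAX_FRAGS || off + len > sizeof(pg->data))
		return (-1);
	pg->refcount++;		/* the reference is kept in the page */
	p->frags[p->nfrags].page = pg;
	p->frags[p->nfrags].offset = off;
	p->frags[p->nfrags].len = len;
	p->nfrags++;
	return (0);
}

/* Total payload length across all fragments. */
static size_t
pkt_len(const struct pkt *p)
{
	size_t i, total = 0;

	for (i = 0; i < p->nfrags; i++)
		total += p->frags[i].len;
	return (total);
}
```

With this layout, two 1500 byte frames can share one page by attaching
offsets 0 and 2048 to two different packets, which leaves the page with
a reference count of 2 after a single page allocation.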
Drew