Flow ID, LACP, and igb
Barney Cordoba
barney_cordoba at yahoo.com
Mon Sep 2 12:47:25 UTC 2013
Are you using a PCIe 3 bus? Of course this is only an issue for 10G; what percentage of
FreeBSD users have a load over 9.5Gb/s? It's completely unnecessary for the igb
or em driver, so why is it used? Because it's there.
Here's my argument against it. The handful of brains capable of doing driver development
become consumed with BS like LRO, and the things that need to be fixed, like
buffer management and basic driver design flaws, never get fixed. The offload
code makes the driver code a virtual mess that can only be maintained by Jack and
one other guy in the entire world. And it takes 10 times longer to make a simple change or
to add support for a new NIC.
In a week I ripped out the offload crap and the 9000 sysctls, eliminated the
"consumer buffer" problem, and reduced locking by 40%; the igb driver now
uses 20% less CPU with a full gigabit load.
And the code is cleaner and more easily maintained.
BC
________________________________
From: Adrian Chadd <adrian at freebsd.org>
To: Barney Cordoba <barney_cordoba at yahoo.com>
Cc: Andre Oppermann <andre at freebsd.org>; Alan Somers <asomers at freebsd.org>; "net at freebsd.org" <net at freebsd.org>; Jack F Vogel <jfv at freebsd.org>; Justin T. Gibbs <gibbs at freebsd.org>; Luigi Rizzo <rizzo at iet.unipi.it>; T.C. Gubatayao <tgubatayao at barracuda.com>
Sent: Sunday, September 1, 2013 4:51 PM
Subject: Re: Flow ID, LACP, and igb
Yo,
LRO is an interesting hack that seems to do a good job of hiding the
ridiculous locking and cache-unfriendly behaviour that we incur per-packet.
It helps with LAN test traffic where things are going out in batches from
the TCP layer so the RX layer "sees" these frames in-order and can do LRO.
When you disable it, I don't easily get 10GE LAN TCP performance. That has
to be fixed. Given how fast the CPU cores, bus interconnect and memory
interconnects are, I don't think there should be any reason why we can't
hit 10GE traffic on a LAN with LRO disabled (in both software and hardware.)
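
The coalescing behaviour described above can be illustrated with a minimal sketch. This is not the FreeBSD tcp_lro code; the names `Segment` and `lro_coalesce` are hypothetical, sequence-number wraparound is ignored, and the rule "one pending aggregate per flow, flushed by any segment that does not start where the aggregate ends" is a simplifying assumption. It only shows why in-order batches collapse into one large frame while a reordered batch does not.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

FlowKey = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


@dataclass
class Segment:
    flow: FlowKey
    seq: int        # sequence number of the first payload byte
    payload: bytes


def lro_coalesce(segments: List[Segment], max_bytes: int = 65535) -> List[Segment]:
    """Merge back-to-back in-order segments of the same flow (hypothetical model)."""
    pending: Dict[FlowKey, Segment] = {}   # one open aggregate per flow
    delivered: List[Segment] = []          # what the stack would actually see
    for seg in segments:
        agg = pending.get(seg.flow)
        if agg is not None:
            in_order = seg.seq == agg.seq + len(agg.payload)
            fits = len(agg.payload) + len(seg.payload) <= max_bytes
            if in_order and fits:
                agg.payload += seg.payload   # extend the aggregate, no delivery yet
                continue
            # Out of order (or too large): flush the aggregate as it stands.
            delivered.append(pending.pop(seg.flow))
        # Start a new aggregate from a copy of this segment.
        pending[seg.flow] = Segment(seg.flow, seg.seq, seg.payload)
    delivered.extend(pending.values())       # flush whatever is left at the end
    return delivered


flow = ("10.0.0.1", 5000, "10.0.0.2", 80)
in_order = [Segment(flow, s, b"x" * 1000) for s in (0, 1000, 2000)]
reordered = [Segment(flow, s, b"x" * 1000) for s in (0, 2000, 1000)]
print(len(lro_coalesce(in_order)))    # 1 frame handed to the stack
print(len(lro_coalesce(reordered)))   # 3 frames, nothing was merged
```

In this model the per-packet cost is paid once per delivered frame, so the in-order batch pays it once and the reordered batch pays it three times.
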
Now that I have the PMC Sandy Bridge stuff working right (but no PEBS, I
have to talk to Intel about that in a bit more detail before I think about
hacking that in) we can get actual live information about this stuff. But
the last time I looked, there's just too much per-packet latency going on.
The root cause looks like it's a toss-up between scheduling, locking and
just lots of code running to completion per-frame. As I said, that all has
to die somehow.
2c,
-adrian
On 1 September 2013 08:45, Barney Cordoba <barney_cordoba at yahoo.com> wrote:
>
>
> Comcast sends packets OOO. With any decent number of internet hops you're
> likely to encounter a load
> balancer or packet shaper that sends packets OOO, so you just can't be
> worried about it. In fact, your
> designs MUST work with OOO packets.
>
> Getting balance on your load balanced lines is certainly a bigger upside
> than the additional CPU used.
> You can buy a faster processor for your "stack" for a lot less than you
> can buy bandwidth.
>
> Frankly my opinion of LRO is that it's a science project suitable for labs
> only. It's a trick to get more bandwidth
> than your bus capacity; the answer is to not run PCIe 2 if you need PCIe 3.
> You can use it internally if you have
> control of all of the machines. When I modify a driver the first thing
> that I do is rip it out.
>
> BC
>
>
> ________________________________
> From: Luigi Rizzo <rizzo at iet.unipi.it>
> To: Barney Cordoba <barney_cordoba at yahoo.com>
> Cc: Andre Oppermann <andre at freebsd.org>; Alan Somers <asomers at freebsd.org>;
> "net at freebsd.org" <net at freebsd.org>; Jack F Vogel <jfv at freebsd.org>;
> Justin T. Gibbs <gibbs at freebsd.org>; T.C. Gubatayao <tgubatayao at barracuda.com>
> Sent: Saturday, August 31, 2013 10:27 PM
> Subject: Re: Flow ID, LACP, and igb
>
>
> On Sun, Sep 1, 2013 at 4:15 AM, Barney Cordoba <barney_cordoba at yahoo.com> wrote:
>
> > ...
> >
>
> [your point on testing with realistic assumptions is surely a valid one]
>
>
> >
> > Of course there's nothing really wrong with OOO packets. We had this
> > discussion before; lots of people
> > have round robin dual homing without any ill effects. It's just not an
> > issue.
> >
>
> It depends on where you are.
> It may not be an issue if the reordering is not large enough to
> trigger retransmissions, but even then it is annoying as it causes
> more work in the endpoint -- it prevents LRO from working, and even
> on the host stack it takes more work to sort where an out of order
> segment goes than appending an in-order one to the socket buffer.
>
> cheers
> luigi
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
>
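
The trade-off argued in the quoted messages (round-robin balance across links versus keeping each flow in order) can be sketched as follows. This is only an illustration: `link_by_flow_hash` and `make_round_robin` are hypothetical names, and `zlib.crc32` is a stand-in hash, not the one used by lagg(4) or supplied by the igb hardware.

```python
import zlib
from itertools import count
from typing import Callable, Tuple

FlowKey = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


def link_by_flow_hash(flow: FlowKey, n_links: int) -> int:
    """Pick a link from a hash of the flow tuple (assumed stand-in hash)."""
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % n_links


def make_round_robin(n_links: int) -> Callable[[FlowKey], int]:
    """Return a picker that ignores the flow and cycles through the links."""
    counter = count()

    def pick(_flow: FlowKey) -> int:
        return next(counter) % n_links

    return pick


flow = ("10.0.0.1", 5000, "10.0.0.2", 80)
rr = make_round_robin(2)

# Four packets of the same flow under each policy.
hashed_links = [link_by_flow_hash(flow, 2) for _ in range(4)]
rr_links = [rr(flow) for _ in range(4)]

print(len(set(hashed_links)))  # 1: the flow stays on a single link
print(rr_links)                # [0, 1, 0, 1]: the flow is spread over both links
```

With hashing, a single heavy flow can never use more than one link but its packets cannot overtake each other between links; with round-robin the load is even, and ordering depends on the links having similar latency.
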