[patch][lagg] - Set a better granularity and distribution on roundrobin protocol.
Adrian Chadd
adrian at freebsd.org
Mon Jun 23 04:16:21 UTC 2014
...
It's an interesting idea, but doing round robin like that may
introduce out-of-order packets.

What's the actual problem you're seeing? Are the transmit queues
filling up? Is the distribution with flowid/curcpu not good enough?

Scott saw this happen at Netflix. He added a lagg twiddle to set which
set of bits to care about in the flowid when picking an interface.
The ixgbe hashing was being done on the low x bits, where x is
related to how many CPUs you have (2 CPUs? 1 bit. 8 CPUs? 3 bits.
etc.). lagg was doing the same thing on the same low-order set of bits.
He modified lagg so you could pick a new starting point a few bits
up in the flowid to pick a lagg interface with. That fixed the
distribution issue and also kept packets in order.
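
A minimal sketch of why moving the starting bit helps, assuming the NIC
picks a queue from the low bits of the flowid and lagg picks a port with a
modulo on the same value. The function names, the `shift` parameter and the
8-queue / 2-port setup are illustrative only; this is not the actual lagg
or ixgbe code.

```python
from collections import Counter

def nic_queue(flowid, queue_bits):
    # Assumption: the NIC maps a flow to a queue using the low bits of the flowid.
    return flowid & ((1 << queue_bits) - 1)

def lagg_port(flowid, num_ports, shift=0):
    # Hypothetical port selection: drop `shift` low bits, then take the modulo.
    return (flowid >> shift) % num_ports

def ports_seen_by_queue(queue, queue_bits, num_ports, shift, flows=range(4096)):
    # Count which lagg ports the flows landing on one NIC queue would use.
    return Counter(
        lagg_port(f, num_ports, shift)
        for f in flows
        if nic_queue(f, queue_bits) == queue
    )

# 8 queues (3 bits) and 2 lagg ports, looking only at flows on queue 0.
same_bits = ports_seen_by_queue(queue=0, queue_bits=3, num_ports=2, shift=0)
shifted = ports_seen_by_queue(queue=0, queue_bits=3, num_ports=2, shift=3)
print("shift=0:", dict(same_bits))  # every flow on this queue uses one port
print("shift=3:", dict(shifted))    # flows on this queue split across both ports
```

In this toy model the port still depends only on the flowid, so all packets
of one flow stay on one port and ordering is preserved.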
2c,
-a
On 22 June 2014 19:27, Marcelo Araujo <araujobsdport at gmail.com> wrote:
> Hello guys,
>
> I made some changes to the roundrobin protocol so that you can now use
> sysctl(8) to set a better packet distribution among the interfaces that are
> part of the lagg(4) group.
>
> My motivation for this change was interfaces that use TSO, for example
> ixgbe(4): the performance is terrible because we can't fill the TSO buffer
> at once, so the throughput drops significantly and we see many more SACKs
> between hosts.
>
> So, with this patch we can set the number of packets that will be sent
> before switching to the next interface.
>
> In my testbed using ixgbe(4), I got very good performance, as you can see
> below:
>
> 1) Without patch:
> ------------------------------------------------------------
> Client connecting to 192.168.1.2, TCP port 5001
> TCP window size: 32.5 KByte (default)
> ------------------------------------------------------------
> [ 3] local 192.168.1.1 port 32808 connected with 192.168.1.2 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0- 1.0 sec 406 MBytes 3.40 Gbits/sec
> [ 3] 1.0- 2.0 sec 391 MBytes 3.28 Gbits/sec
> [ 3] 2.0- 3.0 sec 406 MBytes 3.41 Gbits/sec
> [ 3] 3.0- 4.0 sec 585 MBytes 4.91 Gbits/sec
> [ 3] 4.0- 5.0 sec 477 MBytes 4.00 Gbits/sec
> [ 3] 5.0- 6.0 sec 429 MBytes 3.60 Gbits/sec
> [ 3] 6.0- 7.0 sec 520 MBytes 4.36 Gbits/sec
> [ 3] 7.0- 8.0 sec 385 MBytes 3.23 Gbits/sec
> [ 3] 8.0- 9.0 sec 414 MBytes 3.48 Gbits/sec
> [ 3] 9.0-10.0 sec 515 MBytes 4.32 Gbits/sec
> [ 3] 0.0-10.0 sec 4.42 GBytes 3.80 Gbits/sec
>
> 2) With patch:
> ------------------------------------------------------------
> Client connecting to 192.168.1.2, TCP port 5001
> TCP window size: 32.5 KByte (default)
> ------------------------------------------------------------
> [ 3] local 192.168.1.1 port 10526 connected with 192.168.1.2 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0- 1.0 sec 694 MBytes 5.83 Gbits/sec
> [ 3] 1.0- 2.0 sec 999 MBytes 8.38 Gbits/sec
> [ 3] 2.0- 3.0 sec 1.17 GBytes 10.1 Gbits/sec
> [ 3] 3.0- 4.0 sec 1.34 GBytes 11.5 Gbits/sec
> [ 3] 4.0- 5.0 sec 1.15 GBytes 9.91 Gbits/sec
> [ 3] 5.0- 6.0 sec 1.19 GBytes 10.2 Gbits/sec
> [ 3] 6.0- 7.0 sec 1.08 GBytes 9.23 Gbits/sec
> [ 3] 7.0- 8.0 sec 1.10 GBytes 9.45 Gbits/sec
> [ 3] 8.0- 9.0 sec 1.27 GBytes 10.9 Gbits/sec
> [ 3] 9.0-10.0 sec 1.39 GBytes 12.0 Gbits/sec
> [ 3] 0.0-10.0 sec 11.3 GBytes 9.74 Gbits/sec
>
> So, basically we have a sysctl(8) called "net.link.lagg.rr_packets" where
> we can set the number of packets that will be sent before the roundrobin
> moves to the next interface.
>
> Any comments and reviews are much appreciated.
>
> Best Regards,
>
> --
> Marcelo Araujo
> araujo at FreeBSD.org
> http://www.FreeBSD.org
> Power To Server.
>
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
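
For reference, the behaviour described for "net.link.lagg.rr_packets" in the
quoted message can be modelled roughly as below. This is an illustrative
sketch only, not the patch itself; the class and method names are
hypothetical, and ix0/ix1 are just example interface names.

```python
class BurstRoundRobin:
    """Toy model: send `rr_packets` packets on a port before moving on."""

    def __init__(self, ports, rr_packets=1):
        if rr_packets < 1:
            raise ValueError("rr_packets must be at least 1")
        self.ports = list(ports)
        self.rr_packets = rr_packets
        self.sent = 0  # packets handed out so far

    def next_port(self):
        # Integer-divide the packet count by the burst size, then wrap around.
        index = (self.sent // self.rr_packets) % len(self.ports)
        self.sent += 1
        return self.ports[index]

rr = BurstRoundRobin(["ix0", "ix1"], rr_packets=3)
schedule = [rr.next_port() for _ in range(8)]
print(schedule)
```

With rr_packets=1 the model degenerates to plain per-packet round robin. In
either case a single flow is still spread over every port, which is the
reordering concern raised in the reply.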