terrible if_vmx / vmxnet3 rx performance with lro (post iflib)
Patrick Kelsey
pkelsey at freebsd.org
Wed Feb 26 05:07:53 UTC 2020
On Mon, Feb 24, 2020 at 11:40 PM Patrick Kelsey <pkelsey at freebsd.org> wrote:
>
>
> On Thu, Feb 20, 2020 at 4:58 PM Josh Paetzel <jpaetzel at freebsd.org> wrote:
>
>>
>>
>> On Wed, Feb 19, 2020, at 7:17 AM, Andriy Gapon wrote:
>> > On 18/02/2020 16:09, Andriy Gapon wrote:
>> > > My general experience with post-iflib vmxnet3 is that vmxnet3 has some
>> > > peculiarities that result in a certain "impedance mismatch" with iflib.
>> > > Although we now have a bit less code and it is a bit more regular,
>> > > there are a few significant (for us, at least) problems:
>> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243126
>> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240608
>> >
>> > By the way, we (Panzura) use these changes to fix or work around the
>> > above two problems: https://people.freebsd.org/~avg/iflib-vmx.pz.diff
>> >
>> > Questions / comments are welcome.
>> > Especially from people who worked on iflib.
>> >
>> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243392
>> > > - the problem described above
>> > > - a couple of issues that we already fixed or worked around
>> > >
>> > > We are contemplating locally reverting to the pre-iflib vmxnet3 and
>> > > we are wondering if the conversion was really worth it in general.
>> >
>> >
>> > --
>> > Andriy Gapon
>> >
>>
>> I'd like to follow this up just to make it 100% clear. The problem is a
>> ~4x regression in RX performance. It affects stock FreeBSD, including
>> 12.1-RELEASE.
>>
>> In my 40Gbps connected lab single thread iperf receive went from 9Gbps to
>> 2.5Gbps.
>>
>> If this can't be fixed or looked at, I'd strongly suggest reverting the
>> iflib conversion in stock FreeBSD.
>>
>>
> Consider these datapoints I collected this evening:
>
> Hypervisor: ESXi 6.7.0 Build 8169922
> Hardware: Xeon E5-1650 v3 @ 3.50GHz (6 physical cores, HT disabled)
>
> iperf3 client: a VM on the same vswitch as the VM under test, running
> Ubuntu 18.04.3 LTS with 2 vCPUs, 4GB RAM, and a VMXNET3 interface used only
> for traffic to the VM under test (this VMXNET3 has checksum offload,
> TSO/GSO, and LRO/GRO enabled)
> iperf3 server: running on the VM under test, either a 12.0-RELEASE VM
> (this is before the vmx iflib conversion), or a 12.1-RELEASE VM (this is
> after the vmx iflib conversion) with r356703 applied (the recent TSO bug
> fix). Both VMs have 3 vCPUs, but the vmx interface only uses 1 tx and 1 rx
> queue, as hw.pci.honor_msi_blacklist is at its default of 0, so MSI is used.
>
>
> Test 1: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO
> enabled, LRO disabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.0 VM IP> -p 1234
> Connecting to host <12.0 VM IP>, port 1234
> [ 4] local <Ubuntu VM IP> port 44664 connected to <12.0 VM IP> port 1234
> [ ID] Interval Transfer Bandwidth Retr Cwnd
> [ 4] 0.00-1.00 sec 1.11 GBytes 9.52 Gbits/sec 1144 529 KBytes
> [ 4] 1.00-2.00 sec 1.09 GBytes 9.40 Gbits/sec 1272 369 KBytes
> [ 4] 2.00-3.00 sec 1.11 GBytes 9.51 Gbits/sec 1249 344 KBytes
> [ 4] 3.00-4.00 sec 1.06 GBytes 9.12 Gbits/sec 1973 369 KBytes
> [ 4] 4.00-5.00 sec 1.11 GBytes 9.50 Gbits/sec 1860 370 KBytes
> [ 4] 5.00-6.00 sec 1.08 GBytes 9.28 Gbits/sec 1342 396 KBytes
> [ 4] 6.00-7.00 sec 1.09 GBytes 9.38 Gbits/sec 1278 563 KBytes
> [ 4] 7.00-8.00 sec 1.05 GBytes 8.99 Gbits/sec 1226 372 KBytes
> [ 4] 8.00-9.00 sec 1.03 GBytes 8.87 Gbits/sec 1145 400 KBytes
> [ 4] 9.00-10.00 sec 1.08 GBytes 9.28 Gbits/sec 1317 354 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 10.8 GBytes 9.28 Gbits/sec 13806
> sender
> [ 4] 0.00-10.00 sec 10.8 GBytes 9.28 Gbits/sec
> receiver
>
>
> Test 2: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO
> enabled, LRO enabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=60079b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.0 VM IP> -p 1234
> Connecting to host <12.0 VM IP>, port 1234
> [ 4] local <Ubuntu VM IP> port 44714 connected to <12.0 VM IP> port 1234
> [ ID] Interval Transfer Bandwidth Retr Cwnd
> [ 4] 0.00-1.00 sec 3.48 GBytes 29.9 Gbits/sec 0 887 KBytes
> [ 4] 1.00-2.00 sec 1.93 GBytes 16.6 Gbits/sec 0 994 KBytes
> [ 4] 2.00-3.00 sec 2.03 GBytes 17.5 Gbits/sec 0 1.10 MBytes
> [ 4] 3.00-4.00 sec 1.99 GBytes 17.1 Gbits/sec 0 1.10 MBytes
> [ 4] 4.00-5.00 sec 2.00 GBytes 17.1 Gbits/sec 0 1.10 MBytes
> [ 4] 5.00-6.00 sec 1.93 GBytes 16.6 Gbits/sec 0 1.10 MBytes
> [ 4] 6.00-7.00 sec 2.04 GBytes 17.5 Gbits/sec 0 1.10 MBytes
> [ 4] 7.00-8.00 sec 2.01 GBytes 17.3 Gbits/sec 0 1.10 MBytes
> [ 4] 8.00-9.00 sec 1.97 GBytes 16.9 Gbits/sec 0 1.10 MBytes
> [ 4] 9.00-10.00 sec 1.98 GBytes 17.0 Gbits/sec 0 1.10 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 21.4 GBytes 18.3 Gbits/sec 0
> sender
> [ 4] 0.00-10.00 sec 21.4 GBytes 18.3 Gbits/sec
> receiver
>
>
> Test 3: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO
> enabled, LRO disabled (LRO disabled and test run after Test 2 above)
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.0 VM IP> -p 1234
> Connecting to host <12.0 VM IP>, port 1234
> [ 4] local <Ubuntu VM IP> port 44718 connected to <12.0 VM IP> port 1234
> [ ID] Interval Transfer Bandwidth Retr Cwnd
> [ 4] 0.00-1.00 sec 1.14 GBytes 9.76 Gbits/sec 1871 338 KBytes
> [ 4] 1.00-2.00 sec 483 MBytes 4.05 Gbits/sec 1307 1.41 KBytes
> [ 4] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 1 1.41 KBytes
> [ 4] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec 0 1.41 KBytes
> [ 4] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec 1 1.41 KBytes
> [ 4] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec 0 1.41 KBytes
> [ 4] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec 0 1.41 KBytes
> [ 4] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec 1 1.41 KBytes
> [ 4] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec 0 1.41 KBytes
> [ 4] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec 0 1.41 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 1.61 GBytes 1.38 Gbits/sec 3181
> sender
> [ 4] 0.00-10.00 sec 1.60 GBytes 1.38 Gbits/sec
> receiver
>
>
> Test 4: 12.0-RELEASE, single TCP stream transmit, standard mtu, TSO
> enabled, LRO enabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=60079b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -R -c <12.0 VM IP> -p 1234
> Connecting to host <12.0 VM IP>, port 1234
> Reverse mode, remote host <12.0 VM IP> is sending
> [ 4] local <Ubuntu VM IP> port 44726 connected to <12.0 VM IP> port 1234
> [ ID] Interval Transfer Bandwidth
> [ 4] 0.00-1.00 sec 4.28 GBytes 36.8 Gbits/sec
> [ 4] 1.00-2.00 sec 3.31 GBytes 28.4 Gbits/sec
> [ 4] 2.00-3.00 sec 3.85 GBytes 33.1 Gbits/sec
> [ 4] 3.00-4.00 sec 4.24 GBytes 36.5 Gbits/sec
> [ 4] 4.00-5.00 sec 3.16 GBytes 27.1 Gbits/sec
> [ 4] 5.00-6.00 sec 3.54 GBytes 30.4 Gbits/sec
> [ 4] 6.00-7.00 sec 4.03 GBytes 34.6 Gbits/sec
> [ 4] 7.00-8.00 sec 2.93 GBytes 25.1 Gbits/sec
> [ 4] 8.00-9.00 sec 3.42 GBytes 29.4 Gbits/sec
> [ 4] 9.00-10.00 sec 3.93 GBytes 33.8 Gbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 36.7 GBytes 31.5 Gbits/sec 280
> sender
> [ 4] 0.00-10.00 sec 36.7 GBytes 31.5 Gbits/sec
> receiver
>
>
> Test 5: 12.1-RELEASE with r356703 applied, single stream receive, standard
> mtu, TSO enabled, LRO disabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.1 VM IP> -p 1234
> Connecting to host <12.1 VM IP>, port 1234
> [ 4] local <Ubuntu VM IP> port 48392 connected to <12.1 VM IP> port 1234
> [ ID] Interval Transfer Bandwidth Retr Cwnd
> [ 4] 0.00-1.00 sec 828 MBytes 6.95 Gbits/sec 1247 335 KBytes
> [ 4] 1.00-2.00 sec 901 MBytes 7.56 Gbits/sec 1841 345 KBytes
> [ 4] 2.00-3.00 sec 909 MBytes 7.62 Gbits/sec 1805 356 KBytes
> [ 4] 3.00-4.00 sec 909 MBytes 7.62 Gbits/sec 2337 322 KBytes
> [ 4] 4.00-5.00 sec 907 MBytes 7.61 Gbits/sec 1834 354 KBytes
> [ 4] 5.00-6.00 sec 907 MBytes 7.61 Gbits/sec 1984 352 KBytes
> [ 4] 6.00-7.00 sec 909 MBytes 7.62 Gbits/sec 2189 329 KBytes
> [ 4] 7.00-8.00 sec 908 MBytes 7.62 Gbits/sec 2000 338 KBytes
> [ 4] 8.00-9.00 sec 907 MBytes 7.61 Gbits/sec 2006 315 KBytes
> [ 4] 9.00-10.00 sec 908 MBytes 7.61 Gbits/sec 1764 332 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 8.78 GBytes 7.54 Gbits/sec 19007
> sender
> [ 4] 0.00-10.00 sec 8.78 GBytes 7.54 Gbits/sec
> receiver
>
>
> Test 6: 12.1-RELEASE with r356703 applied, single stream receive, standard
> mtu, TSO enabled, LRO disabled, sysctl dev.vmx.0.iflib.tx_abdicate=1
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.1 VM IP> -p 1234
> Connecting to host <12.1 VM IP>, port 1234
> [ 4] local <Ubuntu VM IP> port 48416 connected to <12.1 VM IP> port 1234
> [ ID] Interval Transfer Bandwidth Retr Cwnd
> [ 4] 0.00-1.00 sec 1.29 GBytes 11.1 Gbits/sec 3016 290 KBytes
> [ 4] 1.00-2.00 sec 1.33 GBytes 11.4 Gbits/sec 4133 322 KBytes
> [ 4] 2.00-3.00 sec 1.34 GBytes 11.5 Gbits/sec 5409 335 KBytes
> [ 4] 3.00-4.00 sec 1.35 GBytes 11.6 Gbits/sec 3899 376 KBytes
> [ 4] 4.00-5.00 sec 1.35 GBytes 11.6 Gbits/sec 4609 300 KBytes
> [ 4] 5.00-6.00 sec 1.35 GBytes 11.6 Gbits/sec 4603 303 KBytes
> [ 4] 6.00-7.00 sec 1.36 GBytes 11.7 Gbits/sec 4417 293 KBytes
> [ 4] 7.00-8.00 sec 1.34 GBytes 11.5 Gbits/sec 5680 290 KBytes
> [ 4] 8.00-9.00 sec 1.33 GBytes 11.5 Gbits/sec 5461 359 KBytes
> [ 4] 9.00-10.00 sec 1.03 GBytes 8.86 Gbits/sec 5060 329 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 13.1 GBytes 11.2 Gbits/sec 46287
> sender
> [ 4] 0.00-10.00 sec 13.1 GBytes 11.2 Gbits/sec
> receiver
>
>
> Test 7: 12.1-RELEASE with r356703 applied, single stream receive, standard
> mtu, TSO enabled, LRO enabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=e407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.1 VM IP> -p 1234
> Connecting to host <12.1 VM IP>, port 1234
> [ 4] local <Ubuntu VM IP> port 48396 connected to <12.1 VM IP> port 1234
> [ ID] Interval Transfer Bandwidth Retr Cwnd
> [ 4] 0.00-1.00 sec 98.5 MBytes 826 Mbits/sec 129 2.83 KBytes
> [ 4] 1.00-2.00 sec 63.6 KBytes 521 Kbits/sec 25 2.83 KBytes
> [ 4] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 25 2.83 KBytes
> [ 4] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec 16 2.83 KBytes
> [ 4] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec 15 2.83 KBytes
> [ 4] 5.00-6.00 sec 63.6 KBytes 521 Kbits/sec 15 2.83 KBytes
> [ 4] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec 15 2.83 KBytes
> [ 4] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec 12 2.83 KBytes
> [ 4] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec 15 2.83 KBytes
> [ 4] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec 11 1.41 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 98.7 MBytes 82.8 Mbits/sec 278
> sender
> [ 4] 0.00-10.00 sec 97.8 MBytes 82.0 Mbits/sec
> receiver
>
>
> Test 8: 12.1-RELEASE with r356703 applied, single stream transmit,
> standard mtu, TSO enabled, LRO disabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -R -c <12.1 VM IP> -p 1234
> Connecting to host <12.1 VM IP>, port 1234
> Reverse mode, remote host <12.1 VM IP> is sending
> [ 4] local <Ubuntu VM IP> port 48400 connected to <12.1 VM IP> port 1234
> [ ID] Interval Transfer Bandwidth
> [ 4] 0.00-1.00 sec 4.25 GBytes 36.5 Gbits/sec
> [ 4] 1.00-2.00 sec 3.29 GBytes 28.3 Gbits/sec
> [ 4] 2.00-3.00 sec 3.61 GBytes 31.0 Gbits/sec
> [ 4] 3.00-4.00 sec 3.93 GBytes 33.8 Gbits/sec
> [ 4] 4.00-5.00 sec 4.17 GBytes 35.8 Gbits/sec
> [ 4] 5.00-6.00 sec 3.53 GBytes 30.3 Gbits/sec
> [ 4] 6.00-7.00 sec 3.22 GBytes 27.7 Gbits/sec
> [ 4] 7.00-8.00 sec 3.90 GBytes 33.5 Gbits/sec
> [ 4] 8.00-9.00 sec 2.80 GBytes 24.1 Gbits/sec
> [ 4] 9.00-10.00 sec 2.78 GBytes 23.9 Gbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 35.5 GBytes 30.5 Gbits/sec 571
> sender
> [ 4] 0.00-10.00 sec 35.5 GBytes 30.5 Gbits/sec
> receiver
>
>
> Based on the above, it looks like:
>
> (1) The non-LRO single-stream TCP receive performance of the iflib vmx
> driver in 12.1 release lags behind the non-LRO single-stream TCP receive
> performance of the pre-iflib vmx driver in 12.0 (by about 20%, 7.54 Gbps
> [Test 5] vs 9.28 Gbps [Test 1]), unless tx_abdicate is enabled, in which
> case the vmx driver performs better (by about 20%, 11.2 Gbps [Test 6] vs
> 9.28 Gbps [Test 1]).
>
> (2) The TSO-enabled single-stream TCP send performance of the iflib vmx
> driver in 12.1 release (with TSO bug patch applied) is at parity with the
> pre-iflib vmx driver in 12.0 (30.5 Gbps [Test 8] and 31.5 Gbps [Test 4]).
>
> (3) There are LRO-related bugs in both the pre-iflib vmx driver in 12.0
> (see Test 3) and the iflib vmx driver in 12.1 (see Test 7); they just
> surface differently.
>
> The categories of root causes for bugs and performance issues are: bugs in
> the vmx driver, bugs in iflib, and behavioral variations across the many
> fielded versions of the VMXNET3 virtual device. Indeed, all of these
> categories have been encountered in the past year. Also, there is a rich
> set of driver configuration and operating environment parameters, which
> makes advancing the overall robustness of the driver (instead of just
> shifting issues into or out of one's own operating parameter space) an
> arduous task.
>
> I think the right way to approach this is to continue to fill out the test
> matrix and root cause and resolve all of the issues encountered, rather
> than argue for reverting to the old driver out of frustration based on a
> narrow set of (so far, rather poorly characterized) circumstances. I'm in
> a position to do this, from the standpoint of substantial knowledge of the
> vmx driver and virtual device, as well as of iflib internals, and I will be
> doing this, as non-work cycles become available.
>
>
I spent a bit of time poking at this, and I believe I have root caused all
of the reported issues and developed patches (to both iflib and the vmx
driver) that solve them. My test system running 12.1 with these patches
applied (as well as the TSO patch) operates correctly with and without TSO
and/or LRO enabled, and with large MTU values. It exhibits throughput
performance parity or better compared to the pre-iflib driver for the
single-core / single-stream tests that I am currently using to assess
correctness.
The primary issue (that resulted in the reported free-list related
assertion failures, use-after-free panics, trouble related to jumbo frames,
and trouble with LRO) was that both the vmx driver and iflib needed to be
fixed in order to correctly handle the case where the vmx virtual device
skips descriptors. It's not known why the virtual device sometimes skips
descriptors, but this seems to occur frequently, at least under ESXi, when
packets span multiple descriptors.
A secondary issue was fixed (secondary in that it impacts performance but
not correctness) in which the vmx driver was only ever using cluster-sized
receive buffers regardless of the MTU, instead of switching to page-sized
buffers when the MTU is sufficiently large.
There remains an open question as to whether the vmx virtual device
consumes a buffer descriptor or not when the completion descriptor
indicates zero length. So far I haven't been able to cause zero-length
completions to occur.
There also remains a conceptual flaw in iflib concerning the refill of
receive descriptor rings that can be worked around, to a point, with a
sysctl, but that at some point needs to be fixed properly. iflib limits the number of
received packets it will process during a receive interrupt according to a
budget value, and then it also limits the number of receive descriptors it
will refill according to that same budget value (with a magic constant
added to it). Generally, packets can span multiple descriptors, and
limiting the refill to essentially the number of packets processed
completely fails to address this multiplicity, resulting in terrible
performance degradation when multi-segment packets are in heavy use (e.g.,
with LRO or large MTUs).
It will take a bit more time to write up all the associated details, post
the patches for review, and update the bugs. I think avg@ will recognize
in those details the completion of a number of thoughts that he had while
trying to debug this.
I also think the TSO patch, as well as the correctness fixes noted above,
should at some point wind up in an errata release for 12.1.
-Patrick