[Bug 221919] ixl: TX queue hang when using TSO and having a high and mixed network load
bugzilla-noreply at freebsd.org
Sat Dec 15 14:19:05 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221919
--- Comment #20 from Peter Eriksson <peter.x.eriksson at liu.se> ---
Just a quick note that we're still seeing the same problem on our production
servers if we enable "tso" on the 10G interfaces. FreeBSD 11.2-RELEASE-p6.
Haven't been able to reproduce it on the test servers (identical hardware)
running 11.2-RELEASE-p5 (and 12.0-RELEASE) so far though (but they don't see
any traffic)...
Driver version:
> dev.ixl.0.%desc: Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.9.9-k
Firmware:
> dev.ixl.0.fw_version: fw 6.80.48812 api 1.7 nvm 6.00 etid 80003751 oem 18.4608.17
Watchdog events in the output from sysctl -a:
> dev.ixl.0.watchdog_events: 4
Dmesg errors:
> ixl0: WARNING: queue 3 appears to be hung!
> ixl0: WARNING: queue 2 appears to be hung!
> ixl2: WARNING: queue 2 appears to be hung!
> ixl2: WARNING: queue 4 appears to be hung!
> ixl2: WARNING: queue 7 appears to be hung!
> ixl2: WARNING: queue 3 appears to be hung!
> ixl0: WARNING: queue 7 appears to be hung!
> ixl2: WARNING: queue 3 appears to be hung!
> ixl0: WARNING: queue 4 appears to be hung!
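To tally warnings like these per interface, something along the following lines might help. It is only a sketch: count_hung is a made-up helper name, and it assumes the exact message format shown above.

```sh
#!/bin/sh
# count_hung: hypothetical helper (not part of FreeBSD or the ixl driver).
# Reads dmesg-style text on stdin and prints one line per interface:
#   <interface> <number of "queue N appears to be hung" warnings>
# Assumes the message format seen in this report.
count_hung() {
    awk '/WARNING: queue [0-9]+ appears to be hung!/ {
        # Find the "ixlN:" field wherever it sits on the line,
        # so quoted ("> ixl0: ...") input also works.
        for (i = 1; i <= NF; i++)
            if ($i ~ /^ixl[0-9]+:$/) {
                name = $i
                sub(/:$/, "", name)
                n[name]++
            }
    }
    END { for (k in n) print k, n[k] }' | sort
}
```

Usage would be `dmesg | count_hung`. For the nine lines quoted above it should report 4 for ixl0 and 5 for ixl2; the ixl0 count happens to equal the dev.ixl.0.watchdog_events value, though whether the two always track each other is an assumption.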
(Output from ifconfig with TSO disabled)
> # ifconfig lagg0
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> options=6404bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LRO,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> ether 3c:fd:fe:25:47:a0
> inet6 fe80::3efd:feff:fe25:47a0%lagg0 prefixlen 64 scopeid 0xa
> inet6 2001:6b0:17:2400::8:43 prefixlen 64
> inet 130.236.8.43 netmask 0xffffffe0 broadcast 130.236.8.63
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> media: Ethernet autoselect
> status: active
> groups: lagg
> laggproto lacp lagghash l2,l3,l4
> laggport: ixl0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
> laggport: ixl2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
iperf3 output with TSO disabled:
> # iperf3 -c filur00 -t4
> Connecting to host filur00, port 5201
> [ 5] local 2001:6b0:17:2400::8:43 port 51226 connected to 2001:6b0:17:2400::8:40 port 5201
> [ ID] Interval Transfer Bitrate Retr Cwnd
> [ 5] 0.00-1.00 sec 318 MBytes 2.66 Gbits/sec 0 561 KBytes
> [ 5] 1.00-2.00 sec 350 MBytes 2.94 Gbits/sec 0 1.11 MBytes
> [ 5] 2.00-3.00 sec 392 MBytes 3.28 Gbits/sec 0 1.67 MBytes
> [ 5] 3.00-4.00 sec 351 MBytes 2.94 Gbits/sec 0 1.77 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 5] 0.00-4.00 sec 1.38 GBytes 2.95 Gbits/sec 0 sender
> [ 5] 0.00-4.00 sec 1.38 GBytes 2.95 Gbits/sec receiver
>
> iperf Done.
With TSO enabled (when things work):
> # ifconfig lagg0 tso ; iperf3 -c filur00 -t4
> Connecting to host filur00, port 5201
> [ 5] local 2001:6b0:17:2400::8:43 port 51237 connected to 2001:6b0:17:2400::8:40 port 5201
> [ ID] Interval Transfer Bitrate Retr Cwnd
> [ 5] 0.00-1.00 sec 976 MBytes 8.19 Gbits/sec 0 492 KBytes
> [ 5] 1.00-2.00 sec 1.08 GBytes 9.29 Gbits/sec 0 1021 KBytes
> [ 5] 2.00-3.00 sec 1.08 GBytes 9.29 Gbits/sec 0 1.50 MBytes
> [ 5] 3.00-4.00 sec 1.08 GBytes 9.28 Gbits/sec 0 1.75 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 5] 0.00-4.00 sec 4.20 GBytes 9.01 Gbits/sec 0 sender
> [ 5] 0.00-4.00 sec 4.19 GBytes 9.01 Gbits/sec receiver
>
> iperf Done.
But often the queues get stuck and freeze. Hmm... I just noticed that it was IPv6
that stopped working when I tried to enable TSO on a production server and ran
iperf3 on it - IPv4 traffic was still passing through.
Can it be that there still are IPv6 (TSO6)-related bugs and that the IPv4 ones
are solved? Too bad I can't find a way to force it to happen on the test
servers...
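
One way to probe the TSO4-versus-TSO6 guess might be to enable TSO for IPv4 only and then push traffic over each address family in turn. The sketch below assumes the tso4/-tso6 parameters from ifconfig(8) behave on the lagg interface the same way plain "tso" does (not verified here); run and tso_split_test are made-up helper names.

```sh
#!/bin/sh
# Hypothetical helpers; the names are made up for illustration.

# run: execute a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# tso_split_test IFACE HOST
# Enables TSO for IPv4 only (TSO6 off), then runs a short iperf3 test
# over IPv4 and over IPv6 against HOST.
tso_split_test() {
    iface=$1
    host=$2
    run ifconfig "$iface" tso4 -tso6
    run iperf3 -4 -c "$host" -t 4
    run iperf3 -6 -c "$host" -t 4
}
```

With DRY_RUN=1 set, `tso_split_test lagg0 filur00` only prints the three commands. If the queues stay healthy with tso4 alone but hang once tso6 is switched on as well, that would point at the IPv6 path; it would not prove it.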
--
You are receiving this mail because:
You are on the CC list for the bug.