Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)
Date: Mon, 13 Jun 2022 18:25:36 UTC
Hello,

I have two new servers, each with a Mellanox ConnectX-6 card linked at 25 Gb/s; however, I am unable to get much more than 6 Gb/s when testing with iperf3. The servers are Lenovo SR665 (2 x AMD EPYC 7443 24-Core Processor, 256 GB RAM, Mellanox ConnectX-6 Lx 10/25GbE SFP28 2-port OCP Ethernet Adapter). They are connected to a Dell N3224PX-ON switch.

Both servers are idle and not in use, with a fresh install of stable/13-ebea872f8 and nothing running on them except ssh, sendmail, etc. When I test with iperf3, I am unable to get a higher average than about 6 Gb/s. I have tried just about every knob listed in https://calomel.org/freebsd_network_tuning.html with little impact on performance. The network cards have HW LRO enabled as per the driver documentation (though this only seems to lower IRQ usage, with no impact on actual throughput).

The same exact servers tested on Linux (Fedora 34) produced more than twice the throughput (see attached screenshots): I was able to get a steady 14.6 Gb/s with nearly 0 retries shown in iperf. The performance on FreeBSD seems to average around 6 Gb/s, but it is very sporadic during the iperf run.

I have run out of ideas; any suggestions are welcome. Considering that Netflix uses very similar HW and pushes 400 Gb/s, either there is something really wrong here or Netflix isn't sharing all their secret sauce.
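To put a number on how sporadic a run is, the per-second interval lines that the iperf3 client prints can be summarized with a short script. The following is only a sketch: the function name summarize and the regular expression are illustrative choices, not part of iperf3, and it assumes the default human-readable client output for a single TCP stream (iperf3 can also emit JSON with -J, which would be more robust to parse).

```python
import re
import statistics

# Matches per-interval client lines such as:
#   [  5]   3.00-4.00   sec   761 MBytes  6.39 Gbits/sec   61    588 KBytes
# The trailing cwnd field is required, so the final "sender"/"receiver"
# summary lines are deliberately not matched.
INTERVAL_RE = re.compile(
    r"\[\s*\d+\]\s+"                     # stream id, e.g. "[  5]"
    r"(\d+\.\d+)-(\d+\.\d+)\s+sec\s+"    # interval start-end
    r"[\d.]+\s+[KMG]?Bytes\s+"           # transfer (ignored)
    r"([\d.]+)\s+([KMG])bits/sec\s+"     # bitrate and unit prefix
    r"(\d+)\s+"                          # retransmits in this interval
    r"[\d.]+\s+[KMG]?Bytes"              # cwnd
)

# Scale factors to convert the reported bitrate to Gbit/s.
TO_GBPS = {"K": 1e-6, "M": 1e-3, "G": 1.0}


def summarize(text):
    """Return basic throughput statistics for one iperf3 client run."""
    rates = []
    retransmits = 0
    for m in INTERVAL_RE.finditer(text):
        rates.append(float(m.group(3)) * TO_GBPS[m.group(4)])
        retransmits += int(m.group(5))
    if not rates:
        raise ValueError("no iperf3 interval lines found")
    return {
        "intervals": len(rates),
        "mean_gbps": statistics.mean(rates),
        # Sample standard deviation needs at least two data points.
        "stdev_gbps": statistics.stdev(rates) if len(rates) > 1 else 0.0,
        "min_gbps": min(rates),
        "max_gbps": max(rates),
        "retransmits": retransmits,
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python3 summarize.py < iperf3-run.txt
    for key, value in summarize(sys.stdin.read()).items():
        print(f"{key}: {value}")
```

A large standard deviation relative to the mean, or a wide gap between the minimum and maximum, indicates an unsteady run; comparing these figures between two runs gives a quick side-by-side view of different TCP stacks or tuning settings.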
# ifconfig mce0
mce0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=ffed07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6,TXRTLMT,HWRXTSTMP,NOMAP,TXTLS4,TXTLS6,VXLAN_HWCSUM,VXLAN_HWTSO,TXTLS_RTLMT>
        ether b8:ce:f6:81:df:6a
        inet 192.168.10.31 netmask 0xffffff00 broadcast 192.168.10.255
        media: Ethernet 25GBase-CR <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01
Connecting to host db-01, port 5201
[  5] local 192.168.10.31 port 64695 connected to 192.168.10.30 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   930 MBytes  7.80 Gbits/sec   62    789 KBytes
[  5]   1.00-2.00   sec   942 MBytes  7.90 Gbits/sec  164    824 KBytes
[  5]   2.00-3.00   sec  1.00 GBytes  8.61 Gbits/sec  402    879 KBytes
[  5]   3.00-4.00   sec   761 MBytes  6.39 Gbits/sec   61    588 KBytes
[  5]   4.00-5.00   sec   724 MBytes  6.07 Gbits/sec  220    497 KBytes
[  5]   5.00-6.00   sec   723 MBytes  6.07 Gbits/sec   54    364 KBytes
[  5]   6.00-7.00   sec   716 MBytes  6.01 Gbits/sec  187    682 KBytes
[  5]   7.00-8.00   sec   728 MBytes  6.11 Gbits/sec   86    568 KBytes
[  5]   8.00-9.00   sec   761 MBytes  6.39 Gbits/sec   37    418 KBytes
[  5]   9.00-10.00  sec   733 MBytes  6.15 Gbits/sec    8    617 KBytes
[  5]  10.00-11.00  sec   734 MBytes  6.16 Gbits/sec  238    474 KBytes
[  5]  11.00-12.00  sec   736 MBytes  6.17 Gbits/sec  164    757 KBytes
[  5]  12.00-13.00  sec   610 MBytes  5.12 Gbits/sec  118    579 KBytes
[  5]  13.00-14.00  sec  1.02 GBytes  8.75 Gbits/sec  447    449 KBytes
[  5]  14.00-15.00  sec   728 MBytes  6.11 Gbits/sec  132    719 KBytes
[  5]  15.00-16.00  sec   724 MBytes  6.07 Gbits/sec  185    649 KBytes
[  5]  16.00-17.00  sec   597 MBytes  5.01 Gbits/sec  142    570 KBytes
[  5]  17.00-18.00  sec   733 MBytes  6.15 Gbits/sec  102    484 KBytes
[  5]  18.00-19.00  sec   726 MBytes  6.09 Gbits/sec   15    569 KBytes
[  5]  19.00-20.00  sec   733 MBytes  6.15 Gbits/sec  181    527 KBytes
[  5]  20.00-21.00  sec   729 MBytes  6.12 Gbits/sec  118    430 KBytes
[  5]  21.00-22.00  sec   733 MBytes  6.15 Gbits/sec  116    641 KBytes
[  5]  22.00-23.00  sec   728 MBytes  6.10 Gbits/sec  182    598 KBytes
[  5]  23.00-24.00  sec   743 MBytes  6.24 Gbits/sec  209    614 KBytes
[  5]  24.00-25.00  sec   746 MBytes  6.26 Gbits/sec   72    758 KBytes
[  5]  25.00-26.00  sec   742 MBytes  6.23 Gbits/sec  199    675 KBytes
[  5]  26.00-27.00  sec   799 MBytes  6.70 Gbits/sec  183    542 KBytes
[  5]  27.00-28.00  sec   908 MBytes  7.61 Gbits/sec    7   1.19 MBytes
[  5]  28.00-29.00  sec  1.37 GBytes  11.7 Gbits/sec  606   1013 KBytes
[  5]  29.00-30.00  sec  1.31 GBytes  11.3 Gbits/sec   74   1.02 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  23.7 GBytes  6.79 Gbits/sec  4771             sender
[  5]   0.00-30.00  sec  23.7 GBytes  6.79 Gbits/sec                  receiver

I have even tried changing to the RACK TCP stack, only to get slightly better results, however with RACK the amount of retries is nearly 0.

[root@db-02 ~]# sysctl net.inet.tcp.functions_default=rack
net.inet.tcp.functions_default: rack -> rack
[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01
[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01
Connecting to host db-01, port 5201
[  5] local 192.168.10.31 port 51894 connected to 192.168.10.30 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   761 MBytes  6.38 Gbits/sec    0    737 KBytes
[  5]   1.00-2.00   sec   859 MBytes  7.21 Gbits/sec    0    761 KBytes
[  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec    0    785 KBytes
[  5]   3.00-4.00   sec   734 MBytes  6.16 Gbits/sec    0    804 KBytes
[  5]   4.00-5.00   sec   777 MBytes  6.52 Gbits/sec    0    824 KBytes
[  5]   5.00-6.00   sec   719 MBytes  6.03 Gbits/sec    0    841 KBytes
[  5]   6.00-7.00   sec   865 MBytes  7.26 Gbits/sec    0    862 KBytes
[  5]   7.00-8.00   sec   880 MBytes  7.38 Gbits/sec    0    882 KBytes
[  5]   8.00-9.00   sec   906 MBytes  7.60 Gbits/sec    0    904 KBytes
[  5]   9.00-10.00  sec   749 MBytes  6.29 Gbits/sec    0    921 KBytes
[  5]  10.00-11.00  sec   798 MBytes  6.69 Gbits/sec    0    938 KBytes
[  5]  11.00-12.00  sec   746 MBytes  6.26 Gbits/sec  209    772 KBytes
[  5]  12.00-13.00  sec   768 MBytes  6.44 Gbits/sec   35    644 KBytes
[  5]  13.00-14.00  sec   948 MBytes  7.95 Gbits/sec    0    673 KBytes
[  5]  14.00-15.00  sec  1.23 GBytes  10.5 Gbits/sec    0    711 KBytes
[  5]  15.00-16.00  sec  1.32 GBytes  11.4 Gbits/sec    0    748 KBytes
[  5]  16.00-17.00  sec  1.31 GBytes  11.2 Gbits/sec    0    785 KBytes
[  5]  17.00-18.00  sec  1.29 GBytes  11.1 Gbits/sec    0    819 KBytes
[  5]  18.00-19.00  sec  1.30 GBytes  11.2 Gbits/sec    0    852 KBytes
[  5]  19.00-20.00  sec  1.34 GBytes  11.5 Gbits/sec    0    883 KBytes
[  5]  20.00-21.00  sec  1.29 GBytes  11.1 Gbits/sec    0    914 KBytes
[  5]  21.00-22.00  sec  1.36 GBytes  11.7 Gbits/sec    0    944 KBytes
[  5]  22.00-23.00  sec  1.33 GBytes  11.4 Gbits/sec    0    974 KBytes
[  5]  23.00-24.00  sec  1.31 GBytes  11.2 Gbits/sec    0   1003 KBytes
[  5]  24.00-25.00  sec  1.30 GBytes  11.2 Gbits/sec    0   1.00 MBytes
[  5]  25.00-26.00  sec  1.34 GBytes  11.5 Gbits/sec    0   1.03 MBytes
[  5]  26.00-27.00  sec  1.32 GBytes  11.3 Gbits/sec    0   1.06 MBytes
[  5]  27.00-28.00  sec   957 MBytes  8.03 Gbits/sec    0   1.07 MBytes
[  5]  28.00-29.00  sec   837 MBytes  7.02 Gbits/sec    0   1.09 MBytes
[  5]  29.00-30.00  sec   729 MBytes  6.11 Gbits/sec    0   1.10 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  30.6 GBytes  8.77 Gbits/sec  244             sender
[  5]   0.00-30.00  sec  30.6 GBytes  8.77 Gbits/sec                  receiver

More data can be found @ https://forums.freebsd.org/threads/poor-performance-with-stable-13-and-mellanox-connectx-6-mlx5.85460/

Mike Jakubik
https://www.swiftsmsgateway.com/

Disclaimer: This e-mail and any attachments are intended only for the use of the addressee(s) and may contain information that is privileged or confidential. If you are not the intended recipient, or responsible for delivering the information to the intended recipient, you are hereby notified that any dissemination, distribution, printing or copying of this e-mail and any attachments is strictly prohibited. If this e-mail and any attachments were received in error, please notify the sender by reply e-mail and delete the original message.