Unstable local network throughput
Ben RUBSON
ben.rubson at gmail.com
Mon Aug 8 13:52:20 UTC 2016
> On 05 Aug 2016, at 10:30, Hans Petter Selasky <hps at selasky.org> wrote:
>
> On 08/04/16 23:49, Ben RUBSON wrote:
>>>
>>> On 04 Aug 2016, at 20:15, Ryan Stone <rysto32 at gmail.com> wrote:
>>>
>>> On Thu, Aug 4, 2016 at 11:33 AM, Ben RUBSON <ben.rubson at gmail.com> wrote:
>>> But even without RSS, I should be able to go up to 2x40Gbps, don't you think so ?
>>> Nobody already did this ?
>>>
>>> Try this patch
>>> (...)
>>
>> I also just tested the NODEBUG kernel but it did not help.
>
> Hi,
>
> When running these tests, do you see any CPUs fully utilised?
No, CPU usage looks like this on both servers:
27 processes: 1 running, 26 sleeping
CPU 0: 1.1% user, 0.0% nice, 16.7% system, 0.0% interrupt, 82.2% idle
CPU 1: 1.1% user, 0.0% nice, 18.9% system, 0.0% interrupt, 80.0% idle
CPU 2: 1.9% user, 0.0% nice, 17.8% system, 0.0% interrupt, 80.4% idle
CPU 3: 1.1% user, 0.0% nice, 15.2% system, 0.0% interrupt, 83.7% idle
CPU 4: 0.4% user, 0.0% nice, 16.3% system, 0.0% interrupt, 83.3% idle
CPU 5: 1.1% user, 0.0% nice, 14.4% system, 0.0% interrupt, 84.4% idle
CPU 6: 2.6% user, 0.0% nice, 17.4% system, 0.0% interrupt, 80.0% idle
CPU 7: 2.2% user, 0.0% nice, 15.2% system, 0.0% interrupt, 82.6% idle
CPU 8: 1.1% user, 0.0% nice, 3.0% system, 15.9% interrupt, 80.0% idle
CPU 9: 0.0% user, 0.0% nice, 3.0% system, 32.2% interrupt, 64.8% idle
CPU 10: 0.0% user, 0.0% nice, 0.4% system, 58.9% interrupt, 40.7% idle
CPU 11: 0.0% user, 0.0% nice, 0.4% system, 77.4% interrupt, 22.2% idle
CPU 12: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 13: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 14: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 15: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 16: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 17: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 18: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 19: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 20: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 21: 0.0% user, 0.0% nice, 0.0% system, 0.4% interrupt, 99.6% idle
CPU 22: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 23: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Load is correctly spread over the NUMA node connected to the NIC (the first 12 CPUs).
There is clearly enough CPU power to saturate the full-duplex link!
I tried many cpuset configurations (pinning the IRQs across these 12 CPUs, etc.), but saw no improvement at all; a sketch of what I tried is below.
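Roughly, what I did (the IRQ number is only an example here; the real mlx4 vectors come from vmstat -i):

# list the NIC interrupt vectors
vmstat -i | grep mlx
# pin one vector to the NUMA-local CPUs 0-11 (repeated for each vector)
cpuset -l 0-11 -x 300
# run the benchmark on the same CPUs (iperf as an example)
cpuset -l 0-11 iperf -c <peer_ip> -P 8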
> Did you check the RX/TX pauseframes settings and the mlx4 sysctl statistics counters, if there is packet loss?
I tried disabling RX/TX pause frames, but it did not help.
And "sysctl -a | grep mlx | grep err" counters are all 0.
I also played with the ring sizes and adaptive interrupt moderation, with no luck.
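Again only a sketch, since the knob names vary between driver builds; I just looked for whatever the driver exposes:

# list the ring-size / interrupt-coalescing knobs exposed by this mlx4en build
sysctl -a | grep mlx | egrep -i 'ring|coal|moder'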
Ben