Re: Chelsio Forwarding performance and RELENG_13 vs RELENG_12

From: Navdeep Parhar <nparhar_at_gmail.com>
Date: Fri, 21 Oct 2022 18:13:14 UTC
On 10/21/22 10:57 AM, mike tancsa wrote:
> On 10/21/2022 1:31 PM, Navdeep Parhar wrote:
>> On 10/18/22 12:16 PM, mike tancsa wrote:
>>> I updated a RELENG_12 router, along with its hardware, to RELENG_13 
>>> (Oct 14th kernel) and was surprised to see dev.cxl.0.stats.rx_ovflow0 
>>> increasing at a somewhat faster rate than on the older, slightly 
>>> slower hardware under about the same load: a 6-core Xeon(R) E-2226G 
>>> CPU @ 3.40GHz vs a 4-core Xeon at the same frequency and memory 
>>> speed. Traffic is about 150 Kpps in and out at 1 Gb/s throughput.
>>>
>>> loader.conf is the same:
>>>
>>> hw.cxgbe.toecaps_allowed="0"
>>> hw.cxgbe.rdmacaps_allowed="0"
>>> hw.cxgbe.iscsicaps_allowed="0"
>>> hw.cxgbe.fcoecaps_allowed="0"
>>> hw.cxgbe.pause_settings="0"
>>> hw.cxgbe.attack_filter="1"
>>> hw.cxgbe.drop_pkts_with_l3_errors="1"
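>>>
>>> These are boot-time tunables; after boot they can be double-checked 
>>> with sysctl, assuming the driver exposes them as read-only sysctls, 
>>> e.g.:
>>>
>>> sysctl hw.cxgbe.toecaps_allowed hw.cxgbe.pause_settings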
>>>
>>> As there is a large routing table, I do see the fib lookup algorithm 
>>> framework kicking in:
>>>
>>> [fib_algo] inet.0 (radix4_lockless#46) rebuild_fd_flm: switching algo 
>>> to radix4
>>> [fib_algo] inet6.0 (radix6_lockless#58) rebuild_fd_flm: switching 
>>> algo to radix6
>>>
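>>> If needed, the chosen algorithm can be inspected or pinned by hand 
>>> via the net.route.algo sysctls; a quick sketch, assuming the 
>>> FreeBSD 13 fib_algo knobs:
>>>
>>> sysctl net.route.algo.inet.algo          # current IPv4 algorithm
>>> sysctl net.route.algo.inet.algo=radix4   # force radix4 for IPv4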
>>>
>>> and in sysctl.conf:
>>>
>>> net.route.multipath=0
>>>
>>> net.inet.ip.redirect=0
>>> net.inet6.ip6.redirect=0
>>> kern.ipc.maxsockbuf=16777216
>>> net.inet.tcp.blackhole=1
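>>>
>>> These apply at runtime as well; after editing the file they can be 
>>> reapplied without a reboot:
>>>
>>> sysctl -f /etc/sysctl.conf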
>>>
>>> Are there any other tweaks that can be done to improve forwarding 
>>> performance? I do see at boot time
>>>
>>> cxl0: nrxq (6), hw RSS table size (128); expect uneven traffic 
>>> distribution.
>>> cxl1: nrxq (6), hw RSS table size (128); expect uneven traffic 
>>> distribution.
>>> cxl3: nrxq (6), hw RSS table size (128); expect uneven traffic 
>>> distribution.
>>>
>>> The CPU is 6-core, with no HT enabled.
>>
>> The old system was 4-core, so it must have used 4 queues; 128 RSS 
>> table entries divide evenly across 4 queues but not across 6, which 
>> is what that warning is about.  Can you please try that on the new 
>> system and see how it does?
>>
>> hw.cxgbe.ntxq=4
>> hw.cxgbe.nrxq=4
>>
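>> Those are boot-time tunables, so they go in /boot/loader.conf and 
>> need a reboot to take effect; a minimal sketch of the entries:
>>
>> hw.cxgbe.ntxq="4"
>> hw.cxgbe.nrxq="4"
>>
>> After reboot, the "cxl0: nrxq (...)" boot message should show the new 
>> queue count, and with 128 RSS entries divisible by 4 the uneven 
>> traffic distribution warning should go away.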
> Thanks Navdeep!
> 
> Unfortunately, still the odd dropped packet :(

Can you try increasing the size of the queues?

hw.cxgbe.qsize_txq=2048
hw.cxgbe.qsize_rxq=2048
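
Both are boot-time tunables as well, so they belong in /boot/loader.conf 
and take effect at the next boot; a minimal sketch:

hw.cxgbe.qsize_txq="2048"
hw.cxgbe.qsize_rxq="2048"

Larger rx rings give the host more buffering headroom before the 
rx_ovflow counters start climbing.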

The stats show that you are using MTU 1500.  If you were using MTU 9000, 
I'd also have suggested setting largest_rx_cluster to 4K:

hw.cxgbe.largest_rx_cluster=4096
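
For reference: largest_rx_cluster caps the largest mbuf cluster size the 
driver will use for rx buffers. Page-sized 4K clusters are cheaper to 
allocate and replenish than the physically contiguous jumbo clusters 
needed for 9000-byte frames, which is why the knob matters at MTU 9000 
but makes little difference at 1500.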

Regards,
Navdeep