Issue with BGP router / high interrupt / Chelsio / FreeBSD 12.1
BulkMailForRudy
crapsh at monkeybrains.net
Fri Feb 14 22:58:10 UTC 2020
On 2/14/20 4:21 AM, Andrey V. Elsukov wrote:
> On 13.02.2020 06:21, Rudy wrote:
>>
>> I'm having issues with a box that is acting as a BGP router for my
>> network. It has three Chelsio cards: two T5s and one T6. It was working
>> great until I turned up our first port on the T6. It seems like traffic
>> passing in from a T5 card and out through the T6 causes a really high
>> load (and high interrupts).
>>
>> Traffic (not that much, right?)
>>
>> Dev     RX bps    TX bps    RX PPS   TX PPS   Error
>> cc0          0         0         0        0       0
>> cc1     2212 M       7 M     250 k      6 k       0   (100Gbps uplink, filtering inbound routes to keep TX low)
>> cxl0     287 k    2015 M       353    244 k       0   (our network)
>> cxl1     940 M    3115 M     176 k    360 k       0   (our network)
>> cxl2     634 M    1014 M     103 k    128 k       0   (our network)
>> cxl3       1 k      16 M         1      4 k       0
>> cxl4         0         0         0        0       0
>> cxl5         0         0         0        0       0
>> cxl6    2343 M     791 M     275 k    137 k       0   (IX, part of lagg0)
>> cxl7    1675 M     762 M     215 k    133 k       0   (IX, part of lagg0)
>> ixl0     913 k      18 M         0        0       0
>> ixl1       1 M      30 M         0        0       0
>> lagg0   4019 M    1554 M     491 k    271 k       0
>> lagg1      1 M      48 M         0        0       0
>> FreeBSD 12.1-STABLE  orange  976 Bytes/Packet avg
>> 1:42PM  up 13:25, 5 users, load averages: 9.38, 10.43, 9.827
> Hi,
>
> did you try to use pmcstat to determine the heaviest task on your
> system?
>
> # kldload hwpmc
> # pmcstat -S inst_retired.any -Tw1
PMC: [inst_retired.any] Samples: 168557 (100.0%), 2575 unresolved
Key: q => exiting...
%SAMP IMAGE   FUNCTION              CALLERS
 16.6 kernel  sched_idletd          fork_exit
 14.7 kernel  cpu_search_highest    cpu_search_highest:12.4 sched_switch:1.4 sched_idletd:0.9
 10.5 kernel  cpu_search_lowest     cpu_search_lowest:9.6 sched_pickcpu:0.9
  4.2 kernel  eth_tx                drain_ring
  3.4 kernel  rn_match              fib4_lookup_nh_basic
  2.4 kernel  lock_delay            __mtx_lock_sleep
  1.9 kernel  mac_ifnet_check_tran  ether_output
>
> Then capture the first several lines of the output and quit with 'q'.
>
> Do you use a firewall? Also, can you show a snapshot of the
> `top -HPSIzts1` output?
last pid: 28863;  load averages: 9.30, 10.33, 10.56    up 0+14:16:08  14:53:23
817 threads:   25 running, 586 sleeping, 206 waiting
CPU 0: 0.8% user, 0.0% nice, 6.2% system, 0.0% interrupt, 93.0% idle
CPU 1: 2.4% user, 0.0% nice, 0.0% system, 7.9% interrupt, 89.8% idle
CPU 2: 0.0% user, 0.0% nice, 0.8% system, 7.1% interrupt, 92.1% idle
CPU 3: 1.6% user, 0.0% nice, 0.0% system, 10.2% interrupt, 88.2% idle
CPU 4: 0.0% user, 0.0% nice, 0.0% system, 9.4% interrupt, 90.6% idle
CPU 5: 0.8% user, 0.0% nice, 0.8% system, 20.5% interrupt, 78.0% idle
CPU 6: 1.6% user, 0.0% nice, 0.0% system, 5.5% interrupt, 92.9% idle
CPU 7: 0.0% user, 0.0% nice, 0.0% system, 3.1% interrupt, 96.9% idle
CPU 8: 0.8% user, 0.0% nice, 0.8% system, 7.1% interrupt, 91.3% idle
CPU 9: 0.0% user, 0.0% nice, 0.8% system, 9.4% interrupt, 89.8% idle
CPU 10: 0.0% user, 0.0% nice, 0.0% system, 35.4% interrupt, 64.6% idle
CPU 11: 0.0% user, 0.0% nice, 0.0% system, 36.2% interrupt, 63.8% idle
CPU 12: 0.0% user, 0.0% nice, 0.0% system, 38.6% interrupt, 61.4% idle
CPU 13: 0.0% user, 0.0% nice, 0.0% system, 49.6% interrupt, 50.4% idle
CPU 14: 0.0% user, 0.0% nice, 0.0% system, 46.5% interrupt, 53.5% idle
CPU 15: 0.0% user, 0.0% nice, 0.0% system, 32.3% interrupt, 67.7% idle
CPU 16: 0.0% user, 0.0% nice, 0.0% system, 46.5% interrupt, 53.5% idle
CPU 17: 0.0% user, 0.0% nice, 0.0% system, 56.7% interrupt, 43.3% idle
CPU 18: 0.0% user, 0.0% nice, 0.0% system, 31.5% interrupt, 68.5% idle
CPU 19: 0.0% user, 0.0% nice, 0.8% system, 34.6% interrupt, 64.6% idle
Mem: 636M Active, 1159M Inact, 5578M Wired, 24G Free
ARC: 1430M Total, 327M MFU, 589M MRU, 32K Anon, 13M Header, 502M Other
268M Compressed, 672M Uncompressed, 2.51:1 Ratio
Swap: 4096M Total, 4096M Free
PID USERNAME  PRI NICE  SIZE    RES   STATE  C   TIME    WCPU  COMMAND
 12 root      -92    -    0B  3376K  WAIT  13  41:13  12.86%  intr{irq358: t5nex0:2a1}
 12 root      -92    -    0B  3376K  WAIT  12  48:08  12.77%  intr{irq347: t5nex0:1a6}
 12 root      -92    -    0B  3376K  CPU13 13  47:40  11.96%  intr{irq348: t5nex0:1a7}
 12 root      -92    -    0B  3376K  WAIT  17  43:46  11.38%  intr{irq342: t5nex0:1a1}
 12 root      -92    -    0B  3376K  WAIT  14  29:17  10.70%  intr{irq369: t5nex0:2ac}
 12 root      -92    -    0B  3376K  WAIT  11  47:55   9.85%  intr{irq428: t5nex1:2a5}
 12 root      -92    -    0B  3376K  WAIT  16  46:11   9.22%  intr{irq351: t5nex0:1aa}
 12 root      -92    -    0B  3376K  WAIT  19  42:28   9.04%  intr{irq344: t5nex0:1a3}
 12 root      -92    -    0B  3376K  WAIT  16  46:45   8.82%  intr{irq341: t5nex0:1a0}
 12 root      -92    -    0B  3376K  RUN   11  48:04   8.33%  intr{irq356: t5nex0:1af}
 12 root      -92    -    0B  3376K  WAIT  10  46:24   8.32%  intr{irq355: t5nex0:1ae}
 12 root      -92    -    0B  3376K  WAIT  10  42:03   8.32%  intr{irq345: t5nex0:1a4}
 12 root      -92    -    0B  3376K  WAIT  14  36:34   8.29%  intr{irq441: t5nex1:3a2}
 12 root      -92    -    0B  3376K  WAIT  19  46:14   8.21%  intr{irq354: t5nex0:1ad}
 12 root      -92    -    0B  3376K  WAIT  14  47:29   8.13%  intr{irq349: t5nex0:1a8}
 12 root      -92    -    0B  3376K  WAIT  11  40:25   7.91%  intr{irq346: t5nex0:1a5}
 12 root      -92    -    0B  3376K  WAIT  15  49:33   7.62%  intr{irq350: t5nex0:1a9}
 12 root      -92    -    0B  3376K  WAIT   5  45:37   7.57%  intr{irq322: t6nex0:1af}
 12 root      -92    -    0B  3376K  WAIT  18  45:41   7.43%  intr{irq353: t5nex0:1ac}
 12 root      -92    -    0B  3376K  WAIT  17  36:43   7.34%  intr{irq434: t5nex1:2ab}
 12 root      -92    -    0B  3376K  WAIT  17  33:30   7.11%  intr{irq424: t5nex1:2a1}
 12 root      -92    -    0B  3376K  WAIT   4  31:43   7.02%  intr{irq312: t6nex0:1a5}
 12 root      -92    -    0B  3376K  WAIT  16  35:01   6.95%  intr{irq433: t5nex1:2aa}
 12 root      -92    -    0B  3376K  WAIT  17  47:03   6.84%  intr{irq352: t5nex0:1ab}
 12 root      -92    -    0B  3376K  WAIT  18  41:33   6.73%  intr{irq343: t5nex0:1a2}
 12 root      -92    -    0B  3376K  WAIT   9  37:02   6.42%  intr{irq317: t6nex0:1aa}
 12 root      -92    -    0B  3376K  WAIT  10  32:22   6.40%  intr{irq427: t5nex1:2a4}
Thanks. I did change the chelsio_affinity settings today to get the cards
to bind their IRQs to CPU cores in the same NUMA domain. Still, load seems
a bit high when using the T6 card compared to just using the T5 cards.
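In case it helps anyone searching the archives, the same binding can be
done by hand with stock tools. A minimal sketch, assuming the T6 vectors
show up under the t6nex0 name as in the top output above (the IRQ and CPU
numbers are just examples taken from that output, not a recommendation):

# list the MSI-X vectors the T6 allocated
# (names match the intr{irq...} threads in top)
vmstat -i | grep t6nex0
# bind one queue's IRQ to a core in the card's NUMA domain,
# e.g. irq312 (t6nex0:1a5) onto CPU 4
cpuset -l 4 -x 312

Repeating the cpuset line for each vector, spreading the queues across the
cores of the local domain, is essentially what the chelsio_affinity change
did here.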