Re: Performance test for CUBIC in stable/14
Date: Wed, 23 Oct 2024 21:43:21 UTC
On Wed, Oct 23, 2024 at 03:14:08PM -0400, Cheng Cui wrote:
> I see. The result of `newreno` vs. `cubic` shows non-constant/infrequent
> packet retransmission. So TCP congestion control has little impact on
> improving the performance.
>
> The performance bottleneck may come from somewhere else. For example, the
> sender CPU shows 97.7% utilization. Would there be any way to reduce CPU
> usage?

There are 11 VMs running on the bhyve server. None of them are very busy,
but the server shows

% uptime
 9:54p.m.  up 8 days,  6:08, 22 users, load averages: 0.82, 1.25, 1.74

The test VM, vm4-fbsd14s:

% uptime
 9:55PM  up 2 days,  3:12, 5 users, load averages: 0.35, 0.31, 0.21

It has

% sysctl hw.ncpu
hw.ncpu: 8

and

avail memory = 66843062272 (63746 MB)

so it's not short of resources.

A test just now gave these results:

- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.04  sec  1.31 GBytes   563 Mbits/sec    0          sender
[  5]   0.00-20.06  sec  1.31 GBytes   563 Mbits/sec               receiver
CPU Utilization: local/sender 94.1% (0.1%u/94.1%s), remote/receiver 15.5% (1.5%u/13.9%s)
snd_tcp_congestion cubic
rcv_tcp_congestion cubic

iperf Done.

So I'm not sure how the utilization figure was synthesised, unless it's
derived from something like 'top', where 1.00 is 100%. Load while running
the test got to 0.83, as observed in 'top' in another terminal.

Five minutes after the test, the load averages are:

  in the VM:          0.32, 0.31, 0.26
  on the bhyve host:  0.39, 0.61, 1.11

Before we began testing, I was looking at the speed issue as being caused
by something to do with interrupts and/or polling, and/or HZ, something
that Linux handles differently and that gives better results on the same
bhyve host. Maybe rebuilding the kernel with a different scheduler on both
the host and the FreeBSD VMs will give a better result for FreeBSD, if
tweaking sysctls doesn't make much of a difference.

In terms of real-world bandwidth, I found that the combination of your
modified cc_cubic + rack gave the best results for overall throughput in a
speedtest context, although it is slower to reach its maximum throughput
than cubic alone. I'm still testing in a webdav/rsync context (cubic
against cubic + rack).

The next lot of testing, after changing the scheduler, will be on a KVM
host with various *BSDs as guests. There may be a tradeoff of stability
against speed, I guess.
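
For reference, output like the above is what a command along these lines
produces (I'm guessing the exact flags back from the output, so treat it
as a sketch; <server> stands in for the receiver's address):

% iperf3 -c <server> -t 20 -V -C cubic

My understanding, and it is only an assumption since I haven't checked the
iperf3 source, is that the CPU Utilization line printed in verbose mode is
the iperf3 process's CPU time divided by wall-clock time, i.e. relative to
a single core. On that reading, 94.1% means roughly one core kept busy
rather than 94% of all eight, which would at least be consistent with the
~0.8 load seen in 'top' during the test.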
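
Also for the record, the knobs being toggled between the plain cubic and
cubic + rack runs are roughly these (sysctl and module names as I
understand them on 14-STABLE, so a sketch rather than a recipe; the RACK
bits are clearly already built here, since the cubic + rack runs work):

# kldload tcp_rack                             (load the RACK stack if needed)
# sysctl net.inet.tcp.functions_available      (check that it shows up)
# sysctl net.inet.tcp.cc.algorithm=cubic       (congestion control for new connections)
# sysctl net.inet.tcp.functions_default=rack   (new connections use the RACK stack)

Setting net.inet.tcp.functions_default back to "freebsd" returns new
connections to the stock stack.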
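
As for the scheduler experiment, what I have in mind is just the usual
custom kernel config, say /usr/src/sys/amd64/conf/MYKERNEL (the name is
made up, and I'm assuming SCHED_4BSD is the alternative worth trying
against the default SCHED_ULE):

include         GENERIC
ident           MYKERNEL
nooptions       SCHED_ULE       # drop the default ULE scheduler
options         SCHED_4BSD      # use the traditional 4BSD scheduler

followed by the usual

# make -C /usr/src buildkernel installkernel KERNCONF=MYKERNEL
# shutdown -r now

on both the bhyve host and the FreeBSD guests.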