[Bug 274994] Regression of iperf3 network throughput tests with erms "rep movsb" copyto loops

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 09 Nov 2023 19:25:47 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274994

--- Comment #2 from Mateusz Guzik <mjg@FreeBSD.org> ---
I can run FreeBSD in a VM on an Intel(R) Xeon(R) Platinum 8470N, which is also
Sapphire Rapids.

I get:
[  5]   0.00-10.00  sec  71.0 GBytes  61.0 Gbits/sec    3             sender
[  5]   0.00-10.00  sec  71.0 GBytes  61.0 Gbits/sec                  receiver

Which is the same as Linux on the same machine.

Poor man's profiling with dtrace:

dtrace -w -n 'profile:::profile-4999 { @[sym(arg0)] = count(); } tick-10s { system("clear"); trunc(@, 40); printa("%40a %@16d\n", @); clear(@); }'

... shows that while copyin is indeed high up in terms of CPU usage, the
single most time-consuming thing is lock contention.

[snip]
                         kernel`mb_dupcl             2172
                      kernel`tcp_m_copym             2279
                          kernel`m_getm2             2683
               kernel`tcp_default_output             3364
                      kernel`mb_free_ext             3443
                    kernel`spinlock_exit             3487
                   kernel`tcp_do_segment             4243
                kernel`soreceive_generic             5127
                kernel`copyout_smap_erms             9601
                 kernel`copyin_smap_erms             9744
                       kernel`lock_delay            12566
                      kernel`acpi_cpu_c1           719581
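
To make the relative weights easier to read, here is a rough Python sketch that
turns the sample counts above into shares of non-idle samples. It is only an
illustration and rests on two assumptions: that kernel`acpi_cpu_c1 represents
idle time and can be left out, and that the rows hidden behind [snip] are small
enough to ignore (they are not available here, so the percentages cover the
visible rows only). The names SAMPLES, IDLE, parse and busy_shares are made up
for this sketch.

```python
# Rows copied from the profile output above; the [snip] rows are not included.
SAMPLES = """\
kernel`mb_dupcl 2172
kernel`tcp_m_copym 2279
kernel`m_getm2 2683
kernel`tcp_default_output 3364
kernel`mb_free_ext 3443
kernel`spinlock_exit 3487
kernel`tcp_do_segment 4243
kernel`soreceive_generic 5127
kernel`copyout_smap_erms 9601
kernel`copyin_smap_erms 9744
kernel`lock_delay 12566
kernel`acpi_cpu_c1 719581
"""

# Assumption: samples in this symbol are idle time, not useful work.
IDLE = {"kernel`acpi_cpu_c1"}


def parse(text):
    """Parse 'symbol count' lines into a dict of symbol -> sample count."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        rows[parts[0]] = int(parts[1])
    return rows


def busy_shares(rows):
    """Return each non-idle symbol's fraction of all non-idle samples."""
    busy = {sym: n for sym, n in rows.items() if sym not in IDLE}
    total = sum(busy.values())
    return {sym: n / total for sym, n in busy.items()}


if __name__ == "__main__":
    shares = busy_shares(parse(SAMPLES))
    # Print the largest consumers first.
    for sym, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{sym:>40} {share:6.1%}")
```

Under those assumptions kernel`lock_delay is the largest single entry among the
visible non-idle rows, with kernel`copyin_smap_erms and kernel`copyout_smap_erms
next.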
