Re: Very slow scp performance comparing to Linux

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 28 Aug 2023 16:15:59 UTC
On Aug 28, 2023, at 08:43, Mark Millard <marklmi@yahoo.com> wrote:

> Wei Hu <weh_at_microsoft.com> wrote on
> Date: Mon, 28 Aug 2023 07:32:35 UTC :
> 
>> When I was testing a new NIC, I found the single stream scp performance was almost 8 times slower than Linux on the RX side. Initially I thought it might be something with the NIC. But when I switched to sending the file on localhost, the numbers stayed the same.
>> 
>> Here I was sending a 2GB file from sender to receiver using scp. FreeBSD is a recent NON-DEBUG build from CURRENT. The Ubuntu Linux kernel is 6.2.0. Both run in HyperV VMs on the same type of hardware. The FreeBSD VM has 16 vcpus, while the Ubuntu VM has 4 vcpus.
>> 
>> Sender    Receiver   Throughput
>> Linux     FreeBSD     70 MB/s
>> Linux     Linux      550 MB/s
>> FreeBSD   FreeBSD     70 MB/s
>> FreeBSD   Linux      350 MB/s
>> FreeBSD   localhost   70 MB/s
>> Linux     localhost  550 MB/s
>> 
>> From these tests, it seems I can rule out an issue with the NIC and its driver. It looks like the FreeBSD kernel network stack is much slower than Linux on single stream TCP, or is there some problem with scp?
>> 
>> I also tried turning on the following kernel parameters on the FreeBSD kernel. But it makes no difference, and neither do the other tcp cc algorithms such as htcp and newreno.
>> 
>> net.inet.tcp.soreceive_stream="1"
>> net.isr.maxthreads="-1"
>> net.isr.bindthreads="1"
>> 
>> net.inet.ip.intr_queue_maxlen=2048
>> net.inet.tcp.recvbuf_max=16777216
>> net.inet.tcp.recvspace=419430
>> net.inet.tcp.sendbuf_max=16777216
>> net.inet.tcp.sendspace=209715
>> kern.ipc.maxsockbuf=16777216
>> 
>> Any ideas?
> 
> 
> You do not give explicit commands to try. Nor do you specify your
> hardware context that is involved, just that HyperV is involved.
> 
> So, on a HoneyComb (16 cortex-A72's) with Optane boot media in
> its PCIe slot, no HyperV or VM involved, I tried:

I should have listed the non-debug build in use:

# uname -apKU
FreeBSD CA72-16Gp-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023     root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000

> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 120.2MB/s   00:42
> 
> It is not a high performance system. 64 GiBytes of RAM.
> 
> So instead trying a ThreadRipper 1950X that also has Optane in a
> PCIe slot for its boot media, no HyperV or VM involved,

I should have listed the non-debug build in use:

# uname -apKU
FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #116 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:20 PDT 2023     root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1500000 1500000

(Same source tree content.)

> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 299.7MB/s   00:17
> 
> (These systems do not run with any tmpfs areas, not even /tmp . So
> I'm not providing that kind of example, at least for now.)
> 
> 128 GiBytes of RAM.
> 
> Both systems are ZFS based but with a simple single partition.
> (ZFS is used for bectl boot environments, not for the other
> usual reasons to use ZFS. I could boot UFS variants of the boot
> media and test that kind of context.)
> 
> So both show figures between your FreeBSD figure and your Linux
> figure. I have no means of checking how reasonable the figures are
> relative to your test context. I just know the results are better
> than you report for localhost use.

Adding a Windows Dev Kit 2023 booting via USB3 (but via a
U.2 adapter to Optane media), again ZFS, again no VM involved:

# uname -apKU
FreeBSD CA78C-WDK23-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023     root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000

# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
. . .
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 168.7MB/s   00:30
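
For scale, the localhost figures quoted in this thread can be put
side by side. The following is only an arithmetic sketch over the
numbers already reported above; the dictionary labels and the helper
name ratios_to are made up for illustration, and the systems differ
in hardware, so the ratios are not a controlled comparison.

```python
# Localhost scp throughput figures quoted in this thread, in MB/s.
# Labels are illustrative only; values are copied from the messages above.
reported_mb_s = {
    "FreeBSD, HyperV VM (reported)": 70.0,
    "HoneyComb, 16 cortex-a72": 120.2,
    "Windows Dev Kit 2023": 168.7,
    "ThreadRipper 1950X": 299.7,
    "Linux, HyperV VM (reported)": 550.0,
}


def ratios_to(baseline_mb_s, figures):
    """Return each figure divided by the baseline, rounded to 2 places."""
    return {name: round(value / baseline_mb_s, 2)
            for name, value in figures.items()}


if __name__ == "__main__":
    baseline = reported_mb_s["FreeBSD, HyperV VM (reported)"]
    for name, ratio in ratios_to(baseline, reported_mb_s).items():
        print(f"{name:32s} {reported_mb_s[name]:6.1f} MB/s  {ratio:5.2f}x")
```

The Linux VM figure comes out at a bit under 8 times the FreeBSD VM
figure, matching the "almost 8 times" in the original report, while
the three bare-metal FreeBSD systems land between the two.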


Note: the cortex-a72 and cortex-a78c/x1c builds were optimized via
-mcpu= use. The ThreadRipper build was not.


Note: I've not controlled for whether the reads of the input *.img
data were served from memory caching of prior activity. I could
do so if you want: reboot before the scp command.
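
Short of a reboot, one rough way to get a hint about caching is to
read the input file twice and compare the two passes. The sketch
below is an assumption-laden illustration, not something used for
the measurements above: read_throughput_mib_s is a hypothetical
helper name, it assumes Python 3 is available, and it counts in
MiB (1024*1024 bytes). A much faster second pass suggests the first
pass was not served from memory; similar rates suggest the data was
probably already cached, or that something other than the storage
is the limit.

```python
import sys
import time

MIB = 1024 * 1024


def read_throughput_mib_s(path, chunk_size=MIB):
    """Read the whole file once; return (bytes_read, MiB/s) for that pass."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very small files.
    rate = (total / MIB) / elapsed if elapsed > 0 else float("inf")
    return total, rate


if __name__ == "__main__":
    # Usage: python3 readtwice.py <path to the *.img file>
    for label in ("first read", "second read"):
        size, rate = read_throughput_mib_s(sys.argv[1])
        print(f"{label}: {size} bytes at {rate:.1f} MiB/s")
```

Running the first pass before each scp test would also put every
run on roughly the same (warm) footing, which is a different
control than the reboot mentioned above.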

===
Mark Millard
marklmi at yahoo.com