Re: Very slow scp performance comparing to Linux

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 28 Aug 2023 15:43:03 UTC
Wei Hu <weh_at_microsoft.com> wrote on
Date: Mon, 28 Aug 2023 07:32:35 UTC :

> When I was testing a new NIC, I found the single stream scp performance was almost 8 times slower than Linux on the RX side. Initially I thought it might be something with the NIC. But when I switched to sending the file on localhost, the numbers stayed the same.
> 
> Here I was sending a 2GB file from sender to receiver using scp. FreeBSD is a recent NON-DEBUG build from CURRENT. The Ubuntu Linux kernel is 6.2.0. Both run in HyperV VMs on the same type of hardware. The FreeBSD VM has 16 vcpus, while the Ubuntu VM has 4 vcpus.
> 
> Sender    Receiver    Throughput
> Linux     FreeBSD      70 MB/s
> Linux     Linux       550 MB/s
> FreeBSD   FreeBSD      70 MB/s
> FreeBSD   Linux       350 MB/s
> FreeBSD   localhost    70 MB/s
> Linux     localhost   550 MB/s
> 
> From these tests, it seems I can rule out an issue with the NIC and its driver. It looks like the FreeBSD kernel network stack is much slower than Linux on single stream TCP, or is there some problem with scp?
> 
> I also tried setting the following kernel parameters on FreeBSD, but they make no difference, and neither do other TCP congestion control algorithms such as htcp and newreno.
> 
> net.inet.tcp.soreceive_stream="1"
> net.isr.maxthreads="-1"
> net.isr.bindthreads="1"
> 
> net.inet.ip.intr_queue_maxlen=2048
> net.inet.tcp.recvbuf_max=16777216
> net.inet.tcp.recvspace=419430
> net.inet.tcp.sendbuf_max=16777216
> net.inet.tcp.sendspace=209715
> kern.ipc.maxsockbuf=16777216
> 
> Any ideas?


You do not give explicit commands to try, nor do you specify the
hardware context, just that HyperV is involved.
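
As a rough sketch of the kind of explicit, repeatable setup meant here
(the make_testfile helper, the /var/tmp paths, and the use of random
data are illustrative assumptions, not what was actually run), a test
file of known size could be created like this before timing scp:

```shell
#!/bin/sh
# make_testfile PATH SIZE_MIB
# Hypothetical helper: write SIZE_MIB mebibytes of pseudo-random data to PATH.
# Random data is used so that any filesystem compression does not make
# the later read unrealistically cheap. bs is given in bytes so the same
# line works with both FreeBSD dd and GNU dd.
make_testfile() {
    dd if=/dev/urandom of="$1" bs=1048576 count="$2" 2>/dev/null
}

# Example use (not run here; needs sshd on localhost and a working login):
#   make_testfile /var/tmp/scp-test.bin 2048
#   scp /var/tmp/scp-test.bin root@localhost:/var/tmp/scp-test-copy.bin
```

Reporting the exact scp line along with the file size makes it easier
for others to repeat the measurement on their own hardware.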

So, on a HoneyComb (16 Cortex-A72 cores) with Optane boot media in
its PCIe slot, with no HyperV or VM involved, I tried:

# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
. . .
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 120.2MB/s   00:42

It is not a high performance system. 64 GiBytes of RAM.

So, instead, I tried a ThreadRipper 1950X that also has Optane in a
PCIe slot for its boot media, again with no HyperV or VM involved:

# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
. . .
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 299.7MB/s   00:17

(These systems do not run with any tmpfs areas, not even /tmp . So
I'm not providing that kind of example, at least for now.)

128 GiBytes of RAM.

Both systems are ZFS based, but with a simple single partition.
(ZFS is used for bectl boot environments, not for the other usual
reasons to use ZFS. I could boot UFS variants of the boot media and
test that kind of context.)

So both results fall between your FreeBSD figure and your Linux figure.
I've no means of checking how reasonable the figures are relative
to your test context. I just know the results are better than
you report for localhost use.
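
To put rough numbers on that comparison, here is a small sketch that
uses only the MB/s figures quoted in this message (the ratio helper is
just an illustration, and the hardware differs, so the ratios are only
indicative):

```shell
#!/bin/sh
# ratio A B: print A divided by B, rounded to two decimal places.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# All figures are in MB/s for scp to localhost.
ratio 120.2 70      # HoneyComb vs. the reported FreeBSD figure
ratio 299.7 70      # ThreadRipper 1950X vs. the reported FreeBSD figure
ratio 550 299.7     # reported Linux figure vs. ThreadRipper 1950X
```

This prints 1.72, 4.28, and 1.84: the two systems here are roughly
1.7 and 4.3 times faster than the reported FreeBSD localhost figure,
while the reported Linux localhost figure is still about 1.8 times
the ThreadRipper result.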

===
Mark Millard
marklmi at yahoo.com