RE: Very slow scp performance comparing to Linux

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 29 Aug 2023 10:27:00 UTC
Wei Hu <weh_at_microsoft.com> wrote on
Date: Tue, 29 Aug 2023 07:07:39 UTC :

> Sorry for the top posting. But I don't want to make it look too messy. Here is the
> information that I missed in my original email.
> 
> All VMs are running on Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU).
> 
> FreeBSD VMs are 16 vcpu with 128 GB memory, in non-debug build:
> 14.0-ALPHA1 FreeBSD 14.0-ALPHA1 amd64 1400094 #7 nodbg-n264692-59e706ffee52-dirty... /usr/obj/usr/src/main/amd64.amd64/sys/GENERIC-NODEBUG amd64
> 
> Ubuntu VMs are 4 vcpu with 32 GB memory, kernel version:
> 6.2.0-1009-azure #9~22.04.3-Ubuntu SMP Tue Aug 1 20:51:07 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
> 
> I did a couple more tests as suggested by others in this thread. In recap:
> 
> Scp to localhost, FreeBSD (ufs) vs Ubuntu (ext4): 70 MB/s vs 550 MB/s
> Scp to localhost, FreeBSD (tmpfs) vs Ubuntu (tmpfs): 630 MB/s vs 660 MB/s
> 
> Iperf3 single stream to localhost: FreeBSD vs Ubuntu: 30.9 Gb/s vs 48.8 Gb/s
> 
> Would these numbers suggest that
> 1. ext4 caches a lot more than ufs?
> 2. there is a tcp performance gap in the network stack between FreeBSD and Ubuntu?
> 
> Would you also try running scp on ufs on your bare metal arm host? I am curious to know how different ufs and zfs are.
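
To put the quoted figures side by side, the Ubuntu/FreeBSD
ratios can be worked out with a little awk. This is only a
sketch: `ratio` is a hypothetical helper name, and the inputs
are just the numbers quoted above.

```shell
#!/bin/sh
# ratio A B: print A/B to two decimal places (hypothetical helper name).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Figures quoted above: Ubuntu result first, FreeBSD result second.
echo "scp to localhost, ext4 vs ufs: $(ratio 550 70)x"
echo "scp to localhost, tmpfs:       $(ratio 660 630)x"
echo "iperf3 single stream:          $(ratio 48.8 30.9)x"
```

On these numbers the tmpfs case is close to even, so the
large scp gap shows up only when an on-disk file system is
involved, and the iperf3 gap is much smaller than the ufs vs
ext4 gap.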


For this round I'm rebooting between the unxz and the 1st scp.
So I'll also have zfs results again. I'll also do a 2nd scp
(no reboot) to see if it gets notably different results.

. . .

Well, I just got FreeBSD main [so: 15] running under
Hyper-V on the Windows Dev Kit 2023, so I'm reporting
on that first. This was via an ssh session. The context
is ZFS. The VM file size is fixed, as is the RAM size:
6 cores (of 8) and 24576 MiBytes (of 32 GiBytes) are
assigned to the one FreeBSD instance. The VM file is
on the internal NVMe drive in the Windows 11 Pro file
system in the default place.

(I was having it copy the hard drive media to the VM file
when I started this process. Modern Hyper-V no longer
seems to support direct use of USB3 physical media. I
first had to produce a copy of the material on smaller
media so that the fixed-size VM file created from that
copy would fit in the NVMe's free space.)

# uname -apKU
FreeBSD CA78C-WDK23s-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023     root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000

(The ZFS content is a copy of the USB3 interfaced
ZFS Optane media's content previously reported on.
So the installed system was built with -mcpu= based
optimization, as noted before.)

# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
. . .
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                            100% 5120MB 193.6MB/s   00:26

# rm ~/FreeBSD-14-TEST.img
# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
. . .
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                            100% 5120MB 198.0MB/s   00:25


So, faster than what you are reporting for the
Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU)
context.
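
As a rough cross-check on the rates scp printed above, the
transfer size divided by the elapsed time gives a similar
figure. A minimal sketch, assuming the 5120MB size and the
mm:ss times shown in the progress lines; `rate_mbs` is a
hypothetical helper name, not part of scp.

```shell
#!/bin/sh
# rate_mbs SIZE_MB MM:SS: average MB/s over the elapsed time scp printed.
# Hypothetical helper; scp shows whole seconds, so this is approximate.
rate_mbs() {
    awk -v size="$1" -v t="$2" 'BEGIN {
        split(t, p, ":")            # p[1] = minutes, p[2] = seconds
        secs = p[1] * 60 + p[2]
        printf "%.1f\n", size / secs
    }'
}

rate_mbs 5120 00:26   # 1st scp above, which scp reported as 193.6MB/s
rate_mbs 5120 00:25   # 2nd scp above, which scp reported as 198.0MB/s
```

The results will not match scp's own figures exactly, since
the elapsed time is only shown in whole seconds.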

For reference:

# gpart show -pl
=>       40  468862055    da0  GPT  (224G)
         40      32728         - free -  (16M)
      32768     102400  da0p1  wdk23sCA78Cefi  (50M)
     135168  421703680  da0p2  wdk23sCA78Czfs  (201G)
  421838848   47022080  da0p3  wdk23sCA78Cswp22  (22G)
  468860928       1167         - free -  (584K)

# zpool list
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zwdk23s   200G  79.8G   120G        -         -     0%    39%  1.00x    ONLINE  -

(UFS would have notably more allocated and less free
for the same size partition.)



The below is based on the HoneyComb (16 Cortex-A72's)
since I've got the Hyper-V context going on the Windows
Dev Kit 2023 at the moment.


UFS first:

# uname -apKU
FreeBSD HC-CA72-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023     root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000

# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
. . .
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                            100% 5120MB 129.7MB/s   00:39

# rm ~/FreeBSD-14-TEST.img
# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
. . .
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                            100% 5120MB 130.9MB/s   00:39


So, faster than what you are reporting for the
Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU)
context.

Note: This is via a U.2 Optane 960 GB media and an M.2 adapter
instead of being via a PCIe Optane 960 GB media in the PCIe
slot.


ZFS second:

# uname -apKU
FreeBSD CA72-16Gp-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023     root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000

# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
. . .
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                            100% 5120MB 121.1MB/s   00:42

# rm ~/FreeBSD-14-TEST.img
# scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
(root@localhost) Password for root@CA72-16Gp-ZFS:
FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                            100% 5120MB 124.6MB/s   00:41


So, faster than what you are reporting for the
Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU)
context.

Note: This is via a PCIe Optane 960 GB media in the
PCIe slot.


UFS was slightly faster than ZFS for the HoneyComb
context but there is the M.2 vs. PCIe difference
as well.
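
To put a number on "slightly", the two HoneyComb runs for
each file system can be averaged and compared. A small
sketch using only the four scp rates reported above;
`mean2` and `pct_faster` are hypothetical helper names.

```shell
#!/bin/sh
# mean2 A B: mean of two rates, two decimal places (hypothetical helper).
mean2() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (a + b) / 2 }'; }
# pct_faster A B: how much larger A is than B, in percent (hypothetical helper).
pct_faster() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / b * 100 }'; }

ufs=$(mean2 129.7 130.9)   # HoneyComb UFS runs, MB/s
zfs=$(mean2 121.1 124.6)   # HoneyComb ZFS runs, MB/s
echo "UFS mean: $ufs MB/s"
echo "ZFS mean: $zfs MB/s"
echo "UFS faster by: $(pct_faster "$ufs" "$zfs")%"
```

With only two runs per file system, and different media
attachments, a difference of this size should be read as
indicative at most.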


===
Mark Millard
marklmi at yahoo.com