Re: Very slow scp performance comparing to Linux [dd to /dev/null shows substantial FreeBSD vs. Ubuntu differences for bs=1k (or 1K) and bs=512]

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 31 Aug 2023 02:17:58 UTC
On Aug 30, 2023, at 18:45, Mark Saad <nonesuch@longcount.org> wrote:

> All
>  Why not take scp out of the picture and try iperf? Why , we could be looking at rss by default in Linux .

The explorations with ???@localhost:/dev/null and with dd suggest
non-networking issues are a significant contributor to the data
rate differences being observed on FreeBSD vs. Ubuntu 22.04.3
--including when no networking is involved at all.

I also did experiments with cipher selections that I've not
reported.

As I've no clue why the original note was specifically
about scp performance, I've just been trying to gather data that
might be of some use, even for other contexts than just scp
used over an actual network.

Having iperf figures for just the network contribution
would be useful too --if the network had appropriate
characteristics for comparison to the original context. The
network that I'm using is limited to 1 Gbit/s and may not be
a good match for comparison to the original context. I've
just not gone that direction so far.

> ---
> Mark Saad | nonesuch@longcount.org
> 
>> On Aug 30, 2023, at 8:10 PM, Mark Millard <marklmi@yahoo.com> wrote:
>> 
>> On Aug 30, 2023, at 01:49, Mark Millard <marklmi@yahoo.com> wrote:
>> 
>>>> On Aug 30, 2023, at 01:22, Mark Millard <marklmi@yahoo.com> wrote:
>>>> 
>>>>> On Aug 30, 2023, at 01:17, Mark Millard <marklmi@yahoo.com> wrote:
>>>> 
>>>>> On Aug 29, 2023, at 12:52, Mark Millard <marklmi@yahoo.com> wrote:
>>>>> 
>>>>>> Wei Hu <weh_at_microsoft.com> wrote on
>>>>>> Date: Tue, 29 Aug 2023 12:55:35 UTC :
>>>>>> 
>>>>>>> Thanks for the update. Seems the numbers are the same on zfs and ufs. That's 
>>>>>>> good to know. 
>>>>>>> 
>>>>>>> Yes, your numbers on ARM64 are better than mine on Intel. However, my original
>>>>>>> intention was to find out why scp on Linux is performing much better than FreeBSD
>>>>>>> under the same hardware env. 
>>>>>>> 
>>>>>>> Is it possible to try Linux in your ARM64 setting? I am using Ubuntu 22.04 on ext4 
>>>>>>> file system.
>>>>>> 
>>>>>> 
>>>>>> I tried to use the Hyper-V Quick Create on the Windows Dev Kit 2023
>>>>>> to install a Ubuntu 22.04 . (No clue if ext4 would result.) But the
>>>>>> Hyper-V UEFI reports for the disk created:
>>>>>> 
>>>>>> 1. SCSI Disk 0,0
>>>>>> The boot loader did not load an operating system.
>>>>>> 
>>>>>> (It then reports the network adapter attempt found no
>>>>>> boot image, but that is expected.)
>>>>>> 
>>>>>> That leaves me wondering if Hyper-V Quick Create
>>>>>> established a VM file holding Intel/AMD material
>>>>>> despite the aarch64 context.
>>>>>> 
>>>>>> Establishing Ubuntu more directly is not familiar to me and
>>>>>> will have to be a background activity and, so, likely
>>>>>> will not be timely. If I did any experiments outside
>>>>>> Hyper-V (native booting), they would be with slower
>>>>>> USB3 SSD media than I use for FreeBSD.
>>>>>> 
>>>>>> I did notice that Hyper-V Quick Create did not create
>>>>>> a fixed sized disk but a dynamic sized one. That is
>>>>>> different than what I did for FreeBSD.
>>>>>> 
>>>>>> Also, it was not obvious if you were after aarch64
>>>>>> Hyper-V testing vs. native-boot testing vs. both. So
>>>>>> I may have gone the wrong direction from the start.
>>>>>> It is possible that I'd find establishing a native-boot
>>>>>> easier and then be able to have a VM file created from
>>>>>> the media, more like what I did with FreeBSD.
>>>>>> 
>>>>>> The Ubuntu activity likely would not be analogous to
>>>>>> the FreeBSD builds having -mcpu= optimization used.
>>>>>> 
>>>>>> Back to $work.
>>>>>> 
>>>>> 
>>>>> I found a sequence of UI operations that worked for
>>>>> installing Ubuntu server 22.04.3 into Hyper-V in
>>>>> Windows 11 Pro on the Windows Dev Kit 2023 via
>>>>> use of a downloaded *.iso .
>>>>> 
>>>>> The kernel that results predates 6.0:
>>>>> 
>>>>> $ uname -ap
>>>>> Linux ubwdk23s 5.15.0-82-generic #91-Ubuntu SMP Mon Aug 14 14:19:18 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
>>>>> 
>>>>> Using my usual rule of rebooting before the first scp:
>>>>> 
>>>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:FreeBSD-14-TEST.img
>>>>> . . .
>>>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 431.3MB/s   00:11 
>>>>> 
>>>>> $ rm FreeBSD-14-TEST.img
>>>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:FreeBSD-14-TEST.img
>>>>> . . .
>>>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 482.2MB/s   00:10
>>>>> 
>>>>> Definitely faster than the FreeBSD results that I reported
>>>>> earlier, including faster than the ThreadRipper 1950X with
>>>>> Optane in a PCIe slot (more like 300 MiBytes/sec).
>>>>> 
>>>>> I again used 6 cores, 24576 MiBytes of RAM, a fixed sized virtual hard
>>>>> disk under Hyper-V.
>>>>> 
>>>>> For reference:
>>>>> 
>>>>> $ lsblk -f
>>>>> NAME   FSTYPE   FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
>>>>> loop0  squashfs 4.0                                                    0   100% /snap/core20/1977
>>>>> loop1  squashfs 4.0                                                    0   100% /snap/lxd/24326
>>>>> loop2  squashfs 4.0                                                    0   100% /snap/snapd/19459
>>>>> sda
>>>>> ├─sda1 vfat     FAT32       F7E9-1344                                 1G     1% /boot/efi
>>>>> └─sda2 ext4     1.0         48a0dbe6-5a99-4b6e-92dc-fe6d8efc6ffe   99.3G    14% /
>>>>> 
>>>>> 
>>>>> 
>>>>> An experiment would be to have a small amount of RAM relative
>>>>> to the file size. That would force it to actually write to media
>>>>> for some part of the file copy.
>>>> 
>>>> The wording was poor: "force it" here is just from the
>>>> Ubuntu viewpoint. I make no claim to know if Hyper-V
>>>> is actually writing the material out to media at the
>>>> time vs. later.
>>>> 
>>>>> So using 1024 MiByte of RAM assigned in Hyper-V:
>>>>> 
>>>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:FreeBSD-14-TEST.img
>>>>> . . .
>>>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 407.5MB/s   00:12
>>>>> 
>>>>> $ rm FreeBSD-14-TEST.img
>>>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:FreeBSD-14-TEST.img
>>>>> . . .
>>>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 404.7MB/s   00:12
>>>>> 
>>>>> Still definitely faster than the FreeBSD results that I
>>>>> reported earlier, including faster than the ThreadRipper
>>>>> 1950X with Optane in a PCIe slot (more like 300 MiBytes/sec).
>>> 
>>> One more variation in Ubuntu under Hyper-V, still with 1024 MiBytes
>>> of assigned RAM: use of localhost:/dev/null
>>> 
>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:/dev/null
>>> . . .
>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              
>>> 
>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:/dev/null
>>> . . .
>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 492.9MB/s   00:10
>>> 
>>> 
>>> The matching FreeBSD examples with 24576 MiBytes of RAM assigned (ZFS context):
>>> 
>>> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:/dev/null
>>> . . .
>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              
>>> 
>>> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:/dev/null
>>> . . .
>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img                                                                                              100% 5120MB 198.7MB/s   00:25
>>> 
>>> 
>>> Note: At most one VM running at a time, never both in overlapping times.
>> 
>> Avoiding having a cipher involved and even localhost
>> involved: use dd . . .
>> 
>> 
>> FreeBSD examples for Windows Dev Kit 2023 Hyper-V context
>> (24576 MiBytes of RAM assigned):
>> 
>> # dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=1m status=progress
>> 2512388096 bytes (2512 MB, 2396 MiB) transferred 1.046s, 2402 MB/s
>> 5120+0 records in
>> 5120+0 records out
>> 5368709120 bytes transferred in 1.627071 secs (3299614770 bytes/sec)
>> CA78C-WDK23s-ZFS aarch64  1500000 1500000 # dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=1k status=progress
>> 5233509376 bytes (5234 MB, 4991 MiB) transferred 14.022s, 373 MB/s
>> 5242880+0 records in
>> 5242880+0 records out
>> 5368709120 bytes transferred in 14.365142 secs (373731714 bytes/sec)
>> CA78C-WDK23s-ZFS aarch64  1500000 1500000 # dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=512 status=progress
>> 5285410816 bytes (5285 MB, 5041 MiB) transferred 27.029s, 196 MB/s
>> 10485760+0 records in
>> 10485760+0 records out
>> 5368709120 bytes transferred in 27.432570 secs (195705657 bytes/sec)
>> 
>> 
>> Ubuntu 22.04.3 for Windows Dev Kit 2023 Hyper-V context,
>> only 1024 MiBytes of RAM assigned:
>> 
>> $ dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=1M status=progress
>> 4003463168 bytes (4.0 GB, 3.7 GiB) copied, 2 s, 2.0 GB/s
>> 5120+0 records in
>> 5120+0 records out
>> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 2.56342 s, 2.1 GB/s
>> $ dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=1K status=progress
>> 4793865216 bytes (4.8 GB, 4.5 GiB) copied, 6 s, 799 MB/s
>> 5242880+0 records in
>> 5242880+0 records out
>> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 6.60403 s, 813 MB/s
>> markmi@ubwdk23s:~$ dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=512 status=progress
>> 4800102912 bytes (4.8 GB, 4.5 GiB) copied, 9 s, 533 MB/s
>> 10485760+0 records in
>> 10485760+0 records out
>> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 9.95606 s, 539 MB/s
> 
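
The dd summaries quoted above can be reduced to a time-per-record
figure, which makes the small-block-size difference easier to see.
What follows is only a sketch in Python: the helper names are
hypothetical, the numbers are copied from the dd output above, and
the comparison is rough because the FreeBSD runs had 24576 MiBytes
of RAM assigned while the Ubuntu runs had only 1024 MiBytes.

```python
# Sketch only: helper names are hypothetical; the figures are copied
# from the dd summaries quoted above.

TOTAL_BYTES = 5368709120  # size of the 5120 MiByte test image

# (OS, block size in bytes) -> (elapsed seconds, records read)
RUNS = {
    ("FreeBSD", 1048576): (1.627071, 5120),
    ("FreeBSD", 1024): (14.365142, 5242880),
    ("FreeBSD", 512): (27.432570, 10485760),
    ("Ubuntu", 1048576): (2.56342, 5120),
    ("Ubuntu", 1024): (6.60403, 5242880),
    ("Ubuntu", 512): (9.95606, 10485760),
}


def per_record_us(seconds, records):
    """Average wall-clock time per dd record, in microseconds."""
    return seconds / records * 1e6


def throughput_mb_s(total_bytes, seconds):
    """Throughput in decimal megabytes per second (as dd reports it)."""
    return total_bytes / seconds / 1e6


def summarize(runs=RUNS, total_bytes=TOTAL_BYTES):
    """Return (os, block size, us per record, MB/s) for each run."""
    rows = []
    for (os_name, bs), (secs, recs) in runs.items():
        rows.append((os_name, bs,
                     per_record_us(secs, recs),
                     throughput_mb_s(total_bytes, secs)))
    return rows


if __name__ == "__main__":
    for os_name, bs, us, mbs in summarize():
        print(f"{os_name:8s} bs={bs:<8d} {us:10.3f} us/record {mbs:9.1f} MB/s")
```

With these figures, bs=1k and bs=512 come out at roughly 2.6 to 2.7
microseconds per record on FreeBSD versus roughly 0.9 to 1.3 on
Ubuntu. Halving the block size barely changes the per-record time
on either system, which would be consistent with a per-read cost
dominating at small block sizes, though nothing measured here
confirms where that time goes.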

===
Mark Millard
marklmi at yahoo.com