RE: Very slow scp performance comparing to Linux [dd to /dev/null shows substantial FreeBSD vs. Ubuntu differences for bs=1k (or 1K) and bs=512]

From: Wei Hu <weh_at_microsoft.com>
Date: Thu, 31 Aug 2023 03:59:52 UTC

> -----Original Message-----
> From: Mark Millard <marklmi@yahoo.com>
> Sent: Thursday, August 31, 2023 10:18 AM
> To: Mark Saad <nonesuch@longcount.org>
> Cc: Wei Hu <weh@microsoft.com>; FreeBSD Hackers <freebsd-hackers@freebsd.org>
> Subject: Re: Very slow scp performance comparing to Linux [dd to /dev/null
> shows substantial FreeBSD vs. Ubuntu differences for bs=1k (or 1K) and
> bs=512]
> 
> On Aug 30, 2023, at 18:45, Mark Saad <nonesuch@longcount.org> wrote:
> 
> > All
> >  Why not take scp out of the picture and try iperf? Why? We could be looking
> > at RSS being on by default in Linux.

Actually I did the iperf3 test as well and posted results a couple days ago. 
Pasting here:

FreeBSD iperf3 to localhost, single stream: 30.9 Gb/s 
Linux iperf3 to localhost, single stream: 48.8 Gb/s

Neither run showed any TCP retransmissions.

Both VMs run on Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU). 
The FreeBSD VM is 16 vcpu, with 128 GB memory.
The Linux VM is 4 vcpu, with 32 GB memory.

Wei


> 
> The explorations with ???@localhost:/dev/null and with dd suggest non-
> networking issues are a significant contributor to the data rate differences
> being observed on FreeBSD vs. Ubuntu 22.04.3 --including when no
> networking is involved at all.
> 
> I also did experiments with cipher selections that I've not reported.
> 
> As I've no clue why the original note was specifically about scp
> performance, I've just been trying to gather data that might be of some use, even
> for other contexts than just scp used over an actual network.
> 
> Also having iperf figures for just the network contribution would be useful too
> --if the network had appropriate characteristics for comparison to the original
> context. The network that I'm using is limited to 1 Gbit/s and may not be a
> good match for comparison to the original context. I've just not gone that
> direction so far.
> 
> > ---
> > Mark Saad | nonesuch@longcount.org
> >
> >> On Aug 30, 2023, at 8:10 PM, Mark Millard <marklmi@yahoo.com> wrote:
> >>
> >> On Aug 30, 2023, at 01:49, Mark Millard <marklmi@yahoo.com> wrote:
> >>
> >>>> On Aug 30, 2023, at 01:22, Mark Millard <marklmi@yahoo.com> wrote:
> >>>>
> >>>>> On Aug 30, 2023, at 01:17, Mark Millard <marklmi@yahoo.com> wrote:
> >>>>
> >>>>> On Aug 29, 2023, at 12:52, Mark Millard <marklmi@yahoo.com> wrote:
> >>>>>
> >>>>>> Wei Hu <weh_at_microsoft.com> wrote on
> >>>>>> Date: Tue, 29 Aug 2023 12:55:35 UTC :
> >>>>>>
> >>>>>>> Thanks for the update. Seems the numbers are the same on zfs and
> >>>>>>> ufs. That's good to know.
> >>>>>>>
> >>>>>>> Yes, your numbers on ARM64 are better than mine on Intel.
> >>>>>>> However, my original intention was to find out why scp on Linux
> >>>>>>> is performing much better than FreeBSD under the same hardware env.
> >>>>>>>
> >>>>>>> Is it possible to try Linux in your ARM64 setting? I am using
> >>>>>>> Ubuntu 22.04 on ext4 file system.
> >>>>>>
> >>>>>>
> >>>>>> I tried to use the Hyper-V Quick Create on the Windows Dev Kit
> >>>>>> 2023 to install a Ubuntu 22.04 . (No clue if ext4 would result.)
> >>>>>> But the Hyper-V UEFI reports for the disk created:
> >>>>>>
> >>>>>> 1. SCSI Disk 0,0
> >>>>>> The boot loader did not load an operating system.
> >>>>>>
> >>>>>> (It then reports the network adapter attempt found no boot image,
> >>>>>> but that is expected.)
> >>>>>>
> >>>>>> That leaves me wondering if Hyper-V Quick Create established a VM
> >>>>>> file holding Intel/AMD material despite the aarch64 context.
> >>>>>>
> >>>>>> Installing Ubuntu more directly is unfamiliar to me and will have
> >>>>>> to be a background activity, so it likely will not be timely.
> >>>>>> If I did any experiments outside Hyper-V (native booting), they
> >>>>>> would be with slower USB3 SSD media than I use for FreeBSD.
> >>>>>>
> >>>>>> I did notice that Hyper-V Quick Create did not create a fixed
> >>>>>> sized disk but a dynamic sized one. That is different than what I
> >>>>>> did for FreeBSD.
> >>>>>>
> >>>>>> Also, it was not obvious if you were after aarch64 Hyper-V
> >>>>>> testing vs. native-boot testing vs. both. So I may have gone the
> >>>>>> wrong direction from the start.
> >>>>>> It is possible that I'd find establishing a native-boot easier
> >>>>>> and then be able to have a VM file created from the media, more
> >>>>>> like what I did with FreeBSD.
> >>>>>>
> >>>>>> The Ubuntu activity likely would not be analogous to the FreeBSD
> >>>>>> builds having -mcpu= optimization used.
> >>>>>>
> >>>>>> Back to $work.
> >>>>>>
> >>>>>
> >>>>> I found a sequence of UI operations that worked for installing
> >>>>> Ubuntu server 22.04.3 into Hyper-V in Windows 11 Pro on the
> >>>>> Windows Dev Kit 2023 via use of a downloaded *.iso .
> >>>>>
> >>>>> The kernel that results predates 6.0:
> >>>>>
> >>>>> $ uname -ap
> >>>>> Linux ubwdk23s 5.15.0-82-generic #91-Ubuntu SMP Mon Aug 14
> >>>>> 14:19:18 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
> >>>>>
> >>>>> Using my usual rule of rebooting before the first scp:
> >>>>>
> >>>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:FreeBSD-14-TEST.img
> >>>>> . . .
> >>>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img 100% 5120MB 431.3MB/s   00:11
> >>>>>
> >>>>> $ rm FreeBSD-14-TEST.img
> >>>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:FreeBSD-14-TEST.img
> >>>>> . . .
> >>>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img 100% 5120MB 482.2MB/s   00:10
> >>>>>
> >>>>> Definitely faster than the FreeBSD results that I reported
> >>>>> earlier, including faster than the ThreadRipper 1950X with Optane
> >>>>> in a PCIe slot (more like 300 MiBytes/sec).
> >>>>>
> >>>>> I again used 6 cores, 24576 MiBytes of RAM, a fixed sized virtual
> >>>>> hard disk under Hyper-V.
> >>>>>
> >>>>> For reference:
> >>>>>
> >>>>> $ lsblk -f
> >>>>> NAME   FSTYPE   FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
> >>>>> loop0  squashfs 4.0                                                    0   100% /snap/core20/1977
> >>>>> loop1  squashfs 4.0                                                    0   100% /snap/lxd/24326
> >>>>> loop2  squashfs 4.0                                                    0   100% /snap/snapd/19459
> >>>>> sda
> >>>>> ├─sda1 vfat     FAT32       F7E9-1344                                 1G     1% /boot/efi
> >>>>> └─sda2 ext4     1.0         48a0dbe6-5a99-4b6e-92dc-fe6d8efc6ffe   99.3G    14% /
> >>>>>
> >>>>>
> >>>>>
> >>>>> An experiment would be to have a small amount of RAM relative to the
> >>>>> file size. That would force it to actually write to media for some
> >>>>> part of the file copy.
> >>>>
> >>>> The wording was poor: "force it" here is just from the Ubuntu
> >>>> viewpoint. I make no claim to know if Hyper-V is actually writing
> >>>> the material out to media at the time vs. later.
> >>>>
> >>>>> So using 1024 MiByte of RAM assigned in Hyper-V:
> >>>>>
> >>>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:FreeBSD-14-TEST.img
> >>>>> . . .
> >>>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img 100% 5120MB 407.5MB/s   00:12
> >>>>>
> >>>>> $ rm FreeBSD-14-TEST.img
> >>>>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:FreeBSD-14-TEST.img
> >>>>> . . .
> >>>>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img 100% 5120MB 404.7MB/s   00:12
> >>>>>
> >>>>> Still definitely faster than the FreeBSD results that I reported
> >>>>> earlier, including faster than the ThreadRipper 1950X with Optane
> >>>>> in a PCIe slot (more like 300 MiBytes/sec).
> >>>
> >>> One more variation in ubuntu under Hyper-V, still with 1024 MiBytes
> >>> of assigned RAM: use of localhost:/dev/null
> >>>
> >>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:/dev/null
> >>> . . .
> >>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img
> >>>
> >>> $ scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img markmi@localhost:/dev/null
> >>> . . .
> >>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img 100% 5120MB 492.9MB/s   00:10
> >>>
> >>>
> >>> The matching FreeBSD examples with 24576 MiBytes of RAM assigned (ZFS context):
> >>>
> >>> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:/dev/null
> >>> . . .
> >>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img
> >>>
> >>> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:/dev/null
> >>> . . .
> >>> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img 100% 5120MB 198.7MB/s   00:25
> >>>
> >>>
> >>> Note: At most one VM running at a time, never both in overlapping times.
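
As a quick sanity check on the two localhost:/dev/null figures above, here is a minimal sketch that recomputes the elapsed time from the size and rate that scp printed and gives the Ubuntu-to-FreeBSD ratio. The names (SIZE_MB, expected_seconds, speedup) are made up for illustration, and it assumes scp's "MB" means the same unit in both the size and the rate columns.

```python
SIZE_MB = 5120  # file size as printed by scp in the runs above

# (label, reported rate in MB/s, reported elapsed seconds) from the runs above.
SCP_RUNS = [
    ("Ubuntu  localhost:/dev/null", 492.9, 10),
    ("FreeBSD localhost:/dev/null", 198.7, 25),
]


def expected_seconds(size_mb, rate_mb_s):
    """Elapsed time implied by a size and an average rate."""
    return size_mb / rate_mb_s


def speedup(fast_rate, slow_rate):
    """How many times faster the first rate is than the second."""
    return fast_rate / slow_rate


if __name__ == "__main__":
    for label, rate, reported in SCP_RUNS:
        implied = expected_seconds(SIZE_MB, rate)
        print(f"{label}: implied {implied:.2f} s, scp reported {reported} s")
    print(f"ratio: {speedup(SCP_RUNS[0][1], SCP_RUNS[1][1]):.2f}x")
```

The whole-second part of each implied time agrees with what scp printed, and the ratio comes out near 2.5x. Keep in mind the two VMs had different RAM assignments (1024 vs. 24576 MiBytes), so this is not a like-for-like comparison.
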
> >>
> >> Avoiding having a cipher involved and even localhost
> >> involved: use dd . . .
> >>
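
For anyone wanting to repeat the dd-style comparison with identical code on both systems, here is a minimal sketch of a read-only loop with a fixed block size. It is not what dd does internally: there is no write to /dev/null, and Python's interpreter overhead is added to every call, so absolute numbers will not match dd. Only FreeBSD vs. Linux comparisons made with the same script would be meaningful. The function names timed_read and report are hypothetical.

```python
import os
import time


def timed_read(path, block_size):
    """Read path sequentially with fixed-size os.read() calls, discarding data.

    Returns (bytes_read, records, seconds). os.read() on a raw file descriptor
    is used so that each record is intended to correspond to one read(2) call,
    similar in spirit to dd's bs= operand.
    """
    fd = os.open(path, os.O_RDONLY)
    total = 0
    records = 0
    start = time.perf_counter()
    try:
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:  # empty result means end of file
                break
            total += len(chunk)
            records += 1
    finally:
        os.close(fd)
    return total, records, time.perf_counter() - start


def report(path, block_sizes=(1 << 20, 1024, 512)):
    """Print throughput and average time per record for each block size."""
    for bs in block_sizes:
        total, records, secs = timed_read(path, bs)
        rate = total / secs / 1e6 if secs > 0 else float("inf")
        per_rec = secs / records * 1e6 if records else 0.0
        print(f"bs={bs}: {records} records, {rate:.1f} MB/s, {per_rec:.3f} us/record")
```

Usage would be along the lines of report("FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img"), run after a reboot or after a warm-up pass on both systems so the caching state is comparable.
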
> >>
> >> FreeBSD examples (Windows Dev Kit 2023 Hyper-V context,
> >> 24576 MiBytes of RAM assigned):
> >>
> >> # dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=1m status=progress
> >> 2512388096 bytes (2512 MB, 2396 MiB) transferred 1.046s, 2402 MB/s
> >> 5120+0 records in
> >> 5120+0 records out
> >> 5368709120 bytes transferred in 1.627071 secs (3299614770 bytes/sec)
> >> CA78C-WDK23s-ZFS aarch64  1500000 1500000 # dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=1k status=progress
> >> 5233509376 bytes (5234 MB, 4991 MiB) transferred 14.022s, 373 MB/s
> >> 5242880+0 records in
> >> 5242880+0 records out
> >> 5368709120 bytes transferred in 14.365142 secs (373731714 bytes/sec)
> >> CA78C-WDK23s-ZFS aarch64  1500000 1500000 # dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=512 status=progress
> >> 5285410816 bytes (5285 MB, 5041 MiB) transferred 27.029s, 196 MB/s
> >> 10485760+0 records in
> >> 10485760+0 records out
> >> 5368709120 bytes transferred in 27.432570 secs (195705657 bytes/sec)
> >>
> >>
> >> Ubuntu 22.04.3 for Windows Dev Kit 2023 Hyper-V context, only 1024
> >> MiBytes of RAM assigned:
> >>
> >> $ dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=1M status=progress
> >> 4003463168 bytes (4.0 GB, 3.7 GiB) copied, 2 s, 2.0 GB/s
> >> 5120+0 records in
> >> 5120+0 records out
> >> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 2.56342 s, 2.1 GB/s
> >> $ dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=1K status=progress
> >> 4793865216 bytes (4.8 GB, 4.5 GiB) copied, 6 s, 799 MB/s
> >> 5242880+0 records in
> >> 5242880+0 records out
> >> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 6.60403 s, 813 MB/s
> >> markmi@ubwdk23s:~$ dd if=FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img of=/dev/null bs=512 status=progress
> >> 4800102912 bytes (4.8 GB, 4.5 GiB) copied, 9 s, 533 MB/s
> >> 10485760+0 records in
> >> 10485760+0 records out
> >> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 9.95606 s, 539 MB/s
> >
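
One way to read the dd figures above is as average elapsed time per record (one read plus one write to /dev/null) instead of bytes per second. A minimal sketch of that arithmetic, using only the record counts and final elapsed times quoted above; the names RUNS, per_record_us and summarize are made up for illustration:

```python
# Figures copied from the dd runs quoted above:
# (label, block size in bytes, records, elapsed seconds).
RUNS = [
    ("FreeBSD bs=1m", 1048576, 5120, 1.627071),
    ("FreeBSD bs=1k", 1024, 5242880, 14.365142),
    ("FreeBSD bs=512", 512, 10485760, 27.432570),
    ("Ubuntu  bs=1M", 1048576, 5120, 2.56342),
    ("Ubuntu  bs=1K", 1024, 5242880, 6.60403),
    ("Ubuntu  bs=512", 512, 10485760, 9.95606),
]


def per_record_us(records, seconds):
    """Average elapsed time per dd record, in microseconds."""
    return seconds / records * 1e6


def summarize(runs=RUNS):
    """Return (label, microseconds per record, MB/s) for each run."""
    rows = []
    for label, bs, records, seconds in runs:
        mb_per_s = bs * records / seconds / 1e6
        rows.append((label, per_record_us(records, seconds), mb_per_s))
    return rows


if __name__ == "__main__":
    for label, us, mb_per_s in summarize():
        print(f"{label:15s} {us:10.3f} us/record {mb_per_s:10.1f} MB/s")
```

By this arithmetic the small-block runs come out at roughly 2.6 to 2.7 microseconds per record on FreeBSD against roughly 0.95 to 1.26 on Ubuntu, and the FreeBSD figure barely changes between bs=512 and bs=1k. That would be consistent with a fixed per-call cost dominating at small block sizes, but it is an inference from the numbers, not something measured directly, and the two VMs had different RAM assignments.
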
> 
> ===
> Mark Millard
> marklmi at yahoo.com