RE: Very slow scp performance comparing to Linux
- In reply to: Mark Millard : "Re: Very slow scp performance comparing to Linux"
Date: Tue, 29 Aug 2023 12:55:35 UTC
Hi Mark,

Thanks for the update. Seems the numbers are the same on zfs and ufs. That's good to know. Yes, your numbers on ARM64 are better than mine on Intel. However, my original intention was to find out why scp on Linux is performing much better than FreeBSD under the same hardware env. Is it possible to try Linux in your ARM64 setting? I am using Ubuntu 22.04 on ext4 file system.

Thanks,
Wei

> -----Original Message-----
> From: Mark Millard <marklmi@yahoo.com>
> Sent: Tuesday, August 29, 2023 7:22 PM
> To: Wei Hu <weh@microsoft.com>
> Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
> Subject: Re: Very slow scp performance comparing to Linux
>
> [Adding USB3/U.2 Optane UFS Windows Dev Kit 2023 scp examples, no VM's
> involved.]
>
> On Aug 29, 2023, at 03:27, Mark Millard <marklmi@yahoo.com> wrote:
>
> > Wei Hu <weh_at_microsoft.com> wrote on
> > Date: Tue, 29 Aug 2023 07:07:39 UTC :
> >
> >> Sorry for the top posting. But I don't want to make it look too
> >> messy. Here is the information that I missed in my original email.
> >>
> >> All VMs are running on Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU).
> >>
> >> FreeBSD VMs are 16 vcpu with 128 GB memory, in non-debug build:
> >> 14.0-ALPHA1 FreeBSD 14.0-ALPHA1 amd64 1400094 #7
> >> nodbg-n264692-59e706ffee52-dirty...
> >> /usr/obj/usr/src/main/amd64.amd64/sys/GENERIC-NODEBUG amd64
> >>
> >> Ubuntu VMs are 4 vcpu with 32 GB memory, kernel version:
> >> 6.2.0-1009-azure #9~22.04.3-Ubuntu SMP Tue Aug 1 20:51:07 UTC 2023
> >> x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> I did a couple more tests as suggested by others in this thread. In recap:
> >>
> >> Scp to localhost, FreeBSD (ufs) vs Ubuntu (ext4): 70 MB/s vs 550 MB/s
> >> Scp to localhost, FreeBSD (tmpfs) vs Ubuntu (tmpfs): 630 MB/s vs 660 MB/s
> >>
> >> Iperf3 single stream to localhost: FreeBSD vs Ubuntu: 30.9 Gb/s vs 48.8 Gb/s
> >>
> >> Would these numbers suggest that
> >> 1. ext4 caches a lot more than ufs?
> >> 2. there is a tcp performance gap in the network stack between FreeBSD and Ubuntu?
> >>
> >> Would you also try running scp on ufs on your bare metal arm host? I am
> >> curious to know how different ufs and zfs are.
> >
> >
> > For this round I'm rebooting between the unxz and the 1st scp.
> > So I'll also have zfs results again. I'll also do a 2nd scp (no
> > reboot) to see if it gets notably different results.
> >
> > . . .
> >
> > Well, I just got FreeBSD main [so: 15] running under HyperV on the
> > Windows Dev Kit 2023. So reporting for there first. This was via an
> > ssh session. The context is ZFS. The VM file size is fixed, as is the
> > RAM size.
> > 6 cores (of 8) and 24576 MiBytes assigned (of 32
> > GiBytes) to the one FreeBSD instance. The VM file is on the internal
> > NVMe drive in the Windows 11 Pro file system in the default place.
> >
> > (I was having it copy the hard drive media to the VM file when I started
> > this process. Modern HyperV no longer seems to support direct use of
> > USB3 physical media. I first had to produce a copy of the material on
> > smaller media so that a fixed VM file size from a copy to create the
> > VM file would fit in the NVMe's free space.)
> >
> > # uname -apKU
> > FreeBSD CA78C-WDK23s-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023 root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000
> >
> > (The ZFS content is a copy of the USB3 interfaced ZFS Optane media's
> > content previously reported on.
> > So the installed system was built with -mcpu= based optimization, as
> > noted before.)
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 193.6MB/s   00:26
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 198.0MB/s   00:25
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > For reference:
> >
> > # gpart show -pl
> > =>        40  468862055    da0  GPT  (224G)
> >           40      32728         - free -  (16M)
> >        32768     102400  da0p1  wdk23sCA78Cefi  (50M)
> >       135168  421703680  da0p2  wdk23sCA78Czfs  (201G)
> >    421838848   47022080  da0p3  wdk23sCA78Cswp22  (22G)
> >    468860928       1167         - free -  (584K)
> >
> > # zpool list
> > NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
> > zwdk23s   200G  79.8G   120G        -         -     0%    39%  1.00x    ONLINE  -
> >
> > (UFS would have notably more allocated and less free for the same size
> > partition.)
> >
> >
> >
> > The below is based on the HoneyComb (16 cortex-a72's) since I've
> > got the HyperV context going on the Windows Dev Kit 2023 at the
> > moment.
> >
> >
> > UFS first:
> >
> > # uname -apKU
> > FreeBSD HC-CA72-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023 root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 129.7MB/s   00:39
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 130.9MB/s   00:39
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > Note: This is via a U.2 Optane 960 GB media and an M.2 adapter instead
> > of being via a PCIe Optane 960 GB media in the PCIe slot.
> >
> >
> > ZFS second:
> >
> > # uname -apKU
> > FreeBSD CA72-16Gp-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023 root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 121.1MB/s   00:42
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > (root@localhost) Password for root@CA72-16Gp-ZFS:
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 124.6MB/s   00:41
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > Note: This is via a PCIe Optane 960 GB media in the PCIe slot.
> >
> >
> > UFS was slightly faster than ZFS for the HoneyComb context but there
> > is the M.2 vs. PCIe difference as well.
> >
>
> # uname -apKU
> FreeBSD CA78C-WDK23-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023 root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000
>
> Again, a -mcpu= optimized build context for the FreeBSD in
> operation.
>
> (Still rebooting first. Then . . .)
>
> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 199.3MB/s   00:25
>
> # rm ~/FreeBSD-14-TEST.img
> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 204.9MB/s   00:24
>
>
> So, faster than what you are reporting for the
> Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU)
> context.
>
> The Windows Dev Kit 2023 figures are generally faster than the
> HoneyComb figures.
>
> ===
> Mark Millard
> marklmi at yahoo.com
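
For reference, the FreeBSD vs Ubuntu figures in the recap quoted above can be put on one scale as simple ratios. The following is only a minimal sketch: `ratio` is a hypothetical helper name (not part of scp or iperf3), and the inputs are the numbers as reported in the thread, with no adjustment for the differing VM sizes (16 vcpu vs 4 vcpu).

```shell
#!/bin/sh
# ratio A B: print how many times larger B is than A, to two decimals.
# Hypothetical helper for illustration only.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

# Figures from the recap (FreeBSD first, Ubuntu second).
echo "scp to localhost, ufs vs ext4 (MB/s):   $(ratio 70 550)x"
echo "scp to localhost, tmpfs vs tmpfs (MB/s): $(ratio 630 660)x"
echo "iperf3 single stream (Gb/s):             $(ratio 30.9 48.8)x"
```

On these inputs the on-disk scp gap is roughly 7.9x, while the tmpfs scp gap is about 1.05x and the iperf3 gap about 1.6x.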