From: Paul Procacci <pprocacci@gmail.com>
Date: Mon, 31 Oct 2022 02:02:22 -0400
Subject: Re: NFS in bhyve VM mounted via bridge interface
To: John Doherty <bsdlists@jld3.net>
Cc: FreeBSD virtualization <freebsd-virtualization@freebsd.org>
On Mon, Oct 31, 2022 at 12:00 AM John Doherty <bsdlists@jld3.net> wrote:

> I have a machine running FreeBSD 12.3-RELEASE with a zpool that
> consists of 12 mirrored pairs of 14 TB disks. I'll call this the
> "storage server." On that machine, I can write to ZFS file systems at
> around 950 MB/s and read from them at around 1450 MB/s. I'm happy
> with that.
>
> I have another machine running AlmaLinux 8.6 that mounts file systems
> from the storage server via NFS over a 10 GbE network. On this
> machine, I can write to and read from an NFS file system at around
> 450 MB/s. I wish that this were better, but it's OK.
>
> I created a bhyve VM on the storage server that also runs AlmaLinux
> 8.6. It has a vNIC that is bridged with the 10 GbE physical NIC and a
> tap interface:
>
> [root@ss3] # ifconfig vm-storage
> vm-storage: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         ether 82:d3:46:17:4e:ee
>         id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>         maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
>         root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
>         member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 10 priority 128 path cost 2000000
>         member: ixl0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 5 priority 128 path cost 2000
>         groups: bridge vm-switch viid-ddece@
>         nd6 options=1<PERFORMNUD>
>
> I mount file systems from the storage server on this VM via NFS. I
> can write to those file systems at around 250 MB/s and read from them
> at around 280 MB/s. This surprised me a little: I thought that this
> might perform better than, or at least as well as, the physical
> 10 GbE network, but find that it performs significantly worse.
>
> All my read and write tests here are stupidly simple, using dd to
> read from /dev/zero and write to a file, or to read from a file and
> write to /dev/null.
>
> Is anyone else either surprised or unsurprised by these results?
>
> I have not yet tried passing a physical interface on the storage
> server through to the VM with PCI passthrough, but the machine does
> have another 10 GbE interface I could use for this. This stuff is all
> about 3,200 miles away from me, so I need to get someone to plug a
> cable in for me. I'll be interested to see how that works out,
> though.
>
> Any comments much appreciated. Thanks.
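The dd tests described in the quoted message are presumably of this
general form; a minimal sketch, assuming an NFS mount at /mnt/nfs and
an arbitrary 10 GB test size (both placeholders, not taken from the
thread):

  # Write test: stream zeroes into a file on the NFS mount. Note that
  # zeroes compress extremely well, so ZFS compression on the server
  # can inflate apparent write speed.
  dd if=/dev/zero of=/mnt/nfs/testfile bs=1M count=10240

  # Read test: stream the same file back into /dev/null.
  dd if=/mnt/nfs/testfile of=/dev/null bs=1M

For the PCI passthrough idea, the host side would look roughly like
this; the bus/slot/function 2/0/0 and the slot number 7 are
placeholders (the real values come from pciconf -lv):

  # /boot/loader.conf on the host: reserve the spare 10 GbE NIC for
  # the ppt driver at boot.
  pptdevs="2/0/0"

  # bhyve command line: hand the reserved device to the guest.
  bhyve ... -s 7,passthru,2/0/0 ...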
I was getting geared up to help you with this and then this happened:

Host:
# dd if=17-04-27.mp4 of=/dev/null bs=4096
216616+1 records in
216616+1 records out
887263074 bytes transferred in 76.830892 secs (11548259 bytes/sec)

VM:
dd if=17-04-27.mp4 of=/dev/null bs=4096
216616+1 records in
216616+1 records out
887263074 bytes transferred in 7.430017 secs (119416016 bytes/sec)

I'm totally flabbergasted. These results are consistent and not at all
what I expected to see. I even ran the tests on the VM first and the
host second. Call me confused.

Anyway, that's a problem for me to figure out.

Back to your problem: I had something typed out about checking that RX
and TX checksum offload (rxcsum/txcsum) is turned off on the
interfaces, or at least seeing whether that makes a difference; trying
a disk type of nvme; and trying ng_bridge with netgraph interfaces.
But now I'm concluding my house is made of glass -- Hah! -- so until I
get my house in order I'll stop at the rough sketches below rather
than full details.

Sorry and thanks!
~Paul
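In outline, those first two suggestions would look something like
this. The host NIC name (ixl0) comes from the ifconfig output above;
the guest interface name and the vm-bhyve usage are assumptions:

  # Host side: turn off RX/TX checksum offload on the bridged
  # physical NIC, then rerun the dd tests.
  ifconfig ixl0 -rxcsum -txcsum

  # Guest side (AlmaLinux): the equivalent via ethtool; eth0 is a
  # placeholder for the VM's interface name.
  ethtool -K eth0 rx off tx off

  # Disk type of nvme: with vm-bhyve this is a one-line change in the
  # VM's config file ("myvm" and the zvol path are placeholders):
  #   disk0_type="nvme"
  # With raw bhyve, the equivalent is an emulation slot such as:
  #   -s 4,nvme,/dev/zvol/tank/myvm-disk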