From nobody Mon Jan 29 21:13:25 2024 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4TP1Hn1PKqz59QnT for ; Mon, 29 Jan 2024 21:13:41 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-ed1-x52d.google.com (mail-ed1-x52d.google.com [IPv6:2a00:1450:4864:20::52d]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4TP1Hl42ZKz4RSk for ; Mon, 29 Jan 2024 21:13:39 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Authentication-Results: mx1.freebsd.org; dkim=pass header.d=bsdimp-com.20230601.gappssmtp.com header.s=20230601 header.b=y4sIgiwP; dmarc=none; spf=none (mx1.freebsd.org: domain of wlosh@bsdimp.com has no SPF policy when checking 2a00:1450:4864:20::52d) smtp.mailfrom=wlosh@bsdimp.com Received: by mail-ed1-x52d.google.com with SMTP id 4fb4d7f45d1cf-55ef4a66008so1966504a12.3 for ; Mon, 29 Jan 2024 13:13:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20230601.gappssmtp.com; s=20230601; t=1706562817; x=1707167617; darn=freebsd.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=6towBHjMHwlxnXyJEGj9jvGcz46FlxE+2X+G+9FKE9w=; b=y4sIgiwPCfrFQeHJIf3gF6q/thgExT4Z4bKWZv5vgwWwUyIxO54Hj0lT9GcTJkiIUc /2G4XpR4EjjbCEvr3xNr6NYBODUFDA6IRw7YxKDppTw6WvqfkbrcHFZZqWe8pkAfTcLJ CACYOWKY27RaKaQwjOgEDf1J4HARv0eSKao1NKQ6YViPhDDgEuruAL8bFUNTy0gwxL+e i4N8iXZ7J0yTa/Vu1FFopDHoLl+m3KSR4nGjdoMPtffqrJWNYRos9z/opz4cBkOtmhEG oSbbmoZm7HBsmgEc77Ilf5QO96AYUAeFaKojvrJJXP0RICOFW49cJBA5zTDaTl3mCIba kRxw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706562817; x=1707167617; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=6towBHjMHwlxnXyJEGj9jvGcz46FlxE+2X+G+9FKE9w=; b=k+SnVbZiKRi0nA2OC2Dx4i18RFgXeAYqO+5NgXj8OnZExhXGm8tmFmXOP4oz+BSYwq sPf5USdlbpSxcb8/VbjBbuXONj5tQzfLhebebhWfSs/0rZxA6zDil7AJtjcpBqDDtXjb 0nz3xP2jCVQLYxzv6nHFzJSWqVnFAC/Pn7i1mtsR6HO1DIWNTrnADd7p+dbcIwNx/NG/ RVV6SZ4T31QFETT/pUlPDokdjkLldoj9fO8a7XCajWMThsbPUvpCdDHatqeJm4VHtRKv Np1dVcLiRWgFEuZeT6izJnKjKOX5IUlx8foVJ1SSwBnNBTSCZe47lpoc1NoCPpeUazSJ bJLA== X-Gm-Message-State: AOJu0YwlRDngM3AYOF26EnJ9Pt18+4NCixs94zazRCrYye0KNL9Asznv LASUDRUwbfq5hNR3cAGeq01tnoBI0nPSjCwiNvZ5hT9Ic9DdJsQh/exbS8UwwvcDG6dSrpUI/Ac 1fCBqVQ6d1iZhLNjEqMNwQEHa95jCfKhKEGxrfw== X-Google-Smtp-Source: AGHT+IGL2RyyRTMxuXvnqVXXND3ULacP7F5ch9GNUNl8RN/QLxiT3s2uDlGoZ4dr7H36K9iwxoJ/oxi/CWnhwWk3Naw= X-Received: by 2002:a05:6402:b66:b0:55f:354f:fb7c with SMTP id cb6-20020a0564020b6600b0055f354ffb7cmr280356edb.33.1706562816682; Mon, 29 Jan 2024 13:13:36 -0800 (PST) List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 References: <202401290514.40T5Eb1i061789@gitrepo.freebsd.org> In-Reply-To: From: Warner Losh Date: Mon, 29 Jan 2024 14:13:25 -0700 Message-ID: Subject: Re: git: 3be59adbb5a2 - main - vtnet: Adjust for ethernet alignment. To: "Bjoern A. Zeeb" Cc: Warner Losh , src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Content-Type: multipart/alternative; boundary="00000000000079bbe206101c1e42" X-Spamd-Bar: -- X-Spamd-Result: default: False [-3.00 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FORGED_SENDER(0.30)[imp@bsdimp.com,wlosh@bsdimp.com]; R_DKIM_ALLOW(-0.20)[bsdimp-com.20230601.gappssmtp.com:s=20230601]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; RCVD_COUNT_ONE(0.00)[1]; MIME_TRACE(0.00)[0:+,1:+,2:~]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; MISSING_XM_UA(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[2a00:1450:4864:20::52d:from]; DMARC_NA(0.00)[bsdimp.com]; MLMMJ_DEST(0.00)[dev-commits-src-main@freebsd.org]; FROM_NEQ_ENVFROM(0.00)[imp@bsdimp.com,wlosh@bsdimp.com]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_FIVE(0.00)[5]; R_SPF_NA(0.00)[no SPF record]; PREVIOUSLY_DELIVERED(0.00)[dev-commits-src-main@freebsd.org]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_TLS_LAST(0.00)[]; DKIM_TRACE(0.00)[bsdimp-com.20230601.gappssmtp.com:+] X-Rspamd-Queue-Id: 4TP1Hl42ZKz4RSk --00000000000079bbe206101c1e42 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sorry for (a) the top post and (b) replying to myself... https://reviews.freebsd.org/D43656 is what I think you're worried about. Am I right, or is there some place else that has you uneasy? Warner P.S. I'm also thinking about following that up with https://reviews.freebsd.org/D43654 since that style seems more in fashion these days. On Mon, Jan 29, 2024 at 10:14=E2=80=AFAM Warner Losh wrote= : > > > On Mon, Jan 29, 2024 at 8:26=E2=80=AFAM Bjoern A. Zeeb < > bzeeb-lists@lists.zabbadoz.net> wrote: > >> On Mon, 29 Jan 2024, Warner Losh wrote: >> >> > The branch main has been updated by imp: >> > >> > URL: >> https://cgit.FreeBSD.org/src/commit/?id=3D3be59adbb5a2ae7600d46432d3bc82= 286e507e95 >> > >> > commit 3be59adbb5a2ae7600d46432d3bc82286e507e95 >> > Author: Warner Losh >> > AuthorDate: 2024-01-29 05:08:55 +0000 >> > Commit: Warner Losh >> > CommitDate: 2024-01-29 05:08:55 +0000 >> > >> > vtnet: Adjust for ethernet alignment. >> > >> > If the header that we add to the packet's size is 0 % 4 and we're >> > strictly aligning, then we need to adjust where we store the header >> so >> > the packet that follows will have it's struct ip header properly >> > aligned. We do this on allocation (and when we check the length of >> the >> > mbufs in the lro_nomrg case). We can't just adjust the clustersz in >> the >> > softc, because it's also used to allocate the mbufs and it needs to >> be >> > the proper size for that. Since we otherwise use the size of the mb= uf >> > (or sometimes the smaller size of the received packet) to compute h= ow >> > much we can buffer, this ensures no overflows. The 2 byte adjustmen= t >> > also does not affect how many packets we can receive in the lro_nom= rg >> > case. >> >> >> Doesn't this still include at least these two un-asserted/un-documented >> asumptions: >> >> (a) mbuf space is large enough to hold 2 extra bytes? Is this always >> the case? >> > > I was sure I puzzled through all the cases correctly. We adjust the leng= th > of the available buffer by 2 and offset everything by 2. this work becaus= e > all the vtnet header types only have 1 or 2 byte data fields. It keeps us > from > writing too much into the buffer. > > However, in vtnet_rx_cluster_size, we don't adjust the frame size before > allocating. So if the mtu + vlan_header. So if the mtu + 12 + 18 is 2047 > or 2048 > or mtu =3D 2017 or 2018 we'll get it wrong (we don't adjust in the case w= here > we use vtnet_rx_header which is 14 bytes). But I think in that case, we'l= l > "drop > the last two bytes off the end" get it wrong (since we adjust the total > length > of the mbuf space) rather than "overflow two bytes" get it wrong. For tha= t > case, we'd need to add two as I indicated in the comments below. > > static int > vtnet_rx_cluster_size(struct vtnet_softc *sc, int mtu) > { > int framesz; > > if (sc->vtnet_flags & VTNET_FLAG_MRG_RXBUFS) > return (MJUMPAGESIZE); > else if (sc->vtnet_flags & VTNET_FLAG_LRO_NOMRG) > return (MCLBYTES); > > /* > * Try to scale the receive mbuf cluster size from the MTU. We > * could also use the VQ size to influence the selected size, > * but that would only matter for very small queues. > */ > if (vtnet_modern(sc)) { > MPASS(sc->vtnet_hdr_size =3D=3D sizeof(struct > virtio_net_hdr_v1)); > framesz =3D sizeof(struct virtio_net_hdr_v1); > } else > framesz =3D sizeof(struct vtnet_rx_header); > framesz +=3D sizeof(struct ether_vlan_header) + mtu; > // XXX if framesz % 4 =3D=3D 2 and we're strict alignment we need to add = 2 > // XXX or equivalently, if vnet_hdr_size % 4 =3D=3D 0 and ... > if (framesz <=3D MCLBYTES) > return (MCLBYTES); > else if (framesz <=3D MJUMPAGESIZE) > return (MJUMPAGESIZE); > else if (framesz <=3D MJUM9BYTES) > return (MJUM9BYTES); > > /* Sane default; avoid 16KB clusters. */ > return (MCLBYTES); > } > > Do you agree? Is this what you are worried about? It's the only hole I > could find > this morning (after going over this a dozen other times trying to get it > right for > the review, and bryanv was happy neither noticed). It also explains why m= y > tests work: I didn't try to have a weird mtu of 2018 bytes. > > >> (b) the struct sizes assigned to vtnet_hdr_size are not odd numbers of >> bytes? Could add comments or CTASSERTs? >> > > True, I'll ctassert the sizes and say we rely on things being even sized > in if_vnetvar.h. > > Warner > > >> > PR: 271288 >> > Sponsored by: Netflix >> > Reviewed by: bryanv >> > Differential Revision: https://reviews.freebsd.org/D43224 >> >> -- >> Bjoern A. Zeeb r15:7 >> > --00000000000079bbe206101c1e42 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Sorry=C2=A0for (a) the top post and (b) replying to myself= ...


is what I th= ink you're worried about. Am I right, or is there some place else that = has you uneasy?

Warner

P.= S. I'm also thinking about following that up with=C2=A0https://reviews.freebsd.org/D43654 since= that style seems more in fashion these days.

On Mon, Jan 29, 2024 at = 10:14=E2=80=AFAM Warner Losh <imp@bsdi= mp.com> wrote:


On Mon, Jan 29, 2024 at 8:26=E2= =80=AFAM Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net> wrote:
On Mon, 29 Jan 2024, War= ner Losh wrote:

> The branch main has been updated by imp:
>
> URL: https://= cgit.FreeBSD.org/src/commit/?id=3D3be59adbb5a2ae7600d46432d3bc82286e507e95<= /a>
>
> commit 3be59adbb5a2ae7600d46432d3bc82286e507e95
> Author:=C2=A0 =C2=A0 =C2=A0Warner Losh <imp@FreeBSD.org>
> AuthorDate: 2024-01-29 05:08:55 +0000
> Commit:=C2=A0 =C2=A0 =C2=A0Warner Losh <imp@FreeBSD.org>
> CommitDate: 2024-01-29 05:08:55 +0000
>
>=C2=A0 =C2=A0 vtnet: Adjust for ethernet alignment.
>
>=C2=A0 =C2=A0 If the header that we add to the packet's size is 0 %= 4 and we're
>=C2=A0 =C2=A0 strictly aligning, then we need to adjust where we store = the header so
>=C2=A0 =C2=A0 the packet that follows will have it's struct ip head= er properly
>=C2=A0 =C2=A0 aligned.=C2=A0 We do this on allocation (and when we chec= k the length of the
>=C2=A0 =C2=A0 mbufs in the lro_nomrg case). We can't just adjust th= e clustersz in the
>=C2=A0 =C2=A0 softc, because it's also used to allocate the mbufs a= nd it needs to be
>=C2=A0 =C2=A0 the proper size for that. Since we otherwise use the size= of the mbuf
>=C2=A0 =C2=A0 (or sometimes the smaller size of the received packet) to= compute how
>=C2=A0 =C2=A0 much we can buffer, this ensures no overflows. The 2 byte= adjustment
>=C2=A0 =C2=A0 also does not affect how many packets we can receive in t= he lro_nomrg
>=C2=A0 =C2=A0 case.


Doesn't this still include at least these two un-asserted/un-documented= asumptions:

(a) mbuf space is large enough to hold 2 extra bytes?=C2=A0 Is this always<= br> =C2=A0 =C2=A0 =C2=A0the case?

I was sur= e I puzzled through all the cases correctly.=C2=A0 We adjust the length
of the available buffer by 2 and offset everything by 2. this work b= ecause
all the vtnet header types only have 1 or 2 byte data fiel= ds. It keeps us from
writing too much into the buffer.
=
However, in vtnet_rx_cluster_size, we don't adjust the f= rame size before
allocating. So if the mtu=C2=A0+ vlan_header. So= if the mtu + 12 + 18 is 2047 or 2048
or mtu =3D 2017 or 2018 we&= #39;ll get it wrong (we don't adjust in the case where
we use= vtnet_rx_header which is 14 bytes). But I think in that case, we'll &q= uot;drop
the last two bytes off the end" get it wrong (since= we adjust the total length
of the mbuf space) rather than "= overflow two bytes" get it wrong. For that
case, we'd ne= ed to add two as I indicated in the comments below.

static int
vtnet_rx_cluster_size(struct vtnet_softc *sc, in= t mtu)
{
=C2=A0 =C2=A0 =C2=A0 =C2=A0 int framesz;

=C2=A0 =C2= =A0 =C2=A0 =C2=A0 if (sc->vtnet_flags & VTNET_FLAG_MRG_RXBUFS)
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return (MJUMPAGESIZ= E);
=C2=A0 =C2=A0 =C2=A0 =C2=A0 else if (sc->vtnet_flags & VTNET_= FLAG_LRO_NOMRG)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = return (MCLBYTES);

=C2=A0 =C2=A0 =C2=A0 =C2=A0 /*
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0* Try to scale the receive mbuf cluster size from the M= TU. We
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* could also use the VQ size to= influence the selected size,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* but th= at would only matter for very small queues.
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0*/
=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (vtnet_modern(sc)) {
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 MPASS(sc->vtnet_hdr_siz= e =3D=3D sizeof(struct virtio_net_hdr_v1));
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 framesz =3D sizeof(struct virtio_net_hdr_v1);=C2=A0 =C2=A0 =C2=A0 =C2=A0 } else
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 framesz =3D sizeof(struct vtnet_rx_header);=C2=A0
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 framesz +=3D sizeof(struct ether_vlan_header) += mtu;
// XXX if framesz % 4 =3D=3D 2 and we're strict alignment we n= eed to add 2
=C2=A0
(b) the struct sizes assigned to vtnet_hdr_size are not odd numbers of
=C2=A0 =C2=A0 =C2=A0bytes?=C2=A0 Could add comments or CTASSERTs?

True, I'll ctassert the sizes and say we rel= y on things being even sized
in if_vnetvar.h.
--00000000000079bbe206101c1e42--