From nobody Mon Jan 29 17:14:41 2024 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4TNw0J1KHpz593gn for ; Mon, 29 Jan 2024 17:14:56 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-ed1-x534.google.com (mail-ed1-x534.google.com [IPv6:2a00:1450:4864:20::534]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4TNw0G5h1yz4tjZ for ; Mon, 29 Jan 2024 17:14:54 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-ed1-x534.google.com with SMTP id 4fb4d7f45d1cf-55f0367b15fso1534351a12.0 for ; Mon, 29 Jan 2024 09:14:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20230601.gappssmtp.com; s=20230601; t=1706548492; x=1707153292; darn=freebsd.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=sjl8JQg+UhxhhYlZFwexu+otz5WIqJ/HyVJCJhRBpko=; b=PtKrWLLacr3m1iZcRb4kZUNXzHXV05Tb+DjsvVPomyHBVEIlBdyDbVQcphND0Q7rBn oSFlURDDN99sNfQOX59wFHX8GsT+TCumiIu53VUJPvdxdtKk/rWkiTijqfz1yvHf9ZEZ LBdD/ci1jJ1BACP4B2NSSOdnVThpv0cpC4P4Fv14xz2RcUqHvmXOQlELPNeiAXQxguML 25E3NQoLXWDj78Fji7W8Euf4n9OdFS0uy17BjS2C8VPDVzauQ9I0GVAgFtI1aXjS9PA6 Qg9HJDLdwUSm0AaQUO2BaYY8zElfchymRMA8CDAq4Rwko/7y0IaVAKh4iO8OU5829T8q C/ww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706548492; x=1707153292; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=sjl8JQg+UhxhhYlZFwexu+otz5WIqJ/HyVJCJhRBpko=; b=prfEKS0GBTvQqG6mYUJPgwvR+u52WcTNtf6oPNBDNLSAgy/CkpsUn5ob4itu6cwoQr wohRspQe8knxeSrGiww4m/CeYLArlZk0GMZvseilxCfRBawY0LBXbM3Tuy9HXxfvSX3Y pufQDEILZRouAapYd6qjA7J5P9LmgqXxcNSDmhcdXFHqY1TikWBXncPWfpUgk+iooEPU UojOKand0u4jmaij/W/Dy4VmEYvXVdZyS6MZx+KmPNeWF5gXbk16O2mGTXv07YbSDSD6 /764KK1OEwPT0hVCLWnTFwx4dIxdWx1ABFPuR/Gc1vawGy/x1SmMlRnG1ZSNE5yIouhq I1UQ== X-Gm-Message-State: AOJu0YwUUXIhVjVNK52vT12aggg5qDv8ZY49+SGmd3pmNNnBNa8qCchp H8WGOBGpI4OgVmGpA7i0zKWo2sMTWXX8/JPRWQUGuwTkB8ofpQydBH1rSNJoSzAEUiAKJ8xxF+n oZRR6RU2hPCcJi1ef9IPGPQSc+HrCkn/m4/mBiVxKPDJaRupG6wUBHA== X-Google-Smtp-Source: AGHT+IHP1Ca7XDB6Y3BoKU7TP3IlXuypaxg/CxJCqujD9RW/hBX8xqgyJHEf5Bk3r0MY1Hy1nGq8HFinxXA+fZMjFbw= X-Received: by 2002:aa7:c883:0:b0:55f:6b8:b2fb with SMTP id p3-20020aa7c883000000b0055f06b8b2fbmr1813718eds.1.1706548492372; Mon, 29 Jan 2024 09:14:52 -0800 (PST) List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 References: <202401290514.40T5Eb1i061789@gitrepo.freebsd.org> In-Reply-To: From: Warner Losh Date: Mon, 29 Jan 2024 10:14:41 -0700 Message-ID: Subject: Re: git: 3be59adbb5a2 - main - vtnet: Adjust for ethernet alignment. To: "Bjoern A. Zeeb" Cc: Warner Losh , src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Content-Type: multipart/alternative; boundary="000000000000ae2099061018c823" X-Rspamd-Queue-Id: 4TNw0G5h1yz4tjZ X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] --000000000000ae2099061018c823 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, Jan 29, 2024 at 8:26=E2=80=AFAM Bjoern A. Zeeb < bzeeb-lists@lists.zabbadoz.net> wrote: > On Mon, 29 Jan 2024, Warner Losh wrote: > > > The branch main has been updated by imp: > > > > URL: > https://cgit.FreeBSD.org/src/commit/?id=3D3be59adbb5a2ae7600d46432d3bc822= 86e507e95 > > > > commit 3be59adbb5a2ae7600d46432d3bc82286e507e95 > > Author: Warner Losh > > AuthorDate: 2024-01-29 05:08:55 +0000 > > Commit: Warner Losh > > CommitDate: 2024-01-29 05:08:55 +0000 > > > > vtnet: Adjust for ethernet alignment. > > > > If the header that we add to the packet's size is 0 % 4 and we're > > strictly aligning, then we need to adjust where we store the header = so > > the packet that follows will have it's struct ip header properly > > aligned. We do this on allocation (and when we check the length of > the > > mbufs in the lro_nomrg case). We can't just adjust the clustersz in > the > > softc, because it's also used to allocate the mbufs and it needs to = be > > the proper size for that. Since we otherwise use the size of the mbu= f > > (or sometimes the smaller size of the received packet) to compute ho= w > > much we can buffer, this ensures no overflows. The 2 byte adjustment > > also does not affect how many packets we can receive in the lro_nomr= g > > case. > > > Doesn't this still include at least these two un-asserted/un-documented > asumptions: > > (a) mbuf space is large enough to hold 2 extra bytes? Is this always > the case? > I was sure I puzzled through all the cases correctly. We adjust the length of the available buffer by 2 and offset everything by 2. this work because all the vtnet header types only have 1 or 2 byte data fields. It keeps us from writing too much into the buffer. However, in vtnet_rx_cluster_size, we don't adjust the frame size before allocating. So if the mtu + vlan_header. So if the mtu + 12 + 18 is 2047 or 2048 or mtu =3D 2017 or 2018 we'll get it wrong (we don't adjust in the case whe= re we use vtnet_rx_header which is 14 bytes). But I think in that case, we'll "drop the last two bytes off the end" get it wrong (since we adjust the total length of the mbuf space) rather than "overflow two bytes" get it wrong. For that case, we'd need to add two as I indicated in the comments below. static int vtnet_rx_cluster_size(struct vtnet_softc *sc, int mtu) { int framesz; if (sc->vtnet_flags & VTNET_FLAG_MRG_RXBUFS) return (MJUMPAGESIZE); else if (sc->vtnet_flags & VTNET_FLAG_LRO_NOMRG) return (MCLBYTES); /* * Try to scale the receive mbuf cluster size from the MTU. We * could also use the VQ size to influence the selected size, * but that would only matter for very small queues. */ if (vtnet_modern(sc)) { MPASS(sc->vtnet_hdr_size =3D=3D sizeof(struct virtio_net_hdr_v1)); framesz =3D sizeof(struct virtio_net_hdr_v1); } else framesz =3D sizeof(struct vtnet_rx_header); framesz +=3D sizeof(struct ether_vlan_header) + mtu; // XXX if framesz % 4 =3D=3D 2 and we're strict alignment we need to add 2 // XXX or equivalently, if vnet_hdr_size % 4 =3D=3D 0 and ... if (framesz <=3D MCLBYTES) return (MCLBYTES); else if (framesz <=3D MJUMPAGESIZE) return (MJUMPAGESIZE); else if (framesz <=3D MJUM9BYTES) return (MJUM9BYTES); /* Sane default; avoid 16KB clusters. */ return (MCLBYTES); } Do you agree? Is this what you are worried about? It's the only hole I could find this morning (after going over this a dozen other times trying to get it right for the review, and bryanv was happy neither noticed). It also explains why my tests work: I didn't try to have a weird mtu of 2018 bytes. > (b) the struct sizes assigned to vtnet_hdr_size are not odd numbers of > bytes? Could add comments or CTASSERTs? > True, I'll ctassert the sizes and say we rely on things being even sized in if_vnetvar.h. Warner > > PR: 271288 > > Sponsored by: Netflix > > Reviewed by: bryanv > > Differential Revision: https://reviews.freebsd.org/D43224 > > -- > Bjoern A. Zeeb r15:7 > --000000000000ae2099061018c823 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Mon, Jan 29, 2024 at 8:26=E2=80=AF= AM Bjoern A. Zeeb <bze= eb-lists@lists.zabbadoz.net> wrote:
On Mon, 29 Jan 2024, Warner Losh wrote:

> The branch main has been updated by imp:
>
> URL: https://= cgit.FreeBSD.org/src/commit/?id=3D3be59adbb5a2ae7600d46432d3bc82286e507e95<= /a>
>
> commit 3be59adbb5a2ae7600d46432d3bc82286e507e95
> Author:=C2=A0 =C2=A0 =C2=A0Warner Losh <imp@FreeBSD.org>
> AuthorDate: 2024-01-29 05:08:55 +0000
> Commit:=C2=A0 =C2=A0 =C2=A0Warner Losh <imp@FreeBSD.org>
> CommitDate: 2024-01-29 05:08:55 +0000
>
>=C2=A0 =C2=A0 vtnet: Adjust for ethernet alignment.
>
>=C2=A0 =C2=A0 If the header that we add to the packet's size is 0 %= 4 and we're
>=C2=A0 =C2=A0 strictly aligning, then we need to adjust where we store = the header so
>=C2=A0 =C2=A0 the packet that follows will have it's struct ip head= er properly
>=C2=A0 =C2=A0 aligned.=C2=A0 We do this on allocation (and when we chec= k the length of the
>=C2=A0 =C2=A0 mbufs in the lro_nomrg case). We can't just adjust th= e clustersz in the
>=C2=A0 =C2=A0 softc, because it's also used to allocate the mbufs a= nd it needs to be
>=C2=A0 =C2=A0 the proper size for that. Since we otherwise use the size= of the mbuf
>=C2=A0 =C2=A0 (or sometimes the smaller size of the received packet) to= compute how
>=C2=A0 =C2=A0 much we can buffer, this ensures no overflows. The 2 byte= adjustment
>=C2=A0 =C2=A0 also does not affect how many packets we can receive in t= he lro_nomrg
>=C2=A0 =C2=A0 case.


Doesn't this still include at least these two un-asserted/un-documented= asumptions:

(a) mbuf space is large enough to hold 2 extra bytes?=C2=A0 Is this always<= br> =C2=A0 =C2=A0 =C2=A0the case?

I was sur= e I puzzled through all the cases correctly.=C2=A0 We adjust the length
of the available buffer by 2 and offset everything by 2. this work b= ecause
all the vtnet header types only have 1 or 2 byte data fiel= ds. It keeps us from
writing too much into the buffer.
=
However, in vtnet_rx_cluster_size, we don't adjust the f= rame size before
allocating. So if the mtu=C2=A0+ vlan_header. So= if the mtu + 12 + 18 is 2047 or 2048
or mtu =3D 2017 or 2018 we&= #39;ll get it wrong (we don't adjust in the case where
we use= vtnet_rx_header which is 14 bytes). But I think in that case, we'll &q= uot;drop
the last two bytes off the end" get it wrong (since= we adjust the total length
of the mbuf space) rather than "= overflow two bytes" get it wrong. For that
case, we'd ne= ed to add two as I indicated in the comments below.

static int
vtnet_rx_cluster_size(struct vtnet_softc *sc, in= t mtu)
{
=C2=A0 =C2=A0 =C2=A0 =C2=A0 int framesz;

=C2=A0 =C2= =A0 =C2=A0 =C2=A0 if (sc->vtnet_flags & VTNET_FLAG_MRG_RXBUFS)
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return (MJUMPAGESIZ= E);
=C2=A0 =C2=A0 =C2=A0 =C2=A0 else if (sc->vtnet_flags & VTNET_= FLAG_LRO_NOMRG)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = return (MCLBYTES);

=C2=A0 =C2=A0 =C2=A0 =C2=A0 /*
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0* Try to scale the receive mbuf cluster size from the M= TU. We
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* could also use the VQ size to= influence the selected size,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* but th= at would only matter for very small queues.
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0*/
=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (vtnet_modern(sc)) {
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 MPASS(sc->vtnet_hdr_siz= e =3D=3D sizeof(struct virtio_net_hdr_v1));
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 framesz =3D sizeof(struct virtio_net_hdr_v1);=C2=A0 =C2=A0 =C2=A0 =C2=A0 } else
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 framesz =3D sizeof(struct vtnet_rx_header);=C2=A0
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 framesz +=3D sizeof(struct ether_vlan_header) += mtu;
// XXX if framesz % 4 =3D=3D 2 and we're strict alignment we n= eed to add 2
=C2=A0
(b) the struct sizes assigned to vtnet_hdr_size are not odd numbers of
=C2=A0 =C2=A0 =C2=A0bytes?=C2=A0 Could add comments or CTASSERTs?

True, I'll ctassert the sizes and say we rel= y on things being even sized
in if_vnetvar.h.
--000000000000ae2099061018c823--