From nobody Thu Dec 21 15:31:36 2023 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4SwvYL3Bcpz54vQL for ; Thu, 21 Dec 2023 15:31:50 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-lf1-x131.google.com (mail-lf1-x131.google.com [IPv6:2a00:1450:4864:20::131]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4SwvYK6bFGz3FFm for ; Thu, 21 Dec 2023 15:31:49 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-lf1-x131.google.com with SMTP id 2adb3069b0e04-50e33fe3856so1098812e87.1 for ; Thu, 21 Dec 2023 07:31:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20230601.gappssmtp.com; s=20230601; t=1703172708; x=1703777508; darn=freebsd.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=lrPfM/b3ARsahhveNgyoFzcS3lDniNCcXxaOT1wFwQM=; b=e9SlRabeisadG/gvhYAOKjZ8g3fCPD1uTfkfvtIqqE+q03BwT7iB0xkj4ZR4U+x9j+ QmtFFO1cA429XD7EHL+WBvPeUWHy3foL5YM4AP5i9tFxmbbNezsUku3naY4BIRKhsZgx SI55oT8iddMWQLH4S2W44VNgHgjXJFC3oReihI6iYH2tP7pMn+OScwklACR4E1HePMKg Be7rwktZiM2IDxPKWulwfQxG2Ni69f4pD/67le2rA1FNvcPLm9z2xkpO9yLw7c0uO9kU /Swzh7tFbyzUx/3ZppX+83VceQECjCUJGVQPR6oz7YCo/QBF5r0gXcLNdK/rO/idiILR POkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1703172708; x=1703777508; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=lrPfM/b3ARsahhveNgyoFzcS3lDniNCcXxaOT1wFwQM=; b=bL4seo2XCA8dnsZKKN1uKsOjTl2AAvCF5KaaRTllms56z49Om8D47dDPMROyCuSzvv N1X9TtmntucXXo1buGA0evb5rYbJxPhtdArBhZ1CoeH+tQQ/AQ2+tN/kIgCjr+rsiLPc fDjHDS3N5zIN343expEcZD/3GENJcF5hxuSvOT7Sabia0OqpI4UbcGAJUz81n5yyY07h VqjcoypwMUL+gldkRyONlJh9E+xrBcwhvh9JpN++5vCwMyU773F6ulS22iq6enOAm4iJ 9rdiRFFT8Xb51IFTPub36tuAI9BgIMjBObmPOJMa5e2thOt/n62oPuo405Mk4oOWgCz+ DVCw== X-Gm-Message-State: AOJu0Yy87nIo6J6edL7j96jwsRAkaEH39uqs0LUgRkeBT+izcFhp8V+K yA4gIQv5Mi4taCx6sYIjKYoQAlAorcVjQxLZjZLFlA== X-Google-Smtp-Source: AGHT+IHQQPFScrDB9ZiylUmdon2XtmQ+sIi1cGYaK2zObYZ+f9IG+Et8P9MLjZ75qEET/uoDkb7UrdJfvfDL6WaI5hU= X-Received: by 2002:ac2:5f5b:0:b0:50e:2ab0:3c45 with SMTP id 27-20020ac25f5b000000b0050e2ab03c45mr2491787lfz.74.1703172707394; Thu, 21 Dec 2023 07:31:47 -0800 (PST) List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 References: <202312210421.3BL4Lb9T036317@gitrepo.freebsd.org> In-Reply-To: From: Warner Losh Date: Thu, 21 Dec 2023 08:31:36 -0700 Message-ID: Subject: Re: git: 9e6d11ce9a51 - main - vtnet: Adjust rx buffer so IP header 32-bit aligned To: Mark Johnston Cc: Warner Losh , src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Content-Type: multipart/alternative; boundary="000000000000374067060d06cc61" X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Spamd-Bar: ---- X-Rspamd-Queue-Id: 4SwvYK6bFGz3FFm --000000000000374067060d06cc61 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Dec 21, 2023 at 8:13=E2=80=AFAM Mark Johnston w= rote: > On Thu, Dec 21, 2023 at 04:21:37AM +0000, Warner Losh wrote: > > The branch main has been updated by imp: > > > > URL: > https://cgit.FreeBSD.org/src/commit/?id=3D9e6d11ce9a51d75ed6a94e180f2fb4e= 9188a2ba4 > > > > commit 9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4 > > Author: Warner Losh > > AuthorDate: 2023-12-20 19:09:09 +0000 > > Commit: Warner Losh > > CommitDate: 2023-12-21 04:16:45 +0000 > > > > vtnet: Adjust rx buffer so IP header 32-bit aligned > > > > Call madj(m, ETHER_ALIGN) to offset rx buffers when allocating them= . > > This improves performance everywhere, and allows armv7 to work at > all. > > > > PR: 271288 (PR had a different fix than I wound > up with) > > MFC After: 3 days > > Sponsored by: Netflix > > Differential Revision: https://reviews.freebsd.org/D43136 > > --- > > sys/dev/virtio/network/if_vtnet.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/sys/dev/virtio/network/if_vtnet.c > b/sys/dev/virtio/network/if_vtnet.c > > index 287af7751066..360176e4f845 100644 > > --- a/sys/dev/virtio/network/if_vtnet.c > > +++ b/sys/dev/virtio/network/if_vtnet.c > > @@ -1532,8 +1532,8 @@ vtnet_rx_alloc_buf(struct vtnet_softc *sc, int > nbufs, struct mbuf **m_tailp) > > m_freem(m_head); > > return (NULL); > > } > > - > > m->m_len =3D size; > > + m_adj(m, ETHER_ALIGN); > > The driver is expecting to get a cluster of size sc->vtnet_rx_clustersz, > but now it's getting one of size sc->vtnet_rx_clustersz - 2. I don't > see how this change can be sufficient on its own: what prevents virtio > from writing sc->vtnet_rx_clustersz bytes and thereby overwriting the > two bytes following the cluster? > The only trouble that I saw was here: /* * Every mbuf should have the expected cluster size since that * is also used to allocate the replacements. */ KASSERT(m->m_len =3D=3D clustersz, ("%s: mbuf size %d not expected cluster size %d", __func__, m->m_len, clustersz)); and even that's fine: We can adjust that assert. The next lines I think make it fine: m->m_len =3D MIN(m->m_len, len); so we only use what we say is there to land bytes into the mbuf. I'm now not sure though what to do about a few lines later: m_new =3D vtnet_rx_alloc_buf(sc, nreplace, &m_tail); if (m_new =3D=3D NULL) { m_prev->m_len =3D clustersz; return (ENOBUFS); } which likely needs to be adjusted (or the old saved size). I likely didn't see asserts in my testing because I didn't have LRO slow path packets. Do you see other places that would need adjusting? All the other places seem to use m_len correctly on the rx path, but I'm not a nic driver guy, and I might be missing something. Warner > > if (m_head !=3D NULL) { > > m_tail->m_next =3D m; > > m_tail =3D m; > --000000000000374067060d06cc61 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Thu, Dec 21, 2023 at 8:13=E2=80=AF= AM Mark Johnston <markj@freebsd.org= > wrote:
= On Thu, Dec 21, 2023 at 04:21:37AM +0000, Warner Losh wrote:
> The branch main has been updated by imp:
>
> URL: https://= cgit.FreeBSD.org/src/commit/?id=3D9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4<= /a>
>
> commit 9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4
> Author:=C2=A0 =C2=A0 =C2=A0Warner Losh <imp@FreeBSD.org>
> AuthorDate: 2023-12-20 19:09:09 +0000
> Commit:=C2=A0 =C2=A0 =C2=A0Warner Losh <imp@FreeBSD.org>
> CommitDate: 2023-12-21 04:16:45 +0000
>
>=C2=A0 =C2=A0 =C2=A0vtnet: Adjust rx buffer so IP header 32-bit aligned=
>=C2=A0 =C2=A0 =C2=A0
>=C2=A0 =C2=A0 =C2=A0Call madj(m, ETHER_ALIGN) to offset rx buffers when= allocating them.
>=C2=A0 =C2=A0 =C2=A0This improves performance everywhere, and allows ar= mv7 to work at all.
>=C2=A0 =C2=A0 =C2=A0
>=C2=A0 =C2=A0 =C2=A0PR:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0271288 (PR had a different fix than I wound up = with)
>=C2=A0 =C2=A0 =C2=A0MFC After:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 3 days
>=C2=A0 =C2=A0 =C2=A0Sponsored by:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0Netflix
>=C2=A0 =C2=A0 =C2=A0Differential Revision:=C2=A0
https://revie= ws.freebsd.org/D43136
> ---
>=C2=A0 sys/dev/virtio/network/if_vtnet.c | 2 +-
>=C2=A0 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sys/dev/virtio/network/if_vtnet.c b/sys/dev/virtio/networ= k/if_vtnet.c
> index 287af7751066..360176e4f845 100644
> --- a/sys/dev/virtio/network/if_vtnet.c
> +++ b/sys/dev/virtio/network/if_vtnet.c
> @@ -1532,8 +1532,8 @@ vtnet_rx_alloc_buf(struct vtnet_softc *sc, int n= bufs, struct mbuf **m_tailp)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0m_freem(m_head);
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0return (NULL);
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
> -
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0m->m_len =3D = size;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0m_adj(m, ETHER_ALIGN)= ;

The driver is expecting to get a cluster of size sc->vtnet_rx_clustersz,=
but now it's getting one of size sc->vtnet_rx_clustersz - 2.=C2=A0 I= don't
see how this change can be sufficient on its own: what prevents virtio
from writing sc->vtnet_rx_clustersz bytes and thereby overwriting the two bytes following the cluster?

The on= ly trouble that I saw was here:

=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /*
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* Every mbuf should have the expected clu= ster size since that
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0* is also used to allocate the replacements.
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0*/
=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 KASSERT(m->m_len =3D=3D clustersz,=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (&qu= ot;%s: mbuf size %d not expected cluster size %d", __func__,
=C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 m->m_len= , clustersz));

and even that's fine: We ca= n adjust that assert. The next lines I think make it fine:
=C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 m->m_len =3D MIN(m->= ;m_len, len);

so we only use what we say is th= ere to land bytes into the mbuf.=C2=A0 I'm now not sure though what to = do about a few lines later:

=C2=A0 =C2=A0 =C2=A0 = =C2=A0m_new =3D vtnet_rx_alloc_buf(sc, nreplace, &m_tail);
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 if (m_new =3D=3D NULL) {
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 m_prev->m_len =3D clustersz;
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return (ENOBUFS);
=C2= =A0 =C2=A0 =C2=A0 =C2=A0 }

which likely needs = to be adjusted (or the old saved size).

I likely d= idn't see asserts in my testing because I didn't have LRO slow path= packets.

Do you see other places that would need = adjusting? All the other places seem to use m_len correctly on the rx path,= but I'm not a nic driver guy, and I might be missing something.
<= div>
Warner

=C2=A0
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (m_head !=3D = NULL) {
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0m_tail->m_next =3D m;
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0m_tail =3D m;
--000000000000374067060d06cc61--