From nobody Wed Apr 26 18:02:19 2023 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Q66CW226yz4757t for ; Wed, 26 Apr 2023 18:02:31 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-ej1-x62d.google.com (mail-ej1-x62d.google.com [IPv6:2a00:1450:4864:20::62d]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Q66CV3KsCz40C9 for ; Wed, 26 Apr 2023 18:02:30 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-ej1-x62d.google.com with SMTP id a640c23a62f3a-94f32588c13so1099887766b.2 for ; Wed, 26 Apr 2023 11:02:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20221208.gappssmtp.com; s=20221208; t=1682532148; x=1685124148; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=qnw8mcac1ScW2hP4IhlHNnpPvocP1d1zCaUmyT2th3w=; b=x1vAOlp/Bso8F98QGb/io2pKyJoLyZRjDWnbvB6F2MlozBeWPK2EhYF2rwHdcLaEc8 z773jBpN7w/Ke6ysIhPs9/yfy3AtlycBA4q06zj6R/k4edXVkNcxg2GRyzU1ZROocNO+ 7nIJtYU/mR1ArnhLZwAITGJLX3CkihbudF5QxKN9jwvTRKSY/05Oelps98vTlJjvPp6a 807m6aDY7a5cp75LqRmwAFqJLF0Ld5zSbrD0DzOFUuGrJPzocNL/4n65ivkKykM8dc5+ wND6dyyboHIJxlY1s2AajosfBWV8ec5Nb4WxaRUNei7bpucmdy7prd2Ncv12u/IsW5mO LrNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682532148; x=1685124148; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=qnw8mcac1ScW2hP4IhlHNnpPvocP1d1zCaUmyT2th3w=; b=QUfB5pFX4wDYeFBuGEKVKO3RBYuvvlrCuPNCO3idwzLw07RPiMig+A00r3PPRi4fQh 1Ow6m+n43hFHiBCwtGrORG+D3BkAZ1pFhX/coTT/peSQFCsje9eeOhd7HuJ6LBRo3ef4 Xz8XW1iOiE1bKTX0SphXiBQuQ+5RqDl5HPV/ODPn1NBVigSgiEXFy5f0Ol9K0A5RtB29 2jCaiQ2BCFgBci9rwJdwsW58Iy7ez6R9V2fMNWMgd+AQKkMfrP5/z4Qe2/h7/bcS88eO pWYzBIm6i/Uq50n8KzPhYDXuEpnZoc2vJTq9aqGWHbLrlGr0YxsRQYrM16taGNTDYy0e 1+1g== X-Gm-Message-State: AAQBX9dajcR1PDp+XbZGHX0ZBKJlKJs7+4UN3UAjvGd4plr9RHyTckSb aXxzfQ9ZQ8SSi8Ho+g3RcdHouMasdGM1yqDRNKztqA== X-Google-Smtp-Source: AKy350a4vbm42piT4IkTYVsKzcbcXm+vienS1xOypC/G4A6qDM4NBcNmAN9mEyqpNDxeTjC5IBGSIh1sDaLJwqiZCjM= X-Received: by 2002:a17:906:b6c4:b0:94a:4fc5:4c2e with SMTP id ec4-20020a170906b6c400b0094a4fc54c2emr17597028ejb.49.1682532148451; Wed, 26 Apr 2023 11:02:28 -0700 (PDT) List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org MIME-Version: 1.0 References: <202304261724.33QHOl6x001199@gitrepo.freebsd.org> In-Reply-To: <202304261724.33QHOl6x001199@gitrepo.freebsd.org> From: Warner Losh Date: Wed, 26 Apr 2023 12:02:19 -0600 Message-ID: Subject: Re: git: ce5a210997da - main - openzfs: arm64: implement kfpu_begin/kfpu_end To: Kyle Evans Cc: src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Content-Type: multipart/alternative; boundary="000000000000082ee305fa410b01" X-Rspamd-Queue-Id: 4Q66CV3KsCz40C9 X-Spamd-Bar: ---- X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-ThisMailContainsUnwantedMimeParts: N --000000000000082ee305fa410b01 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, Apr 26, 2023 at 11:24=E2=80=AFAM Kyle Evans wr= ote: > The branch main has been updated by kevans: > > URL: > https://cgit.FreeBSD.org/src/commit/?id=3Dce5a210997da3c4064cfe162e760379= f1fa8b587 > > commit ce5a210997da3c4064cfe162e760379f1fa8b587 > Author: Kyle Evans > AuthorDate: 2023-04-26 17:23:48 +0000 > Commit: Kyle Evans > CommitDate: 2023-04-26 17:24:00 +0000 > > openzfs: arm64: implement kfpu_begin/kfpu_end > > This is part one of a fix for booting with ZFS on arm64 using > accelerated checksum implementations. Checksum benchmarking will > attempt to use the FPU, so we currently panic quickly on boot. BLAKE= 3 > is still broken, as it clobbers x18 and we promptly discover that fac= t > as soon as we attempt to fetch curthread in kfpu_end(). > > Note that _STANDALONE is special-cased here, but ideally we wouldn't = be > building the code that uses kfpu_begin()/kfpu_end() at all in the > loader > environment. > Yes. As noted elsewhere, the upstream is a mess. There's no way to say "I only want the generic implementation" for these functions. So we went through some crazy hoops to try to disable that... only to run into a buzz-saw of upstream changes that broke the crazy hoops so now we're (bogusly in my opinion) forced to include the acceleration routines... which we effectively disable here... but since the boot loader is space constrained and the link time doesn't know that these routines will never be called, they bloat the boot loader... all because there's no effective way to turn off all the accelerated versions for arm like this happens to be for x86. I've not yet had a chance to implement the obvious solution of having an option just to turn every acceleration off and not even try to compile it in because the currount cross coupling makes that insanely hard (which is how we wind up with needing the _STANDALONE stuff here: if we could just disable all smd stuff, this file would never be included and we'd not need this silly implementation). Warner > Discussed with: imp (a bit) > Differential Revision: https://reviews.freebsd.org/D39448 > --- > .../include/os/freebsd/spl/sys/simd_aarch64.h | 28 > +++++++++++++++++++++- > 1 file changed, 27 insertions(+), 1 deletion(-) > > diff --git a/sys/contrib/openzfs/include/os/freebsd/spl/sys/simd_aarch64.= h > b/sys/contrib/openzfs/include/os/freebsd/spl/sys/simd_aarch64.h > index 7d2e2db28017..9edbc5f40455 100644 > --- a/sys/contrib/openzfs/include/os/freebsd/spl/sys/simd_aarch64.h > +++ b/sys/contrib/openzfs/include/os/freebsd/spl/sys/simd_aarch64.h > @@ -44,13 +44,39 @@ > #define _FREEBSD_SIMD_AARCH64_H > > #include > +#include > #include > +#include > #include > +#include > + > +#ifdef _STANDALONE > > #define kfpu_allowed() 0 > -#define kfpu_initialize(tsk) do {} while (0) > #define kfpu_begin() do {} while (0) > #define kfpu_end() do {} while (0) > + > +#else > + > +/* > + * XXX kfpu_allowed() should be 1, but this is pending a fix to the BLAK= E3 > + * generated assembly to avoid clobbering x18. Turn it back on after th= at > + * lands. > + */ > +#define kfpu_allowed() 0 > +#define kfpu_begin() do { > \ > + if (__predict_false(!is_fpu_kern_thread(0))) \ > + fpu_kern_enter(curthread, NULL, FPU_KERN_NOCTX); \ > +} while(0) > + > +#define kfpu_end() do { > \ > + if (__predict_false(curthread->td_pcb->pcb_fpflags & > PCB_FP_NOSAVE)) \ > + fpu_kern_leave(curthread, NULL); \ > +} while(0) > + > +#endif > + > +#define kfpu_initialize(tsk) do {} while (0) > #define kfpu_init() (0) > #define kfpu_fini() do {} while (0) > > --000000000000082ee305fa410b01 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Wed, Apr 26, 2023 at 11:24=E2=80= =AFAM Kyle Evans <kevans@freebsd.o= rg> wrote:
https://cgit.= FreeBSD.org/src/commit/?id=3Dce5a210997da3c4064cfe162e760379f1fa8b587
commit ce5a210997da3c4064cfe162e760379f1fa8b587
Author:=C2=A0 =C2=A0 =C2=A0Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2023-04-26 17:23:48 +0000
Commit:=C2=A0 =C2=A0 =C2=A0Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2023-04-26 17:24:00 +0000

=C2=A0 =C2=A0 openzfs: arm64: implement kfpu_begin/kfpu_end

=C2=A0 =C2=A0 This is part one of a fix for booting with ZFS on arm64 using=
=C2=A0 =C2=A0 accelerated checksum implementations.=C2=A0 Checksum benchmar= king will
=C2=A0 =C2=A0 attempt to use the FPU, so we currently panic quickly on boot= .=C2=A0 BLAKE3
=C2=A0 =C2=A0 is still broken, as it clobbers x18 and we promptly discover = that fact
=C2=A0 =C2=A0 as soon as we attempt to fetch curthread in kfpu_end().

=C2=A0 =C2=A0 Note that _STANDALONE is special-cased here, but ideally we w= ouldn't be
=C2=A0 =C2=A0 building the code that uses kfpu_begin()/kfpu_end() at all in= the loader
=C2=A0 =C2=A0 environment.

Yes. As note= d elsewhere, the upstream is a mess. There's no way to say "I only= want
the generic implementation" for these functions. So we= went through some crazy hoops
to try to disable that... only to = run into a buzz-saw of upstream changes that broke
the crazy hoop= s so now we're (bogusly in my opinion) forced to include the accelerati= on
routines... which we effectively disable here... but since the= boot loader is space constrained
and the link time doesn't k= now that these routines will never be called, they bloat
the boot= loader... all because there's no effective way to turn off all the acc= elerated
versions for arm like this happens to be for x86.
<= div>
I've not yet had a chance to implement the obvious s= olution of having an option just to
turn every acceleration off a= nd not even try to compile it in because the currount cross
coupl= ing makes that insanely hard (which is how we wind up with needing the _STA= NDALONE
stuff here: if we could just disable all smd stuff, this = file would never be included and we'd
not need this silly imp= lementation).

<rant off>

<= /div>
Warner
=C2=A0
=C2=A0 =C2=A0 Discussed with: imp (a bit)
=C2=A0 =C2=A0 Differential Revision:=C2=A0 https://reviews.freebsd= .org/D39448
---
=C2=A0.../include/os/freebsd/spl/sys/simd_aarch64.h=C2=A0 =C2=A0 =C2=A0 | 2= 8 +++++++++++++++++++++-
=C2=A01 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/sys/contrib/openzfs/include/os/freebsd/spl/sys/simd_aarch64.h = b/sys/contrib/openzfs/include/os/freebsd/spl/sys/simd_aarch64.h
index 7d2e2db28017..9edbc5f40455 100644
--- a/sys/contrib/openzfs/include/os/freebsd/spl/sys/simd_aarch64.h
+++ b/sys/contrib/openzfs/include/os/freebsd/spl/sys/simd_aarch64.h
@@ -44,13 +44,39 @@
=C2=A0#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 _FREEBSD_SIMD_AARCH64_H

=C2=A0#include <sys/types.h>
+#include <sys/ucontext.h>
=C2=A0#include <machine/elf.h>
+#include <machine/fpu.h>
=C2=A0#include <machine/md_var.h>
+#include <machine/pcb.h>
+
+#ifdef _STANDALONE

=C2=A0#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_allowed()=C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 0
-#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_initialize(tsk)=C2=A0 =C2=A0 do {}= while (0)
=C2=A0#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_begin()=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 do {} while (0)
=C2=A0#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_end()=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 do {} while (0)
+
+#else
+
+/*
+ * XXX kfpu_allowed() should be 1, but this is pending a fix to the BLAKE3=
+ * generated assembly to avoid clobbering x18.=C2=A0 Turn it back on after= that
+ * lands.
+ */
+#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_allowed()=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 0
+#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_begin() do {=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0\<= br> +=C2=A0 =C2=A0 =C2=A0 =C2=A0if (__predict_false(!is_fpu_kern_thread(0)))=C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 \
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0fpu_kern_enter(curt= hread, NULL, FPU_KERN_NOCTX);=C2=A0 =C2=A0 =C2=A0 =C2=A0 \
+} while(0)
+
+#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_end() do {=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0\
+=C2=A0 =C2=A0 =C2=A0 =C2=A0if (__predict_false(curthread->td_pcb->pc= b_fpflags & PCB_FP_NOSAVE))=C2=A0 =C2=A0 \
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0fpu_kern_leave(curt= hread, NULL);=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 \
+} while(0)
+
+#endif
+
+#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_initialize(tsk)=C2=A0 =C2=A0 do {}= while (0)
=C2=A0#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_init()=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0(0)
=C2=A0#define=C2=A0 =C2=A0 =C2=A0 =C2=A0 kfpu_fini()=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0do {} while (0)

--000000000000082ee305fa410b01--