From nobody Wed May 25 15:49:56 2022 X-Original-To: freebsd-current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 504761B467E0 for ; Wed, 25 May 2022 15:50:15 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-vs1-xe2f.google.com (mail-vs1-xe2f.google.com [IPv6:2607:f8b0:4864:20::e2f]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4L7b9y3pylz4d7K for ; Wed, 25 May 2022 15:50:14 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: by mail-vs1-xe2f.google.com with SMTP id 68so14225613vse.11 for ; Wed, 25 May 2022 08:50:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ajvmJ7jYrFnBzqZCMqFlvIy00a8dJIZmtHxDTvuqJac=; b=x/6jboQ+K2PwktSZCrBFEUpKLyrtAcWg4tjChO0oG8EbaF92SZpSZ6vcJSRAOWNb0G eL+HJh9OvxewRVz639jQO3fWBVG27ORUnStb99pKECN1NZAw2IatfvslTTn3BINeDlNJ 68d1qMSIQ4WiYfxStyn1EQeX5Oj6iAGSeawD20IbjhlDMWJafEGFdEXCO1aZHqMW7kwn Rup4i8GcJYY+VAgb3su+8Km3HMjuq9z20qlSKt+2tOaETrP2+slQtJmro8JIrfVY825x h5BK1fgoCcsf0bAeLNRrj7/jDOOvh58EgZ3QzApvHiaIQVkK2Daa7HRJrzm218ElG/TX UF6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ajvmJ7jYrFnBzqZCMqFlvIy00a8dJIZmtHxDTvuqJac=; b=503HmWgdZ/vbKmtshXeqGlFbpt6qro2jc4qvSdlHnGgqnZhtgFfwi+rcBAbBS9s1te pn+WHAySaDakY0T7X5dDLU62exwmm403tX/bJrlgYh1HynMs8Rpo2wC5hUw7C4lmY/+O ts/iiaTK2SaMSWRhbe9J1s+JY+uUeEDLqzMX4SZojuGWjgMPS/8vKbSmaEZGZUzJwGSp 
rm50tsDLmccsmS6uXoemfydOFijanv63Vb/L/wHH2NwsBvaReLkW6Ww2lbafOtR+ohf8 Sv9fu1nGzZ1ibhdUTqQf79HokaM3hd1pXICZHSy3lcmnhtHt6+KYgEykygfK+gbfgxTb lg6Q== X-Gm-Message-State: AOAM532B9KRDvXboi0rxnHCwEKECBC8dQ8tDCc0NKu0Zuav3Zcbiu5sn DVyIlxC3hxRChUQqtx0+/jyAQzHOq8ZpbkdEoEV+Ow== X-Google-Smtp-Source: ABdhPJz6XdwxBGECtq644A00m47x467dxN9Sex3a1ciopswABCPPhLquQNNpiA95/XQL4gBXu8KKk+OROyWA2WYpwME= X-Received: by 2002:a67:f8ce:0:b0:335:d520:ab7f with SMTP id c14-20020a67f8ce000000b00335d520ab7fmr13185892vsp.51.1653493808391; Wed, 25 May 2022 08:50:08 -0700 (PDT) List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 References: <20220525122529.t2kwfg2q65dfiyyt@host-ubertino-mac-88e9fe7361f5.eduroam.ssid.10net.amherst.edu> <20220526001715.4ffee96a@ws1.wobblyboot.net> <20220525153920.sxzi7fhsfzv6yidv@ubertino.local> In-Reply-To: <20220525153920.sxzi7fhsfzv6yidv@ubertino.local> From: Warner Losh Date: Wed, 25 May 2022 09:49:56 -0600 Message-ID: Subject: Re: nvme INVALID_FIELD in dmesg.boot To: Matteo Riondato Cc: matti k , Alexander Motin , FreeBSD Current , Jim Harris Content-Type: multipart/alternative; boundary="0000000000001678bb05dfd807e0" X-Rspamd-Queue-Id: 4L7b9y3pylz4d7K X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=bsdimp-com.20210112.gappssmtp.com header.s=20210112 header.b="x/6jboQ+"; dmarc=none; spf=none (mx1.freebsd.org: domain of wlosh@bsdimp.com has no SPF policy when checking 2607:f8b0:4864:20::e2f) smtp.mailfrom=wlosh@bsdimp.com X-Spamd-Result: default: False [-3.00 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[bsdimp-com.20210112.gappssmtp.com:s=20210112]; FROM_HAS_DN(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-current@freebsd.org]; 
DMARC_NA(0.00)[bsdimp.com]; RCPT_COUNT_FIVE(0.00)[5]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[bsdimp-com.20210112.gappssmtp.com:+]; NEURAL_HAM_SHORT(-1.00)[-1.000]; RCVD_IN_DNSWL_NONE(0.00)[2607:f8b0:4864:20::e2f:from]; MLMMJ_DEST(0.00)[freebsd-current]; FORGED_SENDER(0.30)[imp@bsdimp.com,wlosh@bsdimp.com]; R_SPF_NA(0.00)[no SPF record]; MIME_TRACE(0.00)[0:+,1:+,2:~]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; FROM_NEQ_ENVFROM(0.00)[imp@bsdimp.com,wlosh@bsdimp.com]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]
X-ThisMailContainsUnwantedMimeParts: N

--0000000000001678bb05dfd807e0
Content-Type: text/plain; charset="UTF-8"

On Wed, May 25, 2022 at 9:39 AM Matteo Riondato <matteo@freebsd.org> wrote:

> On 2022-05-25 at 11:29 EDT, Warner Losh <imp@bsdimp.com> wrote:
> >
> >SET FEATURES (opcode 9) feature 0xb is indeed async event
> >configuration.
> >0x31f is:
> >SMART WARNING for available spares (0x1)
> >SMART WARNING for temperature (0x2)
> >SMART WARNING for device reliability (0x4)
> >SMART WARNING for being read only (0x8)
> >SMART WARNING for volatile memory backup (0x10)
> >Namespace attribute change events (0x100)
> >Firmware activation events (0x200)
> >
> >I wonder which one of those it doesn't like. My reading of the standard
> >suggests that those should always be supported for a 1.2 and later
> >drive... Though maybe with the possible exception of the volatile
> >memory backup, so let me do some digging here...
> >
> >We can get the last two items from the OAES field of the controller
> >identification data. This is bytes 95:92, which if I'm counting right
> >is the last word on the 040: line in the nvmecontrol identify -x nvmeX
> >command:
> >
> >040: 4e474e4b 30303150 000cca07 00230000 00010200 005b8d80 0030d400 00000100
> >--------------------------------------------------------------------^^^^^^^^
>
> On my system:
>
> 040: 31564456 30373130 5cd2e400 00000500 00010200 001e8480 002dc6c0 00000200
>

Yea, 0x200 and we send 0x300, so maybe that's the cause of the message....

> (same for all nvmeX, as far as I can tell)
>
> >It looks like we don't currently test these bits before we add the last
> >two (we do it unconditionally for >= 1.2, and maybe we should check
> >these bits for >= 1.2).
> >
> >Would you be able to test a fix for this?
>
> Yes, I would be happy to, but I cannot do it for a couple of weeks
> (running simulations for a deadline).
>

There's no real rush... Your system will be fine without these events given what
I think you are doing with it. You might want to check the SMART log page to see
if any of the drives have indicators of trouble... but most trouble you'd care about
would likely torpedo your simulation very very shortly after it happens so even
that likely isn't strictly required.

Warner

--0000000000001678bb05dfd807e0
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


On Wed, May 25, 2022 at 9:39 AM Matteo Riondato <matteo@freebsd.org> wrote:

> On 2022-05-25 at 11:29 EDT, Warner Losh <imp@bsdimp.com> wrote:
> >
> >SET FEATURES (opcode 9) feature 0xb is indeed async event
> >configuration.
> >0x31f is:
> >SMART WARNING for available spares (0x1)
> >SMART WARNING for temperature (0x2)
> >SMART WARNING for device reliability (0x4)
> >SMART WARNING for being read only (0x8)
> >SMART WARNING for volatile memory backup (0x10)
> >Namespace attribute change events (0x100)
> >Firmware activation events (0x200)
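
As a rough illustration of the bit breakdown listed above, the following Python sketch decodes an async event configuration value into the event names from the message. The table name ASYNC_EVENT_BITS and the helper decode_async_event_config are made up for this example; they are not taken from nvmecontrol or from the FreeBSD nvme driver.

```python
# Bit values and descriptions are copied from the list in the message above.
# The names ASYNC_EVENT_BITS and decode_async_event_config are hypothetical.
ASYNC_EVENT_BITS = {
    0x001: "SMART warning: available spares",
    0x002: "SMART warning: temperature",
    0x004: "SMART warning: device reliability",
    0x008: "SMART warning: read only",
    0x010: "SMART warning: volatile memory backup",
    0x100: "Namespace attribute change events",
    0x200: "Firmware activation events",
}


def decode_async_event_config(value):
    """Return (names of the listed bits that are set, bits not in the list)."""
    names = [name for bit, name in ASYNC_EVENT_BITS.items() if value & bit]
    known_mask = 0
    for bit in ASYNC_EVENT_BITS:
        known_mask |= bit
    return names, value & ~known_mask


if __name__ == "__main__":
    names, leftover = decode_async_event_config(0x31F)
    for name in names:
        print(name)
    print("bits outside the list: 0x%x" % leftover)
```

Feeding it 0x31f sets all seven listed events and leaves no bits unaccounted for, which matches the breakdown in the message.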
> >
> >I wonder which one of those it doesn't like. My reading of the standard
> >suggests that those should always be supported for a 1.2 and later
> >drive... Though maybe with the possible exception of the volatile
> >memory backup, so let me do some digging here...
> >
> >We can get the last two items from the OAES field of the controller
> >identification data. This is bytes 95:92, which if I'm counting right
> >is the last word on the 040: line in the nvmecontrol identify -x nvmeX
> >command:
> >
> >040: 4e474e4b 30303150 000cca07 00230000 00010200 005b8d80 0030d400 00000100
> >--------------------------------------------------------------------^^^^^^^^

> On my system:
>
> 040: 31564456 30373130 5cd2e400 00000500 00010200 001e8480 002dc6c0 00000200
>

Yea, 0x200 and we send 0x300, so maybe that's the cause of the message....

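
To make the 0x200 versus 0x300 comparison concrete, here is a small Python sketch that pulls the last word out of a 040: line and reports which of the two OAES-gated events were requested but not advertised. It assumes, as the thread does, that the last word printed on that line can be read directly as the OAES value. The function names and the OAES_GATED constant are invented for this example.

```python
REQUESTED = 0x31F   # value sent with SET FEATURES, per the thread
OAES_GATED = 0x300  # namespace attribute change (0x100) + firmware activation (0x200)


def oaes_from_identify_line(line):
    """Return the last 32-bit word of the '040:' hex dump line as an int."""
    offset, *words = line.split()
    if offset != "040:" or len(words) != 8:
        raise ValueError("expected the 040: line with eight 32-bit words")
    return int(words[-1], 16)


def unsupported_events(requested, oaes):
    """Bits among the OAES-gated events that are requested but not advertised."""
    return requested & OAES_GATED & ~oaes


if __name__ == "__main__":
    line = ("040: 31564456 30373130 5cd2e400 00000500 "
            "00010200 001e8480 002dc6c0 00000200")
    oaes = oaes_from_identify_line(line)
    print("OAES = 0x%x, unsupported = 0x%x"
          % (oaes, unsupported_events(REQUESTED, oaes)))
```

For the dump quoted just above, this reports OAES 0x200 and leaves 0x100 (namespace attribute change) as the requested bit the drive does not advertise.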
> (same for all nvmeX, as far as I can tell)
>
> >It looks like we don't currently test these bits before we add the last
> >two (we do it unconditionally for >= 1.2, and maybe we should check
> >these bits for >= 1.2).
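
One way to picture the check suggested above is the Python sketch below. It is not the FreeBSD driver code and not the eventual fix; it only illustrates the idea of adding the last two events when the controller is 1.2 or later and the corresponding OAES bit is set. The constant and function names are hypothetical, and the possible exception for the volatile memory backup warning mentioned earlier is not handled here.

```python
SMART_WARNINGS = 0x01F  # the five SMART warning bits (0x1 .. 0x10)
NS_ATTR_CHANGE = 0x100  # namespace attribute change events
FW_ACTIVATION = 0x200   # firmware activation events


def async_event_config(version, oaes):
    """Build the async event configuration value.

    version is a (major, minor) tuple for the controller's NVMe version;
    oaes is the OAES dword from the controller identify data.
    """
    config = SMART_WARNINGS
    if version >= (1, 2):
        # Only request the optional events the controller says it supports.
        config |= oaes & (NS_ATTR_CHANGE | FW_ACTIVATION)
    return config


if __name__ == "__main__":
    print(hex(async_event_config((1, 2), 0x200)))
```

With an OAES of 0x200 this yields 0x21f instead of 0x31f, so the namespace attribute change bit would no longer be sent to a drive that does not advertise it.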
> >
> >Would you be able to test a fix for this?
>
> Yes, I would be happy to, but I cannot do it for a couple of weeks
> (running simulations for a deadline).

There's no real rush... Your system will be fine without these events given what
I think you are doing with it. You might want to check the SMART log page to see
if any of the drives have indicators of trouble... but most trouble you'd care about
would likely torpedo your simulation very very shortly after it happens so even
that likely isn't strictly required.

Warner
--0000000000001678bb05dfd807e0--