From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 283189] Sporadic NVMe DMAR faults since updating to 14.2-STABLE
Date: Wed, 02 Apr 2025 03:06:40 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 14.2-STABLE
X-Bugzilla-Keywords: regression
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: jah@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283189

--- Comment #5 from Jason A. Harmening ---
(In reply to Konstantin Belousov from comment #4)

I'm not sure where these small writes are coming from, but they do appear to
be happening in service of some file system function, as these faults are
always followed by a syslog entry like this:

Mar 28 23:14:52 corona ZFS[29596]: vdev I/O failure, zpool=zroot path=/dev/nda0p4 offset=3014377947136 size=4096 error=5

I'm not sure if this is some sort of metadata write, or if the block layer is
somehow splitting up these transfers in a strange way, or if it's something
else.

The GAS address reported by the DMAR fault always matches the PRP1 bus
address, for example:

Mar 28 23:14:52 corona kernel: DMAR4: nvme0: WRITE sqid:7 cid:119 nsid:1 lba:5892185760 len:8
Mar 28 23:14:52 corona kernel: nvme0: pci7:0:0 sid 700 fault acc 1 adt 0x0 reason 0x6 addr fef05000
Mar 28 23:14:52 corona kernel: nvme0: nsid:0x1 rsvd2:0 rsvd3:0 mptr:0 prp1:0xfef05000 prp2:0
Mar 28 23:14:52 corona kernel: nvme0: cdw10: 0x5f339ea0 cdw11:0x1 cdw12:0x7 cdw13:0 cdw14:0 cdw15:0
Mar 28 23:14:52 corona kernel: nvme0: DATA TRANSFER ERROR (00/04) crd:0 m:1 dnr:1 p:1 sqid:7 cid:119 cdw0:0
Mar 28 23:14:52 corona kernel: (nda0:nvme0:0:0:1): WRITE. NCB: opc=1 fuse=0 nsid=1 prp1=0 prp2=0 cdw=5f339ea0 1 7 0 0 0
Mar 28 23:14:52 corona kernel: (nda0:nvme0:0:0:1): CAM status: Unknown (0x420)
Mar 28 23:14:52 corona kernel: (nda0:nvme0:0:0:1): Error 5, Retries exhausted
Mar 28 23:14:52 corona ZFS[29596]: vdev I/O failure, zpool=zroot path=/dev/nda0p4 offset=3014377947136 size=4096 error=5

This doesn't look to be address space wraparound; the last several faulting
GAS addresses from the syslog are (in order):

fef03000
fef03000
fef03000
fef05000
fef05000
a5205000

Since NVMe rapidly sets up and tears down mappings, this just looks more like
the expected behavior of the IOMMU repeatedly handing out the same addresses.

I'll post the (E)CAP reports when I can get in front of the machine to do a
verbose boot.
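
For anyone cross-checking the log excerpt by hand, here is a minimal sketch of how the fields relate. It assumes the standard NVMe read/write command layout (cdw10 and cdw11 hold the low and high halves of the starting LBA, and the low 16 bits of cdw12 hold a zero-based block count) and a 512-byte LBA format on this namespace. The sector size is an assumption, chosen because it agrees with the len:8 and size=4096 values in the log. The helper name decode_rw is hypothetical and not part of any driver.

```python
# Values copied from the kernel log excerpt in this comment.
CDW10, CDW11, CDW12 = 0x5F339EA0, 0x1, 0x7
PRP1 = 0xFEF05000        # prp1 printed by the nvme driver
FAULT_ADDR = 0xFEF05000  # "addr" printed in the DMAR fault line
SECTOR_SIZE = 512        # assumption: 512-byte LBA format


def decode_rw(cdw10, cdw11, cdw12):
    """Return (starting LBA, block count) for an NVMe read/write command."""
    slba = (cdw11 << 32) | cdw10     # 64-bit starting LBA
    nblocks = (cdw12 & 0xFFFF) + 1   # NLB field is zero-based
    return slba, nblocks


slba, nblocks = decode_rw(CDW10, CDW11, CDW12)
print("lba:", slba)
print("len:", nblocks)
print("bytes:", nblocks * SECTOR_SIZE)
print("fault address matches prp1:", FAULT_ADDR == PRP1)
print("prp1 is 4 KiB aligned:", PRP1 % 4096 == 0)
```

With prp2 set to 0 and a transfer that fits in one 4 KiB page, prp1 is the only bus address the device should touch for this command, which is consistent with the faulting address being identical to it.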