From nobody Tue Jul 23 23:03:39 2024 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4WTCPS0sCLz5R6fw; Tue, 23 Jul 2024 23:03:40 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R11" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4WTCPR6zz2z4bh3; Tue, 23 Jul 2024 23:03:39 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1721775820; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=tfVNWpJ2YICm9+xCpIxcdvVwTMHVN54XWd9vwzsvRJE=; b=kxuDL28/wK0DwtSDd+20Bubkqgev0+d46sQNNI1P0osl609tE2+3HizrRDS5S43U2S8r6s Ooo7IKO3lAKC+ObTkUBW0neciB4VYhSyyVl2BRSsTaWqFUnevlTsVPU7MXRVFSMtGmiv3i FafRO9fi8Wcutqi5uys8zaLmHj5zPy6FV1rTHdPl25sWIP+JszmOfsEQUpY71Cp/ygYtFr c+E6YjtMlFXOZIVwBgyxuv8+sYWPfOb6Xmls737wgzNGf3mviBecDHkYRPwa+ZBleJSRGS WNrJfitztDW36UhJ2VdK04C/f8xriHxx9/Cl/k4auCdpMZOC+ZBkwfYPdHOSWA== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1721775820; a=rsa-sha256; cv=none; b=oHyqueTy+fiqm8XHj5HLEGdYjwEuolSs8KG+487/Kq3gDFF/inhegr6Bd6WH+JLKmVUny3 KF27OoEM2bRGgguKK28sSSdSlO66/yXdweGzz5x7H4jdVFMzNGpSV4nteHDB0apBUUOd8n S8RP7K8HxlD9j5DxGuqgTlqOEZLosPgih+LogxBrqxziSe+SLrwh293tPqT7uRy950UfcW hfpyCAXUYRSDdH4B/IDB7DvZb3Pnb4gAfo0/amcImUKbIe1TD+C/uXai5sQBq9FTyZHEPw nyH65fmHjDYLuiI7X8L+Wl8jouLtuTjT1tke+vejqY8RRZEkDkUCnJKa6fsSXg== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1721775820; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=tfVNWpJ2YICm9+xCpIxcdvVwTMHVN54XWd9vwzsvRJE=; b=JVIPv8qRmsBbAq+f2EQHp/r7FteQOXVlm7W2qyxvR2Jte71qQYauHb2ZJHFveNUPJoQH6p W8bcmdz9SWdY+5ms1NA3TZrq5HQFzyXzL2fZ+Yynbzhc7QYC/gv/uFeNdrt+s66qOFBSHg IQTHRV/ZjTELv2+EwHVrDEQrAzIyqjWJ8ft25z8UqyI65oIzCuPZ5bNuSWLPjvJIbYEePE LdwBWghUEJ2bSU6D5dyUtXyFdZ6Te/0Awm/YVBjv4Y/CFZPch7giA1ko1XnGHMe/Piw8j7 VLPG68lYSelH1G0qFIDxnNvvYjQdnIS5cWu40V7gTMcqrJQzdZhH3JbeM6ZTSg== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4WTCPR6cRMzmxm; Tue, 23 Jul 2024 23:03:39 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.18.1/8.18.1) with ESMTP id 46NN3dCQ008272; Tue, 23 Jul 2024 23:03:39 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.18.1/8.18.1/Submit) id 46NN3djK008269; Tue, 23 Jul 2024 23:03:39 GMT (envelope-from git) Date: Tue, 23 Jul 2024 23:03:39 GMT Message-Id: <202407232303.46NN3djK008269@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Warner Losh Subject: git: e6d3ba4be27d - main - nvme: Lock when processing an abort completion command. List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-all@freebsd.org Sender: owner-dev-commits-src-all@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: imp X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: e6d3ba4be27d86c0ade250b52cf9af380f7b4c34 Auto-Submitted: auto-generated The branch main has been updated by imp: URL: https://cgit.FreeBSD.org/src/commit/?id=e6d3ba4be27d86c0ade250b52cf9af380f7b4c34 commit e6d3ba4be27d86c0ade250b52cf9af380f7b4c34 Author: Warner Losh AuthorDate: 2024-07-23 23:01:57 +0000 Commit: Warner Losh CommitDate: 2024-07-23 23:04:02 +0000 nvme: Lock when processing an abort completion command. When processing an abort completion command, we have to lock. But we have to lock the qpair of the original transaction (not the abort we're completing). We do this to avoid races with checking the completion id to tr mapping array, as well as to manually complete it. Note: we don't handle the completion status of 'Asked to abort too many transactions at once.' That will be fixed on subsequent commits. Add a note to that effect for now since it's a harder problem to solve. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D46025 --- sys/dev/nvme/nvme_qpair.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/sys/dev/nvme/nvme_qpair.c b/sys/dev/nvme/nvme_qpair.c index 8d9fb4d647c6..e4286b61a3fc 100644 --- a/sys/dev/nvme/nvme_qpair.c +++ b/sys/dev/nvme/nvme_qpair.c @@ -1007,22 +1007,35 @@ nvme_abort_complete(void *arg, const struct nvme_completion *status) struct nvme_tracker *tr = arg; /* - * If cdw0 == 1, the controller was not able to abort the command - * we requested. We still need to check the active tracker array, - * to cover race where I/O timed out at same time controller was - * completing the I/O. + * If cdw0 bit 0 == 1, the controller was not able to abort the command + * we requested. We still need to check the active tracker array, to + * cover race where I/O timed out at same time controller was completing + * the I/O. An abort command always is on the admin queue, but affects + * either an admin or an I/O queue, so take the appropriate qpair lock + * for the original command's queue, since we'll need it to avoid races + * with the completion code and to complete the command manually. */ - if (status->cdw0 == 1 && tr->qpair->act_tr[tr->cid] != NULL) { + mtx_lock(&tr->qpair->lock); + if ((status->cdw0 & 1) == 1 && tr->qpair->act_tr[tr->cid] != NULL) { /* - * An I/O has timed out, and the controller was unable to - * abort it for some reason. Construct a fake completion - * status, and then complete the I/O's tracker manually. + * An I/O has timed out, and the controller was unable to abort + * it for some reason. And we've not processed a completion for + * it yet. Construct a fake completion status, and then complete + * the I/O's tracker manually. */ nvme_printf(tr->qpair->ctrlr, "abort command failed, aborting command manually\n"); nvme_qpair_manual_complete_tracker(tr, NVME_SCT_GENERIC, NVME_SC_ABORTED_BY_REQUEST, 0, ERROR_PRINT_ALL); } + /* + * XXX We don't check status for the possible 'Could not abort because + * excess aborts were submitted to the controller'. We don't prevent + * that, either. Document for the future here, since the standard is + * squishy and only says 'may generate' but implies anything is possible + * including hangs if you exceed the ACL. + */ + mtx_unlock(&tr->qpair->lock); } static void