git: 86909f7aeb68 - main - nvme: Always lock and only avoid processing for recovery state

From: Warner Losh <imp_at_FreeBSD.org>
Date: Tue, 23 Jul 2024 23:03:38 UTC
The branch main has been updated by imp:

URL: https://cgit.FreeBSD.org/src/commit/?id=86909f7aeb68a5689e84829b0d7488f77b539846

commit 86909f7aeb68a5689e84829b0d7488f77b539846
Author:     Warner Losh <imp@FreeBSD.org>
AuthorDate: 2024-07-23 23:01:46 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-07-23 23:04:02 +0000

    nvme: Always lock and only avoid processing for recovery state
    
    When we lose a race with the timeout code, shift towards waiting for
    that timeout code to complete so we can acquire the lock. This way we
    can make sure we're in 'normal' mode before processing I/O
    completions. If we're not in 'normal' mode, then we're resetting and we
    should avoid completions.
    
    Sponsored by: Netflix
    Reviewed by: gallatin
    Differential Revision:  https://reviews.freebsd.org/D46024
---
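Note for readers: the change described above can be sketched in userland
with POSIX threads. This is only an illustration, not the driver code.
struct qpair_sketch, its fields, and both function names are hypothetical
stand-ins, and treating "done" as "at least one completion was reaped" is
an assumption about what _nvme_qpair_process_completions() returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the driver's real structures. */
enum recovery_state { RECOVERY_NONE, RECOVERY_WAITING };

struct qpair_sketch {
	pthread_mutex_t recovery;	/* stands in for qpair->recovery */
	enum recovery_state state;	/* stands in for qpair->recovery_state */
	unsigned pending;		/* completions waiting to be reaped */
	unsigned skipped;		/* stands in for num_recovery_nolock */
};

void
qpair_sketch_init(struct qpair_sketch *q)
{
	int rc = pthread_mutex_init(&q->recovery, NULL);

	assert(rc == 0);
	(void)rc;	/* keep NDEBUG builds warning-free */
	q->state = RECOVERY_NONE;
	q->pending = 0;
	q->skipped = 0;
}

/* Old shape: bail out at once if the lock is contended. */
bool
process_trylock(struct qpair_sketch *q)
{
	bool done;

	if (pthread_mutex_trylock(&q->recovery) != 0) {
		q->skipped++;
		return (false);
	}
	done = q->pending > 0;	/* assumed meaning of "done" */
	q->pending = 0;
	pthread_mutex_unlock(&q->recovery);
	return (done);
}

/* New shape: always wait for the lock, then decide based on the state. */
bool
process_locked(struct qpair_sketch *q)
{
	bool done = false;

	pthread_mutex_lock(&q->recovery);
	if (q->state == RECOVERY_NONE) {
		done = q->pending > 0;	/* assumed meaning of "done" */
		q->pending = 0;
	} else
		q->skipped++;	/* recovery in progress: leave completions alone */
	pthread_mutex_unlock(&q->recovery);
	return (done);
}
```

In the second shape the caller may block briefly while the lock holder
finishes, but once it holds the lock it knows which state it is in, so
completions are skipped only while recovery is actually in progress rather
than whenever the lock happens to be contended.
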
 sys/dev/nvme/nvme_qpair.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/sys/dev/nvme/nvme_qpair.c b/sys/dev/nvme/nvme_qpair.c
index 755be993cee0..8d9fb4d647c6 100644
--- a/sys/dev/nvme/nvme_qpair.c
+++ b/sys/dev/nvme/nvme_qpair.c
@@ -690,7 +690,7 @@ _nvme_qpair_process_completions(struct nvme_qpair *qpair)
 bool
 nvme_qpair_process_completions(struct nvme_qpair *qpair)
 {
-	bool done;
+	bool done = false;
 
 	/*
 	 * Interlock with reset / recovery code. This is an usually uncontended
@@ -698,12 +698,12 @@ nvme_qpair_process_completions(struct nvme_qpair *qpair)
 	 * and to prevent races with the recovery process called from a timeout
 	 * context.
 	 */
-	if (!mtx_trylock(&qpair->recovery)) {
-		qpair->num_recovery_nolock++;
-		return (false);
-	}
+	mtx_lock(&qpair->recovery);
 
-	done = _nvme_qpair_process_completions(qpair);
+	if (__predict_true(qpair->recovery_state == RECOVERY_NONE))
+		done = _nvme_qpair_process_completions(qpair);
+	else
+		qpair->num_recovery_nolock++;	// XXX likely need to rename
 
 	mtx_unlock(&qpair->recovery);