git: d10ec3ad7739 - main - ena: do not call reset if device is unresponsive

From: Marcin Wojtas <mw_at_FreeBSD.org>
Date: Sun, 23 Jan 2022 19:50:35 UTC

The branch main has been updated by mw:

URL: https://cgit.FreeBSD.org/src/commit/?id=d10ec3ad7739a6f621d398d034632f68f647d72f

commit d10ec3ad7739a6f621d398d034632f68f647d72f
Author:     Dawid Gorecki <dgr@semihalf.com>
AuthorDate: 2022-01-03 13:50:29 +0000
Commit:     Marcin Wojtas <mw@FreeBSD.org>
CommitDate: 2022-01-23 19:48:33 +0000

    ena: do not call reset if device is unresponsive
    
    If the device becomes unresponsive, the driver cannot finish the
    reset process correctly. A timeout during version validation
    indicates that the device is currently not responding. In that case,
    skip the reset and reschedule the timer service instead. As a result,
    the driver keeps retrying the reset until the reset succeeds or the
    driver is detached.
    
    Submitted by: Dawid Gorecki <dgr@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.
---
 sys/dev/ena/ena.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/sys/dev/ena/ena.c b/sys/dev/ena/ena.c
index f4abe61f08ae..1b26a91c5d9e 100644
--- a/sys/dev/ena/ena.c
+++ b/sys/dev/ena/ena.c
@@ -3278,6 +3278,18 @@ ena_timer_service(void *data)
 		ena_update_host_info(host_info, adapter->ifp);
 
 	if (unlikely(ENA_FLAG_ISSET(ENA_FLAG_TRIGGER_RESET, adapter))) {
+		/*
+		 * Timeout when validating version indicates that the device
+		 * became unresponsive. If that happens skip the reset and
+		 * reschedule timer service, so the reset can be retried later.
+		 */
+		if (ena_com_validate_version(adapter->ena_dev) ==
+		    ENA_COM_TIMER_EXPIRED) {
+			ena_log(adapter->pdev, WARN,
+			    "FW unresponsive, skipping reset\n");
+			ENA_TIMER_RESET(adapter);
+			return;
+		}
 		ena_log(adapter->pdev, WARN, "Trigger reset is on\n");
 		taskqueue_enqueue(adapter->reset_tq, &adapter->reset_task);
 		return;
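
The control flow added by this hunk can be sketched outside the kernel as
a small, self-contained simulation. This is an illustration only, not the
driver code: every fake_* name, the adapter struct, and the
"unresponsive_ticks" counter are hypothetical stand-ins for
ena_com_validate_version(), ENA_TIMER_RESET() and taskqueue_enqueue(),
and the device model (unresponsive for a fixed number of probes) is an
assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes, standing in for ENA_COM_* values. */
enum fake_rc { FAKE_OK = 0, FAKE_TIMER_EXPIRED = -1 };

/* Hypothetical adapter state; not the real struct ena_adapter. */
struct fake_adapter {
	bool trigger_reset;      /* models ENA_FLAG_TRIGGER_RESET */
	int unresponsive_ticks;  /* assumed: probes that will still time out */
	int timer_rearms;        /* times the timer was rescheduled */
	int resets_enqueued;     /* times the reset task was enqueued */
};

/* Models the version probe: times out while the device is unresponsive. */
static enum fake_rc
fake_validate_version(struct fake_adapter *a)
{
	assert(a != NULL);
	if (a->unresponsive_ticks > 0) {
		a->unresponsive_ticks--;
		return (FAKE_TIMER_EXPIRED);
	}
	return (FAKE_OK);
}

/*
 * One timer tick. Returns true if the timer was rearmed, false if the
 * reset task was enqueued instead.
 */
static bool
fake_timer_service(struct fake_adapter *a)
{
	assert(a != NULL);
	if (a->trigger_reset) {
		if (fake_validate_version(a) == FAKE_TIMER_EXPIRED) {
			/* Device not responding: skip reset, retry later. */
			a->timer_rearms++;
			return (true);
		}
		/* Device answered: hand off to the reset task. */
		a->resets_enqueued++;
		return (false);
	}
	/* No reset pending: normal periodic rearm. */
	a->timer_rearms++;
	return (true);
}

/* Runs ticks until a reset is enqueued or max_ticks is reached. */
static int
fake_run_until_reset(struct fake_adapter *a, int max_ticks)
{
	int ticks = 0;

	while (ticks < max_ticks) {
		ticks++;
		if (!fake_timer_service(a))
			break;
	}
	return (ticks);
}
```

With trigger_reset set and unresponsive_ticks = 3, the first three ticks
only rearm the timer and the fourth tick enqueues the reset. If the
device never answers, no reset is enqueued and the loop stops at
max_ticks, which stands in here for the driver being detached.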