From nobody Wed Dec 27 12:32:44 2023
Date: Wed, 27 Dec 2023 12:32:44 GMT
Message-Id: <202312271232.3BRCWiW3077968@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Ram Kishore Vegesna
Subject: git: 104eae582c91 - stable/13 - ocs_fc: IO timeout handling and error reporting fix.
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
List-Help:
List-Post:
List-Subscribe:
List-Unsubscribe:
Sender: owner-dev-commits-src-all@freebsd.org
X-BeenThere: dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: ram
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/13
X-Git-Reftype: branch
X-Git-Commit: 104eae582c91f23d06b8de4b7872b8bc7fa13be7
Auto-Submitted: auto-generated

The branch stable/13 has been updated by ram:

URL: https://cgit.FreeBSD.org/src/commit/?id=104eae582c91f23d06b8de4b7872b8bc7fa13be7

commit 104eae582c91f23d06b8de4b7872b8bc7fa13be7
Author:     Ram Kishore Vegesna
AuthorDate: 2023-12-12 15:22:58 +0000
Commit:     Ram Kishore Vegesna
CommitDate: 2023-12-27 12:27:46 +0000

    ocs_fc: IO timeout handling and error reporting fix.

    The hardware timeout is an 8-bit value, so it can only express
    timeouts of up to 255 seconds. Add a software timer implementation
    to time out and abort IOs whose timeout exceeds 255 seconds.

    Fix the timeout conversion by dividing CAM timeouts, which are in
    milliseconds, by 1000, because the hardware expects the timeout in
    seconds. Requests shorter than one second are rounded up to 1 second.
    Before this change, the millisecond value was truncated to 8 bits and
    then interpreted as seconds, so the timeout actually sent to the card
    depended only on the bottom 8 bits of the requested timeout.

    Add a mapping from ocs_fc error status to CAM status.

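To make the truncation described in the commit message concrete, here is a minimal userland sketch of the arithmetic. It is not driver code: the function names old_hw_timeout, new_io_timeout and check_timeout_examples are hypothetical, and the CAM_TIME_DEFAULT and CAM_TIME_INFINITY special cases that the driver handles separately are left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pre-fix behaviour as described in the commit message: the CAM timeout
 * (milliseconds) lands in an 8-bit field, so only its low 8 bits survive
 * and are then treated as seconds. Hypothetical helper, not driver code.
 */
uint8_t
old_hw_timeout(uint32_t cam_timeout_ms)
{
	return (uint8_t)cam_timeout_ms;
}

/*
 * Post-fix conversion: milliseconds to whole seconds, with a minimum of
 * one second. Hypothetical helper mirroring the ocs_cam.c change.
 */
uint32_t
new_io_timeout(uint32_t cam_timeout_ms)
{
	if (cam_timeout_ms < 1000)
		return 1;
	return cam_timeout_ms / 1000;
}

/* Worked values; assert() aborts if any of them does not hold. */
void
check_timeout_examples(void)
{
	assert(old_hw_timeout(30000) == 48);	/* 30 s request became 48 s */
	assert(old_hw_timeout(60000) == 96);	/* 60 s request became 96 s */
	assert(old_hw_timeout(256000) == 0);	/* 256 s request became 0 */
	assert(new_io_timeout(30000) == 30);
	assert(new_io_timeout(500) == 1);	/* sub-second rounds up */
	assert(new_io_timeout(300000) == 300);	/* does not fit in 8 bits */
}
```

The last case is the one that motivates the software timer: 300 seconds is a valid request after the conversion, but it cannot be placed in the 8-bit hardware field.
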
    Reported by: ken
    Reviewed by: ken
    Tested by: ken, ram
    Approved by: ken
    MFC after: 1 week

    (cherry picked from commit 70547544ce931357c980be47d937e5b57a2d7f49)
---
 sys/dev/ocs_fc/ocs_cam.c   | 87 ++++++++++++++++++++++++++++++++++++++++++++--
 sys/dev/ocs_fc/ocs_hw.c    | 70 +++++++++++++++++++++++--------------
 sys/dev/ocs_fc/ocs_hw.h    | 10 +++---
 sys/dev/ocs_fc/ocs_scsi.c  |  6 +++-
 sys/dev/ocs_fc/ocs_xport.c |  8 +++--
 sys/dev/ocs_fc/sli4.h      |  2 +-
 6 files changed, 144 insertions(+), 39 deletions(-)

diff --git a/sys/dev/ocs_fc/ocs_cam.c b/sys/dev/ocs_fc/ocs_cam.c
index 4da1b6669047..0fa94083e898 100644
--- a/sys/dev/ocs_fc/ocs_cam.c
+++ b/sys/dev/ocs_fc/ocs_cam.c
@@ -44,6 +44,7 @@
 #include "ocs.h"
 #include "ocs_scsi.h"
 #include "ocs_device.h"
+#include
 
 /* Default IO timeout value for initiators is 30 seconds */
 #define OCS_CAM_IO_TIMEOUT 30
@@ -55,6 +56,27 @@ typedef struct {
 	int32_t rc;
 } ocs_dmamap_load_arg_t;
 
+struct ocs_scsi_status_desc {
+	ocs_scsi_io_status_e status;
+	const char *desc;
+} ocs_status_desc[] = {
+	{ OCS_SCSI_STATUS_GOOD, "Good" },
+	{ OCS_SCSI_STATUS_ABORTED, "Aborted" },
+	{ OCS_SCSI_STATUS_ERROR, "Error" },
+	{ OCS_SCSI_STATUS_DIF_GUARD_ERROR, "DIF Guard Error" },
+	{ OCS_SCSI_STATUS_DIF_REF_TAG_ERROR, "DIF REF Tag Error" },
+	{ OCS_SCSI_STATUS_DIF_APP_TAG_ERROR, "DIF App Tag Error" },
+	{ OCS_SCSI_STATUS_DIF_UNKNOWN_ERROR, "DIF Unknown Error" },
+	{ OCS_SCSI_STATUS_PROTOCOL_CRC_ERROR, "Proto CRC Error" },
+	{ OCS_SCSI_STATUS_NO_IO, "No IO" },
+	{ OCS_SCSI_STATUS_ABORT_IN_PROGRESS, "Abort in Progress" },
+	{ OCS_SCSI_STATUS_CHECK_RESPONSE, "Check Response" },
+	{ OCS_SCSI_STATUS_COMMAND_TIMEOUT, "Command Timeout" },
+	{ OCS_SCSI_STATUS_TIMEDOUT_AND_ABORTED, "Timed out and Aborted" },
+	{ OCS_SCSI_STATUS_SHUTDOWN, "Shutdown" },
+	{ OCS_SCSI_STATUS_NEXUS_LOST, "Nexus Lost" }
+};
+
 static void ocs_action(struct cam_sim *, union ccb *);
 static void ocs_poll(struct cam_sim *);
 
@@ -1497,7 +1519,7 @@ static int32_t ocs_scsi_initiator_io_cb(ocs_io_t *io,
 		 * If we've already got a SCSI error, prefer that because it
 		 * will have more detail.
 		 */
-		if ((rsp->residual < 0) && (ccb_status == CAM_REQ_CMP)) {
+		if ((rsp->residual < 0) && (ccb_status == CAM_REQ_CMP)) {
 			ccb_status = CAM_DATA_RUN_ERR;
 		}
 
@@ -1517,7 +1539,62 @@ static int32_t ocs_scsi_initiator_io_cb(ocs_io_t *io,
 			ocs_memcpy(&csio->sense_data, rsp->sense_data, sense_len);
 		}
 	} else if (scsi_status != OCS_SCSI_STATUS_GOOD) {
-		ccb_status = CAM_REQ_CMP_ERR;
+		const char *err_desc = NULL;
+		char path_str[64];
+		char err_str[224];
+		struct sbuf sb;
+		size_t i;
+
+		sbuf_new(&sb, err_str, sizeof(err_str), 0);
+
+		xpt_path_string(ccb->ccb_h.path, path_str, sizeof(path_str));
+		sbuf_cat(&sb, path_str);
+
+		for (i = 0; i < (sizeof(ocs_status_desc) /
+		    sizeof(ocs_status_desc[0])); i++) {
+			if (scsi_status == ocs_status_desc[i].status) {
+				err_desc = ocs_status_desc[i].desc;
+				break;
+			}
+		}
+		if (ccb->ccb_h.func_code == XPT_SCSI_IO) {
+			scsi_command_string(&ccb->csio, &sb);
+			sbuf_printf(&sb, "length %d ", ccb->csio.dxfer_len);
+		}
+		sbuf_printf(&sb, "error status %d (%s)\n", scsi_status,
+		    (err_desc != NULL) ? err_desc : "Unknown");
+		sbuf_finish(&sb);
+		printf("%s", sbuf_data(&sb));
+
+		switch (scsi_status) {
+		case OCS_SCSI_STATUS_ABORTED:
+		case OCS_SCSI_STATUS_ABORT_IN_PROGRESS:
+			ccb_status = CAM_REQ_ABORTED;
+			break;
+		case OCS_SCSI_STATUS_DIF_GUARD_ERROR:
+		case OCS_SCSI_STATUS_DIF_REF_TAG_ERROR:
+		case OCS_SCSI_STATUS_DIF_APP_TAG_ERROR:
+		case OCS_SCSI_STATUS_DIF_UNKNOWN_ERROR:
+		case OCS_SCSI_STATUS_PROTOCOL_CRC_ERROR:
+			ccb_status = CAM_IDE;
+			break;
+		case OCS_SCSI_STATUS_ERROR:
+		case OCS_SCSI_STATUS_NO_IO:
+			ccb_status = CAM_REQ_CMP_ERR;
+			break;
+		case OCS_SCSI_STATUS_COMMAND_TIMEOUT:
+		case OCS_SCSI_STATUS_TIMEDOUT_AND_ABORTED:
+			ccb_status = CAM_CMD_TIMEOUT;
+			break;
+		case OCS_SCSI_STATUS_SHUTDOWN:
+		case OCS_SCSI_STATUS_NEXUS_LOST:
+			ccb_status = CAM_SCSI_IT_NEXUS_LOST;
+			break;
+		default:
+			ccb_status = CAM_REQ_CMP_ERR;
+			break;
+		}
+
 	} else {
 		ccb_status = CAM_REQ_CMP;
 	}
@@ -1842,7 +1919,11 @@ ocs_initiator_io(struct ocs_softc *ocs, union ccb *ccb)
 	} else if (ccb->ccb_h.timeout == CAM_TIME_DEFAULT) {
 		io->timeout = OCS_CAM_IO_TIMEOUT;
 	} else {
-		io->timeout = ccb->ccb_h.timeout;
+		if (ccb->ccb_h.timeout < 1000)
+			io->timeout = 1;
+		else {
+			io->timeout = ccb->ccb_h.timeout / 1000;
+		}
 	}
 
 	switch (csio->tag_action) {
diff --git a/sys/dev/ocs_fc/ocs_hw.c b/sys/dev/ocs_fc/ocs_hw.c
index 097228d12bfb..186b04b8b129 100644
--- a/sys/dev/ocs_fc/ocs_hw.c
+++ b/sys/dev/ocs_fc/ocs_hw.c
@@ -148,17 +148,29 @@ static void ocs_hw_check_sec_hio_list(ocs_hw_t *hw);
 static void ocs_hw_check_sec_hio_list(ocs_hw_t *hw);
 static void target_wqe_timer_cb(void *arg);
 static void shutdown_target_wqe_timer(ocs_hw_t *hw);
+/* WQE timeout for initiator IOs */
+static inline uint8_t
+ocs_hw_set_io_wqe_timeout(ocs_hw_io_t *io, uint32_t timeout)
+{
+	if (timeout > 255) {
+		io->wqe_timeout = timeout;
+		return 0;
+	} else {
+		return timeout;
+	}
+}
+
 static inline void
 ocs_hw_add_io_timed_wqe(ocs_hw_t *hw, ocs_hw_io_t *io)
 {
-	if (hw->config.emulate_tgt_wqe_timeout && io->tgt_wqe_timeout) {
+	if (hw->config.emulate_wqe_timeout && io->wqe_timeout) {
 		/*
 		 * Active WQE list currently only used for
 		 * target WQE timeouts.
 		 */
 		ocs_lock(&hw->io_lock);
 		ocs_list_add_tail(&hw->io_timed_wqe, io);
-		io->submit_ticks = ocs_get_os_ticks();
+		getmicrouptime(&io->submit_time);
 		ocs_unlock(&hw->io_lock);
 	}
 }
@@ -166,7 +178,7 @@ ocs_hw_add_io_timed_wqe(ocs_hw_t *hw, ocs_hw_io_t *io)
 static inline void
 ocs_hw_remove_io_timed_wqe(ocs_hw_t *hw, ocs_hw_io_t *io)
 {
-	if (hw->config.emulate_tgt_wqe_timeout) {
+	if (hw->config.emulate_wqe_timeout) {
 		/*
 		 * If target wqe timeouts are enabled,
 		 * remove from active wqe list.
@@ -965,7 +977,7 @@ ocs_hw_init(ocs_hw_t *hw)
 	}
 
 	/* finally kick off periodic timer to check for timed out target WQEs */
-	if (hw->config.emulate_tgt_wqe_timeout) {
+	if (hw->config.emulate_wqe_timeout) {
 		ocs_setup_timer(hw->os, &hw->wqe_timer, target_wqe_timer_cb, hw,
 				OCS_HW_WQ_TIMER_PERIOD_MS);
 	}
@@ -1695,8 +1707,8 @@ ocs_hw_get(ocs_hw_t *hw, ocs_hw_property_e prop, uint32_t *value)
 	case OCS_HW_EMULATE_I_ONLY_AAB:
 		*value = hw->config.i_only_aab;
 		break;
-	case OCS_HW_EMULATE_TARGET_WQE_TIMEOUT:
-		*value = hw->config.emulate_tgt_wqe_timeout;
+	case OCS_HW_EMULATE_WQE_TIMEOUT:
+		*value = hw->config.emulate_wqe_timeout;
 		break;
 	case OCS_HW_VPD_LEN:
 		*value = sli_get_vpd_len(&hw->sli);
@@ -1996,8 +2008,8 @@ ocs_hw_set(ocs_hw_t *hw, ocs_hw_property_e prop, uint32_t value)
 	case OCS_HW_EMULATE_I_ONLY_AAB:
 		hw->config.i_only_aab = value;
 		break;
-	case OCS_HW_EMULATE_TARGET_WQE_TIMEOUT:
-		hw->config.emulate_tgt_wqe_timeout = value;
+	case OCS_HW_EMULATE_WQE_TIMEOUT:
+		hw->config.emulate_wqe_timeout = value;
 		break;
 	case OCS_HW_BOUNCE:
 		hw->config.bounce = value;
@@ -3324,7 +3336,7 @@ ocs_hw_init_free_io(ocs_hw_io_t *io)
 	io->type = 0xFFFF;
 	io->wq = NULL;
 	io->ul_io = NULL;
-	io->tgt_wqe_timeout = 0;
+	io->wqe_timeout = 0;
 }
 
 /**
@@ -3738,7 +3750,7 @@ ocs_hw_check_sec_hio_list(ocs_hw_t *hw)
 			flags &= ~SLI4_IO_CONTINUATION;
 		}
 
-		io->tgt_wqe_timeout = io->sec_iparam.fcp_tgt.timeout;
+		io->wqe_timeout = io->sec_iparam.fcp_tgt.timeout;
 
 		/* Complete (continue) TRECV IO */
 		if (io->xbusy) {
@@ -4041,6 +4053,7 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io,
 	ocs_hw_rtn_e rc = OCS_HW_RTN_SUCCESS;
 	uint32_t rpi;
 	uint8_t send_wqe = TRUE;
+	uint8_t timeout = 0;
 
 	CPUTRACE("");
 
@@ -4075,6 +4088,8 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io,
 	 */
 	switch (type) {
 	case OCS_HW_IO_INITIATOR_READ:
+		timeout = ocs_hw_set_io_wqe_timeout(io, iparam->fcp_ini.timeout);
+
 		/*
 		 * If use_dif_quarantine workaround is in effect, and dif_separates then mark the
 		 * initiator read IO for quarantine
@@ -4090,12 +4105,14 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io,
 		if (sli_fcp_iread64_wqe(&hw->sli, io->wqe.wqebuf, hw->sli.config.wqe_size, &io->def_sgl, io->first_data_sge, len,
 					io->indicator, io->reqtag, SLI4_CQ_DEFAULT, rpi, rnode,
 					iparam->fcp_ini.dif_oper, iparam->fcp_ini.blk_size,
-					iparam->fcp_ini.timeout)) {
+					timeout)) {
 			ocs_log_err(hw->os, "IREAD WQE error\n");
 			rc = OCS_HW_RTN_ERROR;
 		}
 		break;
 	case OCS_HW_IO_INITIATOR_WRITE:
+		timeout = ocs_hw_set_io_wqe_timeout(io, iparam->fcp_ini.timeout);
+
 		ocs_hw_io_ini_sge(hw, io, iparam->fcp_ini.cmnd, iparam->fcp_ini.cmnd_size,
 				iparam->fcp_ini.rsp);
 
@@ -4104,18 +4121,20 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io,
 					 io->indicator, io->reqtag, SLI4_CQ_DEFAULT,
 					 rpi, rnode,
 					 iparam->fcp_ini.dif_oper, iparam->fcp_ini.blk_size,
-					 iparam->fcp_ini.timeout)) {
+					 timeout)) {
 			ocs_log_err(hw->os, "IWRITE WQE error\n");
 			rc = OCS_HW_RTN_ERROR;
 		}
 		break;
 	case OCS_HW_IO_INITIATOR_NODATA:
+		timeout = ocs_hw_set_io_wqe_timeout(io, iparam->fcp_ini.timeout);
+
 		ocs_hw_io_ini_sge(hw, io, iparam->fcp_ini.cmnd, iparam->fcp_ini.cmnd_size,
 				iparam->fcp_ini.rsp);
 
 		if (sli_fcp_icmnd64_wqe(&hw->sli, io->wqe.wqebuf, hw->sli.config.wqe_size, &io->def_sgl,
 					io->indicator, io->reqtag, SLI4_CQ_DEFAULT,
-					rpi, rnode, iparam->fcp_ini.timeout)) {
+					rpi, rnode, timeout)) {
 			ocs_log_err(hw->os, "ICMND WQE error\n");
 			rc = OCS_HW_RTN_ERROR;
 		}
@@ -4137,7 +4156,7 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io,
 			flags &= ~SLI4_IO_CONTINUATION;
 		}
 
-		io->tgt_wqe_timeout = iparam->fcp_tgt.timeout;
+		io->wqe_timeout = iparam->fcp_tgt.timeout;
 
 		/*
 		 * If use_dif_quarantine workaround is in effect, and this is a DIF enabled IO
@@ -4227,7 +4246,7 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io,
 			flags &= ~SLI4_IO_CONTINUATION;
 		}
 
-		io->tgt_wqe_timeout = iparam->fcp_tgt.timeout;
+		io->wqe_timeout = iparam->fcp_tgt.timeout;
 		if (sli_fcp_tsend64_wqe(&hw->sli, io->wqe.wqebuf, hw->sli.config.wqe_size,
 					&io->def_sgl, io->first_data_sge, iparam->fcp_tgt.offset,
 					len, io->indicator, io->reqtag, SLI4_CQ_DEFAULT,
@@ -4260,7 +4279,7 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io,
 			}
 		}
 
-		io->tgt_wqe_timeout = iparam->fcp_tgt.timeout;
+		io->wqe_timeout = iparam->fcp_tgt.timeout;
 
 		if (sli_fcp_trsp64_wqe(&hw->sli, io->wqe.wqebuf, hw->sli.config.wqe_size,
 					&io->def_sgl, len,
@@ -11173,9 +11192,8 @@ target_wqe_timer_nop_cb(ocs_hw_t *hw, int32_t status, uint8_t *mqe, void *arg)
 {
 	ocs_hw_io_t *io = NULL;
 	ocs_hw_io_t *io_next = NULL;
-	uint64_t ticks_current = ocs_get_os_ticks();
-	uint32_t sec_elapsed;
 	ocs_hw_rtn_e rc;
+	struct timeval cur_time;
 
 	sli4_mbox_command_header_t *hdr = (sli4_mbox_command_header_t *)mqe;
 
@@ -11188,27 +11206,28 @@ target_wqe_timer_nop_cb(ocs_hw_t *hw, int32_t status, uint8_t *mqe, void *arg)
 	/* loop through active WQE list and check for timeouts */
 	ocs_lock(&hw->io_lock);
 	ocs_list_foreach_safe(&hw->io_timed_wqe, io, io_next) {
-		sec_elapsed = ((ticks_current - io->submit_ticks) / ocs_get_os_tick_freq());
 
 		/*
 		 * If elapsed time > timeout, abort it. No need to check type since
 		 * it wouldn't be on this list unless it was a target WQE
 		 */
-		if (sec_elapsed > io->tgt_wqe_timeout) {
-			ocs_log_test(hw->os, "IO timeout xri=0x%x tag=0x%x type=%d\n",
-				     io->indicator, io->reqtag, io->type);
+		getmicrouptime(&cur_time);
+		timevalsub(&cur_time, &io->submit_time);
+		if (cur_time.tv_sec > io->wqe_timeout) {
+			ocs_log_info(hw->os, "IO timeout xri=0x%x tag=0x%x type=%d elasped time:%u\n",
+				     io->indicator, io->reqtag, io->type, cur_time.tv_sec);
 
 			/* remove from active_wqe list so won't try to abort again */
 			ocs_list_remove(&hw->io_timed_wqe, io);
 
 			/* save status of "timed out" for when abort completes */
 			io->status_saved = 1;
-			io->saved_status = SLI4_FC_WCQE_STATUS_TARGET_WQE_TIMEOUT;
+			io->saved_status = SLI4_FC_WCQE_STATUS_WQE_TIMEOUT;
 			io->saved_ext = 0;
 			io->saved_len = 0;
 
 			/* now abort outstanding IO */
-			rc = ocs_hw_io_abort(hw, io, FALSE, NULL, NULL);
+			rc = ocs_hw_io_abort(hw, io, TRUE, NULL, NULL);
 			if (rc) {
 				ocs_log_test(hw->os,
 					"abort failed xri=%#x tag=%#x rc=%d\n",
@@ -11237,7 +11256,6 @@ target_wqe_timer_cb(void *arg)
 
 	/* delete existing timer; will kick off new timer after checking wqe timeouts */
 	hw->in_active_wqe_timer = TRUE;
-	ocs_del_timer(&hw->wqe_timer);
 
 	/* Forward timer callback to execute in the mailbox completion processing context */
 	if (ocs_hw_async_call(hw, target_wqe_timer_nop_cb, hw)) {
@@ -11250,7 +11268,7 @@ shutdown_target_wqe_timer(ocs_hw_t *hw)
 {
 	uint32_t iters = 100;
 
-	if (hw->config.emulate_tgt_wqe_timeout) {
+	if (hw->config.emulate_wqe_timeout) {
 		/* request active wqe timer shutdown, then wait for it to complete */
 		hw->active_wqe_timer_shutdown = TRUE;
 
diff --git a/sys/dev/ocs_fc/ocs_hw.h b/sys/dev/ocs_fc/ocs_hw.h
index d4ee85c3f52a..671aa40871f2 100644
--- a/sys/dev/ocs_fc/ocs_hw.h
+++ b/sys/dev/ocs_fc/ocs_hw.h
@@ -211,7 +211,7 @@ typedef enum {
 	OCS_HW_WAR_VERSION,
 	OCS_HW_DISABLE_AR_TGT_DIF,
 	OCS_HW_EMULATE_I_ONLY_AAB, /**< emulate IAAB=0 for initiator-commands only */
-	OCS_HW_EMULATE_TARGET_WQE_TIMEOUT, /**< enable driver timeouts for target WQEs */
+	OCS_HW_EMULATE_WQE_TIMEOUT, /**< enable driver timeouts for WQEs */
 	OCS_HW_LINK_CONFIG_SPEED,
 	OCS_HW_CONFIG_TOPOLOGY,
 	OCS_HW_BOUNCE,
@@ -520,7 +520,7 @@ typedef union ocs_hw_io_param_u {
 		ocs_hw_dif_blk_size_e blk_size;
 		uint32_t cmnd_size;
 		uint16_t flags;
-		uint8_t timeout;
+		uint32_t timeout;
 		uint32_t first_burst;
 	} fcp_ini;
 } ocs_hw_io_param_t;
@@ -576,8 +576,8 @@ struct ocs_hw_io_s {
 	void *abort_arg; /**< argument passed to "abort done" callback */
 	ocs_ref_t ref; /**< refcount object */
 	size_t length; /**< needed for bug O127585: length of IO */
-	uint8_t tgt_wqe_timeout; /**< timeout value for target WQEs */
-	uint64_t submit_ticks; /**< timestamp when current WQE was submitted */
+	uint32_t wqe_timeout; /**< timeout value for WQEs */
+	struct timeval submit_time; /**< timestamp when current WQE was submitted */
 
 	uint32_t status_saved:1, /**< if TRUE, latched status should be returned */
 		abort_in_progress:1, /**< if TRUE, abort is in progress */
@@ -915,7 +915,7 @@ struct ocs_hw_s {
 		uint16_t auto_xfer_rdy_app_tag_value;
 		uint8_t dif_mode; /**< DIF mode to use */
 		uint8_t i_only_aab; /** Enable initiator-only auto-abort */
-		uint8_t emulate_tgt_wqe_timeout; /** Enable driver target wqe timeouts */
+		uint8_t emulate_wqe_timeout; /** Enable driver wqe timeouts */
 		uint32_t bounce:1;
 		const char *queue_topology; /**< Queue topology string */
 		uint8_t auto_xfer_rdy_t10_enable; /** Enable t10 PI for auto xfer ready */
diff --git a/sys/dev/ocs_fc/ocs_scsi.c b/sys/dev/ocs_fc/ocs_scsi.c
index 0e87cc0bed4b..af9fc798b01c 100644
--- a/sys/dev/ocs_fc/ocs_scsi.c
+++ b/sys/dev/ocs_fc/ocs_scsi.c
@@ -413,7 +413,7 @@ ocs_target_io_cb(ocs_hw_io_t *hio, ocs_remote_node_t *rnode, uint32_t length,
 		}
 		break;
 
-	case SLI4_FC_WCQE_STATUS_TARGET_WQE_TIMEOUT:
+	case SLI4_FC_WCQE_STATUS_WQE_TIMEOUT:
 		/* target IO timed out */
 		scsi_status = OCS_SCSI_STATUS_TIMEDOUT_AND_ABORTED;
 		break;
@@ -2209,6 +2209,10 @@ ocs_initiator_io_cb(ocs_hw_io_t *hio, ocs_remote_node_t *rnode, uint32_t length,
 			scsi_status = OCS_SCSI_STATUS_ERROR;
 		}
 		break;
+	case SLI4_FC_WCQE_STATUS_WQE_TIMEOUT:
+		/* IO timed out */
+		scsi_status = OCS_SCSI_STATUS_TIMEDOUT_AND_ABORTED;
+		break;
 	case SLI4_FC_WCQE_STATUS_DI_ERROR:
 		if (ext_status & 0x01) {
 			scsi_status = OCS_SCSI_STATUS_DIF_GUARD_ERROR;
diff --git a/sys/dev/ocs_fc/ocs_xport.c b/sys/dev/ocs_fc/ocs_xport.c
index d6b0d8740906..d997ea245132 100644
--- a/sys/dev/ocs_fc/ocs_xport.c
+++ b/sys/dev/ocs_fc/ocs_xport.c
@@ -524,9 +524,11 @@ ocs_xport_initialize(ocs_xport_t *xport)
 		}
 	}
 
-	if (ocs->target_io_timer_sec) {
-		ocs_log_debug(ocs, "setting target io timer=%d\n", ocs->target_io_timer_sec);
-		ocs_hw_set(&ocs->hw, OCS_HW_EMULATE_TARGET_WQE_TIMEOUT, TRUE);
+	if (ocs->target_io_timer_sec || ocs->enable_ini) {
+		if (ocs->target_io_timer_sec)
+			ocs_log_debug(ocs, "setting target io timer=%d\n", ocs->target_io_timer_sec);
+
+		ocs_hw_set(&ocs->hw, OCS_HW_EMULATE_WQE_TIMEOUT, TRUE);
 	}
 
 	ocs_hw_callback(&ocs->hw, OCS_HW_CB_DOMAIN, ocs_domain_cb, ocs);
diff --git a/sys/dev/ocs_fc/sli4.h b/sys/dev/ocs_fc/sli4.h
index 4d8686ce8841..e9271df1530f 100644
--- a/sys/dev/ocs_fc/sli4.h
+++ b/sys/dev/ocs_fc/sli4.h
@@ -5345,7 +5345,7 @@ typedef struct sli4_fc_wqec_s {
 #define SLI4_FC_WCQE_STATUS_RX_ABORT_REQUEST 0x1b
 
 /* driver generated status codes; better not overlap with chip's status codes! */
-#define SLI4_FC_WCQE_STATUS_TARGET_WQE_TIMEOUT 0xff
+#define SLI4_FC_WCQE_STATUS_WQE_TIMEOUT 0xff
 #define SLI4_FC_WCQE_STATUS_SHUTDOWN 0xfe
 #define SLI4_FC_WCQE_STATUS_DISPATCH_ERROR 0xfd
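
The split between the 8-bit hardware timeout and the software timer in ocs_hw.c can be illustrated with a small userland sketch. It is only an approximation under stated assumptions: struct sketch_io, sketch_set_wqe_timeout and sketch_io_expired are hypothetical names, the timestamp is passed in by the caller instead of being read with getmicrouptime(), and the subtraction is written out by hand rather than using the kernel's timevalsub().

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical stand-in for the two ocs_hw_io_t fields the diff touches. */
struct sketch_io {
	uint32_t wqe_timeout;		/* seconds; 0 means no software timer */
	struct timeval submit_time;	/* when the WQE was submitted */
};

/*
 * Mirrors the logic of ocs_hw_set_io_wqe_timeout() in the diff: a timeout
 * that fits in 8 bits is returned for the hardware field; a larger one is
 * recorded for the software timer and 0 is returned for the hardware.
 */
uint8_t
sketch_set_wqe_timeout(struct sketch_io *io, uint32_t timeout)
{
	if (timeout > 255) {
		io->wqe_timeout = timeout;
		return 0;
	}
	return (uint8_t)timeout;
}

/*
 * Returns 1 when more than wqe_timeout whole seconds separate submit_time
 * from now, using the same "elapsed seconds > timeout" comparison as the
 * timer callback in the diff. IOs without a software timeout never expire.
 */
int
sketch_io_expired(const struct sketch_io *io, struct timeval now)
{
	struct timeval elapsed;

	if (io->wqe_timeout == 0)
		return 0;

	elapsed.tv_sec = now.tv_sec - io->submit_time.tv_sec;
	elapsed.tv_usec = now.tv_usec - io->submit_time.tv_usec;
	if (elapsed.tv_usec < 0) {
		/* borrow one second so tv_usec stays in [0, 1000000) */
		elapsed.tv_sec--;
		elapsed.tv_usec += 1000000;
	}
	return elapsed.tv_sec > (time_t)io->wqe_timeout;
}
```

In the driver itself the equivalent check runs from the periodic WQE timer callback, which then aborts the IO with ocs_hw_io_abort() and latches SLI4_FC_WCQE_STATUS_WQE_TIMEOUT as the saved status; the sketch covers only the arithmetic.
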