From nobody Wed Nov 29 01:55:37 2023 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Sg2Tj2txlz52t9N; Wed, 29 Nov 2023 01:55:37 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Sg2Tj2Qx2z4DZF; Wed, 29 Nov 2023 01:55:37 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1701222937; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=1UQHXWqG7uXgfnDNcLcy4lj0pg4C3ugVkvrbTJ9fB4g=; b=uh/zVMnqNuTf0HE2Gn2RQ+VMVlh1ddyyud9Y/YhO/l815G4Xl3SXxpY42k1gfcspyOFDpM PrGebj7p3gChmoeWrz438XN3g2VCY7XpoiAEjzyePNMNQi77p7RXrxqxLdOJbQZbkCyNAH jJ/BaUq16+NuXqCoVTrypvUrxtXD7z3fWeuYhSc48OrGZf5MMdN1RV87sxO0p+iRgWLYWQ ck5SbnynH5G/1FnLhPjHz4EJrFCVd6qVwIuCl1ezJzdn9G9W/z4vUkomvnRX3rLoZ6/60T cXTOp12XzZidtevGOLXzRmhg6Yz+gWapJjQ/aJ/HegaHlZDehwCOt+xKsSmWLQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1701222937; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=1UQHXWqG7uXgfnDNcLcy4lj0pg4C3ugVkvrbTJ9fB4g=; b=wsPGVmCElBlC2TA/lyw4/CGWYHgRGNlHaHjGYIDZIs8l7JvrOFaVRpMrqo+l9F2FJcqU9W IfeVIso5C69Kcqlz1st3KLI/tXZK430lCuwT6urfOepvv3n4SsTP+zRqSvdHW0ibMPyKLi a6EUEJ3BFcKrWI1UobMVyjuMnR6fCwtki4JmM0D6XU7Y6m7n6XoYZtAPxZ36cR4vsDMJSq 93VIROPeU7Sa9+Vgg1KNbG8N6Cq/f7/McEJPWptwbkf5qvsYMf2na76PTsq0Md1h5ivLAp POKp0blS9rb1mOeJdD7B7ivbgFO5hbpq8sKD46jPcchOF+XYdGh/EJLMtNHSXQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1701222937; a=rsa-sha256; cv=none; b=BsdQmvNiH8cfMw+Ot0P0O+7ZkZaGYsQ/DpRcWMpaG/E3/otbM6dMNLFqVDha+Q5I4Mc4S2 gZ5o4cyZgKyb8DmS0P9TwqHMpLiIn7JY4eY83VEdSiEnL0RmOYbRzhtG0w2xKywtyUwwyI LUTQsH1F7MLaxmG4icwMAx6NtY6hZcYYhoRb5ljqX0ioMncs1tksQtbTfgoQ2FYQlkUDQP AeWd+yKKd6J5yx4nktocYKvk0ORW5rpgaSKgSUTCB9cHRpndczcVnzVk7fvoluXPbjTumt Gx5IpazdM9sRE204pWDugDfddC2f4FGTq/lnqmct4TOiGIHD97jaH/GPJxL70A== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4Sg2Tj1XJHzn29; Wed, 29 Nov 2023 01:55:37 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.17.1/8.17.1) with ESMTP id 3AT1tbI9064468; Wed, 29 Nov 2023 01:55:37 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.17.1/8.17.1/Submit) id 3AT1tbmi064465; Wed, 29 Nov 2023 01:55:37 GMT (envelope-from git) Date: Wed, 29 Nov 2023 01:55:37 GMT Message-Id: <202311290155.3AT1tbmi064465@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Warner Losh Subject: git: 3208a189c1e2 - main - mpi3mr: Fix EINPROGRESS errors hanging the card List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: imp X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: 3208a189c1e2c4ef35daa432fe45629a043d7047 Auto-Submitted: auto-generated The branch main has been updated by imp: URL: https://cgit.FreeBSD.org/src/commit/?id=3208a189c1e2c4ef35daa432fe45629a043d7047 commit 3208a189c1e2c4ef35daa432fe45629a043d7047 Author: Warner Losh AuthorDate: 2023-11-29 01:49:39 +0000 Commit: Warner Losh CommitDate: 2023-11-29 01:49:39 +0000 mpi3mr: Fix EINPROGRESS errors hanging the card Move enqueueing of commands to bus_dmamap_load_ccb callback Fix fundamental difference between FreeBSD and Linux. On Linux, your dma load callback always happends before it returns, so drivers are written to load the map, then submit to hardware. On FreeBSD, the callback may be deferred and return EINPROGRESS. This means the callback is responsible for queueing the request to the hardware is done after the SGL list is created. Make a number of interrelated cahnages: At the end of mpi3mr_prepare_sgls, add a call to mpi3mr_enqueue_request. Split the hardware submission out from the end of mpi3mr_action_scsiio and move it into a new routine mpi3mr_enqueue_request. Move all error completion from the end of mpi3mr_action_scsiio to where the error is detected. We cannot pass errors back from the mpi3mr_enqueue_request to do this on a 'failed' mpi3mr in a centralized place (since it has to be fire and forget). Add comments about zero length SGLs never making it into mpi3mr_prepare_sgls. Keep the code there for the moment, but we only set cm->data to non-NULL when scsiio_req->DataLength is not zero. So the datalength can't be zero and we can't send the zero SGLs. Add commentts about other "impossible" tests in mpi3mr_prepare_sgls that really should be simple asserts of some flavor. Eliminate cm->error_code, since we can't pass data back from the mpi3mr_prepare_sgl callback anymore. In mpi3mr_map_request, call mpi3mr_enqueue_request for the no data case. This seems to work even though we've not done the special zero length handling that was in mpi3mr_prepare_sgls, giving further evidence to it not actually being needed. This is needed for SCSI CDBs that have no data to pass to the drive like TEST UNIT READY. With this change, and the prior ones, we're now able to run with mpi3mr on 128GB systems and very heavy disk load (so many buffers land > 4GB: the driver instructs busdma to never use memory abouve 4GB, which may be too conservative, but an issue for another time). Sponsored by: Netflix Reviewed by: sumit.saxena_broadcom.com, mav, jhb Differential Revision: https://reviews.freebsd.org/D42543 --- sys/dev/mpi3mr/mpi3mr.h | 1 - sys/dev/mpi3mr/mpi3mr_cam.c | 130 +++++++++++++++++++++++++------------------- 2 files changed, 73 insertions(+), 58 deletions(-) diff --git a/sys/dev/mpi3mr/mpi3mr.h b/sys/dev/mpi3mr/mpi3mr.h index b00d2ef562d8..bf23f0d1196c 100644 --- a/sys/dev/mpi3mr/mpi3mr.h +++ b/sys/dev/mpi3mr/mpi3mr.h @@ -466,7 +466,6 @@ struct mpi3mr_cmd { U16 hosttag; U8 req_qidx; Mpi3SCSIIORequest_t io_request; - int error_code; }; struct mpi3mr_chain { diff --git a/sys/dev/mpi3mr/mpi3mr_cam.c b/sys/dev/mpi3mr/mpi3mr_cam.c index 4a525a222577..d16826ef1c16 100644 --- a/sys/dev/mpi3mr/mpi3mr_cam.c +++ b/sys/dev/mpi3mr/mpi3mr_cam.c @@ -85,7 +85,9 @@ #define smp_processor_id() PCPU_GET(cpuid) -static int +static void +mpi3mr_enqueue_request(struct mpi3mr_softc *sc, struct mpi3mr_cmd *cm); +static void mpi3mr_map_request(struct mpi3mr_softc *sc, struct mpi3mr_cmd *cm); void mpi3mr_release_simq_reinit(struct mpi3mr_cam_softc *cam_sc); @@ -117,18 +119,23 @@ static void mpi3mr_prepare_sgls(void *arg, U8 last_chain_sgl_flags; struct mpi3mr_chain *chain_req; Mpi3SCSIIORequest_t *scsiio_req; + union ccb *ccb; cm = (struct mpi3mr_cmd *)arg; sc = cm->sc; scsiio_req = (Mpi3SCSIIORequest_t *) &cm->io_request; + ccb = cm->ccb; if (error) { - cm->error_code = error; device_printf(sc->mpi3mr_dev, "%s: error=%d\n",__func__, error); if (error == EFBIG) { - cm->ccb->ccb_h.status = CAM_REQ_TOO_BIG; - return; + mpi3mr_set_ccbstatus(ccb, CAM_REQ_TOO_BIG); + } else { + mpi3mr_set_ccbstatus(ccb, CAM_REQ_CMP_ERR); } + mpi3mr_release_command(cm); + xpt_done(ccb); + return; } if (cm->data_dir == MPI3MR_READ) @@ -137,10 +144,9 @@ static void mpi3mr_prepare_sgls(void *arg, if (cm->data_dir == MPI3MR_WRITE) bus_dmamap_sync(sc->buffer_dmat, cm->dmamap, BUS_DMASYNC_PREWRITE); - if (nsegs > MPI3MR_SG_DEPTH) { - device_printf(sc->mpi3mr_dev, "SGE count is too large or 0.\n"); - return; - } + + KASSERT(nsegs <= MPI3MR_SG_DEPTH && nsegs > 0, + ("%s: bad SGE count: %d\n", device_get_nameunit(sc->mpi3mr_dev), nsegs)); simple_sgl_flags = MPI3_SGE_FLAGS_ELEMENT_TYPE_SIMPLE | MPI3_SGE_FLAGS_DLAS_SYSTEM; @@ -151,24 +157,15 @@ static void mpi3mr_prepare_sgls(void *arg, sg_local = (U8 *)&scsiio_req->SGL; - if (!scsiio_req->DataLength) { + if (scsiio_req->DataLength == 0) { + /* XXX we don't ever get here when DataLength == 0, right? cm->data is NULL */ + /* This whole if can likely be removed -- we handle it in mpi3mr_request_map */ mpi3mr_build_zero_len_sge(sg_local); - return; + goto enqueue; } sges_left = nsegs; - if (sges_left < 0) { - printf("scsi_dma_map failed: request for %d bytes!\n", - scsiio_req->DataLength); - return; - } - if (sges_left > MPI3MR_SG_DEPTH) { - printf("scsi_dma_map returned unsupported sge count %d!\n", - sges_left); - return; - } - sges_in_segment = (sc->facts.op_req_sz - offsetof(Mpi3SCSIIORequest_t, SGL))/sizeof(Mpi3SGESimple_t); @@ -217,33 +214,51 @@ fill_in_last_segment: i++; } +enqueue: + /* + * Now that we've created the sgls, we send the request to the device. + * Unlike in Linux, dmaload isn't guaranteed to load every time, but + * this function is always called when the resources are available, so + * we can send the request to hardware here always. mpi3mr_map_request + * knows about this quirk and will only take evasive action when an + * error other than EINPROGRESS is returned from dmaload. + */ + mpi3mr_enqueue_request(sc, cm); + return; } -int +static void mpi3mr_map_request(struct mpi3mr_softc *sc, struct mpi3mr_cmd *cm) { u_int32_t retcode = 0; + union ccb *ccb; + ccb = cm->ccb; if (cm->data != NULL) { mtx_lock(&sc->io_lock); /* Map data buffer into bus space */ retcode = bus_dmamap_load_ccb(sc->buffer_dmat, cm->dmamap, - cm->ccb, mpi3mr_prepare_sgls, cm, 0); + ccb, mpi3mr_prepare_sgls, cm, 0); mtx_unlock(&sc->io_lock); - if (retcode) - device_printf(sc->mpi3mr_dev, "bus_dmamap_load(): retcode = %d\n", retcode); - if (retcode == EINPROGRESS) { - device_printf(sc->mpi3mr_dev, "request load in progress\n"); - xpt_freeze_simq(sc->cam_sc->sim, 1); + if (retcode != 0 && retcode != EINPROGRESS) { + device_printf(sc->mpi3mr_dev, + "bus_dmamap_load(): retcode = %d\n", retcode); + /* + * Any other error means prepare_sgls wasn't called, and + * will never be called, so we have to mop up. This error + * should never happen, though. + */ + mpi3mr_set_ccbstatus(ccb, CAM_REQ_CMP_ERR); + mpi3mr_release_command(cm); + xpt_done(ccb); } + } else { + /* + * No data, we enqueue it directly here. + */ + mpi3mr_enqueue_request(sc, cm); } - if (cm->error_code) - return cm->error_code; - if (retcode) - mpi3mr_set_ccbstatus(cm->ccb, CAM_REQ_INVALID); - - return (retcode); } void @@ -911,12 +926,6 @@ mpi3mr_action_scsiio(struct mpi3mr_cam_softc *cam_sc, union ccb *ccb) struct mpi3mr_cmd *cm; uint8_t scsi_opcode, queue_idx; uint32_t mpi_control; - struct mpi3mr_op_req_queue *opreqq = NULL; - U32 data_len_blks = 0; - U32 tracked_io_sz = 0; - U32 ioc_pend_data_len = 0, tg_pend_data_len = 0; - struct mpi3mr_throttle_group_info *tg = NULL; - static int ratelimit; sc = cam_sc->sc; mtx_assert(&sc->mpi3mr_mtx, MA_OWNED); @@ -1103,15 +1112,15 @@ mpi3mr_action_scsiio(struct mpi3mr_cam_softc *cam_sc, union ccb *ccb) case CAM_DATA_SG_PADDR: device_printf(sc->mpi3mr_dev, "%s: physical addresses not supported\n", __func__); - mpi3mr_release_command(cm); mpi3mr_set_ccbstatus(ccb, CAM_REQ_INVALID); + mpi3mr_release_command(cm); xpt_done(ccb); return; case CAM_DATA_SG: device_printf(sc->mpi3mr_dev, "%s: scatter gather is not supported\n", __func__); - mpi3mr_release_command(cm); mpi3mr_set_ccbstatus(ccb, CAM_REQ_INVALID); + mpi3mr_release_command(cm); xpt_done(ccb); return; case CAM_DATA_VADDR: @@ -1128,27 +1137,35 @@ mpi3mr_action_scsiio(struct mpi3mr_cam_softc *cam_sc, union ccb *ccb) cm->data = csio->data_ptr; break; default: - mpi3mr_release_command(cm); mpi3mr_set_ccbstatus(ccb, CAM_REQ_INVALID); - xpt_done(ccb); - return; - } - - /* Prepare SGEs */ - if (mpi3mr_map_request(sc, cm)) { mpi3mr_release_command(cm); xpt_done(ccb); - printf("func: %s line: %d Build SGLs failed\n", __func__, __LINE__); return; } - - opreqq = &sc->op_req_q[queue_idx]; + + /* Prepare SGEs and queue to hardware */ + mpi3mr_map_request(sc, cm); +} + +static void +mpi3mr_enqueue_request(struct mpi3mr_softc *sc, struct mpi3mr_cmd *cm) +{ + static int ratelimit; + struct mpi3mr_op_req_queue *opreqq = &sc->op_req_q[cm->req_qidx]; + struct mpi3mr_throttle_group_info *tg = NULL; + uint32_t data_len_blks = 0; + uint32_t tracked_io_sz = 0; + uint32_t ioc_pend_data_len = 0, tg_pend_data_len = 0; + struct mpi3mr_target *targ = cm->targ; + union ccb *ccb = cm->ccb; + Mpi3SCSIIORequest_t *req = (Mpi3SCSIIORequest_t *)&cm->io_request; if (sc->iot_enable) { - data_len_blks = csio->dxfer_len >> 9; - + data_len_blks = ccb->csio.dxfer_len >> 9; + if ((data_len_blks >= sc->io_throttle_data_length) && targ->io_throttle_enabled) { + tracked_io_sz = data_len_blks; tg = targ->throttle_group; if (tg) { @@ -1206,19 +1223,18 @@ mpi3mr_action_scsiio(struct mpi3mr_cam_softc *cam_sc, union ccb *ccb) if (targ->io_divert) { req->MsgFlags |= MPI3_SCSIIO_MSGFLAGS_DIVERT_TO_FIRMWARE; - mpi_control |= MPI3_SCSIIO_FLAGS_DIVERT_REASON_IO_THROTTLING; + req->Flags = htole32(le32toh(req->Flags) | MPI3_SCSIIO_FLAGS_DIVERT_REASON_IO_THROTTLING); } } - req->Flags = htole32(mpi_control); if (mpi3mr_submit_io(sc, opreqq, (U8 *)&cm->io_request)) { - mpi3mr_release_command(cm); if (tracked_io_sz) { mpi3mr_atomic_sub(&sc->pend_large_data_sz, tracked_io_sz); if (tg) mpi3mr_atomic_sub(&tg->pend_large_data_sz, tracked_io_sz); } mpi3mr_set_ccbstatus(ccb, CAM_RESRC_UNAVAIL); + mpi3mr_release_command(cm); xpt_done(ccb); } else { callout_reset_sbt(&cm->callout, mstosbt(ccb->ccb_h.timeout), 0,