mfi panic on recused on non-recusive mutex MFI I/O lock
Steven Hartland
killing at multiplay.co.uk
Tue Nov 6 00:09:49 UTC 2012
Thanks Doug. I've actually just finished another test run with some more
debugging in, and I believe I've found the reason for the non-recursive
lock panic and at least some of the queuing issues.
The lock recursion is due to mfi_tbolt_reset calling
mfi_process_fw_state_chg_isr with mfi_io_lock held, which in turn calls
mfi_tbolt_init_MFI_queue, which tries to acquire mfi_io_lock again; hence
the panic.
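
To make the call chain concrete, here is a minimal user-space sketch of the
same pattern. It is only a model: toy_mtx and the toy_* functions are
hypothetical stand-ins, not the driver's code, and the toy lock returns an
error where the kernel would panic. The second reset variant shows one
possible restructuring (dropping the lock before the call); it is not
necessarily what the attached patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive mutex such as mfi_io_lock (hypothetical). */
struct toy_mtx {
	bool held;
};

#define TOY_ERECURSE	(-1)	/* where the kernel would panic */

static int
toy_mtx_lock(struct toy_mtx *m)
{
	if (m->held)
		return (TOY_ERECURSE);	/* second acquire by the owner */
	m->held = true;
	return (0);
}

static void
toy_mtx_unlock(struct toy_mtx *m)
{
	assert(m->held);	/* unlocking a lock that is not held is a bug */
	m->held = false;
}

/* Plays the role of mfi_tbolt_init_MFI_queue: takes the lock itself. */
static int
toy_init_queue(struct toy_mtx *m)
{
	int error;

	if ((error = toy_mtx_lock(m)) != 0)
		return (error);
	/* ... queue setup would go here ... */
	toy_mtx_unlock(m);
	return (0);
}

/* Reset path that keeps the lock held across the call: recurses. */
static int
toy_reset_recursing(struct toy_mtx *m)
{
	int error;

	if ((error = toy_mtx_lock(m)) != 0)
		return (error);
	error = toy_init_queue(m);
	toy_mtx_unlock(m);
	return (error);
}

/* Reset path that drops the lock before the call: no recursion. */
static int
toy_reset_dropping_lock(struct toy_mtx *m)
{
	int error;

	if ((error = toy_mtx_lock(m)) != 0)
		return (error);
	/* ... work that needs the lock ... */
	toy_mtx_unlock(m);
	return (toy_init_queue(m));
}
```

The other common way out is a variant of the callee that asserts the lock
is already owned instead of acquiring it; which one is right depends on
what else the callee touches.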
The attached mfi-lock.txt should, I believe, fix this, as well as what
appears to be an invalid call to mtx_unlock(&sc->mfi_io_lock) in mfi_attach,
which never acquires the lock as far as I can see; possibly a cut and
paste error.
The invalid queue problems seem to stem from the error cases of
the calls to mfi_mapcmd, some of which call mfi_release_command, which
blindly sets cm_flags = 0 and then enqueues the command on the free queue.
Depending on the flow of mfi_mapcmd and where the error occurs, the
command may or may not already have been put on the busy queue, so it can
end up linked on one queue while flagged as being on another.
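
The following user-space sketch models that theory. All names (toy_cmd,
toy_softc, the TOY_* flags and the toy_* functions) are hypothetical, and
the flag values are illustrative rather than the driver's; it only shows
how clearing the flags without unlinking leaves a command linked on the
busy queue while flagged as free, and how a release that checks first
avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Illustrative queue-membership flags (not the driver's values). */
#define TOY_ON_FREE	(1 << 0)
#define TOY_ON_BUSY	(1 << 1)
#define TOY_ON_MASK	(TOY_ON_FREE | TOY_ON_BUSY)

struct toy_cmd {
	TAILQ_ENTRY(toy_cmd) link;
	int flags;
};
TAILQ_HEAD(toy_queue, toy_cmd);

struct toy_softc {
	struct toy_queue free_q;
	struct toy_queue busy_q;
};

static void
toy_init(struct toy_softc *sc)
{
	TAILQ_INIT(&sc->free_q);
	TAILQ_INIT(&sc->busy_q);
}

/* Enqueue and record membership; refuses a command flagged as queued. */
static void
toy_enqueue(struct toy_queue *q, struct toy_cmd *cm, int qflag)
{
	assert((cm->flags & TOY_ON_MASK) == 0);
	TAILQ_INSERT_TAIL(q, cm, link);
	cm->flags |= qflag;
}

/* Unlink and clear membership; refuses a command not flagged for q. */
static void
toy_remove(struct toy_queue *q, struct toy_cmd *cm, int qflag)
{
	assert((cm->flags & qflag) != 0);
	TAILQ_REMOVE(q, cm, link);
	cm->flags &= ~qflag;
}

/* Models the suspected bug: wipe the flags, then put on the free queue. */
static void
toy_release_blind(struct toy_softc *sc, struct toy_cmd *cm)
{
	cm->flags = 0;		/* forgets that cm may be on busy_q */
	toy_enqueue(&sc->free_q, cm, TOY_ON_FREE);
}

/* Unlinks from the busy queue first if the flags say it is there. */
static void
toy_release_checked(struct toy_softc *sc, struct toy_cmd *cm)
{
	if (cm->flags & TOY_ON_BUSY)
		toy_remove(&sc->busy_q, cm, TOY_ON_BUSY);
	cm->flags = 0;
	toy_enqueue(&sc->free_q, cm, TOY_ON_FREE);
}
```

After toy_release_blind() on a busy command, the busy queue head still
points at it even though its flags say it is on the free queue, which is
the kind of mismatch that later trips a "command not in queue" check.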
I'm going to investigate this further, but that's my current theory.
Your patch seems quite extensive, so if you could give me a brief
rundown of the changes, that would be most appreciated.
FYI, I'm aware that the underlying cause of my problems is a hardware
fault (likely cable or backplane related), but it does mean I'm in a
position to test these usually rare error cases, so I want to make
the most of it before we get the hardware swapped out.
Regards
Steve
----- Original Message -----
From: "Doug Ambrisko" <ambrisko at ambrisko.com>
To: "Steven Hartland" <killing at multiplay.co.uk>
Cc: <freebsd-stable at freebsd.org>; <freebsd-scsi at freebsd.org>
Sent: Monday, November 05, 2012 9:29 PM
Subject: Re: mfi panic on recused on non-recusive mutex MFI I/O lock
> On Mon, Nov 05, 2012 at 04:55:11PM -0000, Steven Hartland wrote:
> | I've managed to get the machine to reproduce this fairly regularly
> | now.
> |
> | Without a debug kernel it still results in a panic, just at a later
> | stage, or so I believe; the non-debug panic message is "command not
> | in queue".
> |
> | In each non-debug panic I've seen, the cm_flags indicate that the
> | command being dequeued is on the busy queue and not on the expected
> | free or ready queue which is being processed at the time.
> |
> | The triggering issue seems to be the adapter reset code run from
> | mfi_timeout.
> |
> | I've had a good look but can't see how a cm could be in a queue yet
> | have its cm_flags set to that of a different queue as all manipulation
> | seems to be being done via the "mfi_<method> ## name" macros which
> | all correctly maintain the queue / cm_flags relationship.
> |
> | At this point I believe it could be a thread being interrupted by
> | a timeout part way through the processing of a queue request, hence queue
> | and cm_flags being out of sync.
> |
> | Any pointers on how to debug this issue further / fix it would be most
> | appreciated.
> |
> | Regards
> | Steve
> |
> | ----- Original Message -----
> | From: "Steven Hartland"
> | >Testing a new machine which is based on 8.3-RELEASE with the mfi
> | >driver from 8-STABLE and just got a panic.
> | >
> | >
> | >The below is a transcription, hand copied from the console:-
> | >mfi0: sense error 0, sense_key 0, asc 0, ascq 0
> | >mfisyspd5: hard error cmd=write 90827650-90827905
> | >mfi0: I/O error, status= 46 scsi_status= 240
> | >mfi0: sense error 0, sense_key 0, asc 0, ascq 0
> | >mfisyspd5: hard error cmd=write 90827394-90827649
> | >mfi0: I/O error, status= 46 scsi_status= 240
> | >mfi0: sense error 0, sense_key 0, asc 0, ascq 0
> | >mfisyspd5: hard error cmd=write 90827138-90827393
> | >mfi0: I/O error, status= 46 scsi_status= 240
> | >mfi0: sense error 0, sense_key 0, asc 0, ascq 0
> | >mfisyspd5: hard error cmd=write 90826882-90827137
> | >mfi0: I/O error, status= 2 scsi_status= 2
> | >mfi0: sense error 112, sense_key 6, asc 41, ascq 0
> | >mfisyspd4: hard error cmd=write 90830466-90830721
> | >mfi0: I/O error, status= 2 scsi_status= 2
> | >mfi0: sense error 112, sense_key 6, asc 41, ascq 0
> | >mfisyspd5: hard error cmd=write 90830722-90830977
> | >mfi0: Adapter RESET condition detected
> | >mfi0: First state FW reset initiated...
> | >mfi0: ADP_RESET_TBOLT: HostDiag=a0
> | >mfi0: first state of reset complete, second state initiated...
> | >mfi0: Second state FW reset initiated...
> | >panic: _mtx_lock_sleep: recursed on non-recursive mutex MFI I/O lock @
> | >/usr/src/sys/dev/mfi/mfi_tbolt:346
> | >
> | >cpuid = 6
> | >KDB: stack backtrace:
> | >db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> | >kdb_backtrace() at kdb_backtrace+0x37
> | >panic() at panic+0x178
> | >_mtx_lock_sleep() at _mtx_lock_sleep+0x152
> | >_mtx_lock_flags() at _mtx_lock_flags+0x80
> | >mfi_tbolt_init_MFI_queue() at mfi_tbolt_init_MFI_queue+0x72
> | >mfi_timeout() at mfi_timeout+0x27
> | >softclock() at softclock+0x2aa
> | >intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> | >ithread_loop() at ithread_loop+0xb2
> | >fork_exit() at fork_exit+0x135
> | >fork_trampoline() at fork_trampoline+0xe
> | >--- trap 0, rip = 0, rsp = 0xffffff80005ccd00, rbp = 0 ---
> | >KDB: enter panic
> | >[thread pid 12 tid 100020 ]
> | >Stopped at kdb_enter+0x3b: movq $0,0x51cb32(%rip)
> | >db>
> | >
> | >So questions:-
> | >1. What are the "hard error" errors? The machine was testing IO
> | >with dd but due to the panic I can't tell if that was the cause.
> | >2. Looking at the code this seems like the reset was tripped by
> | >a firmware bug, is that the case?
> | >3. Is the fix for the panic a simple one we can test?
>
> As soon as I get caught up on email, I'll be checking in some updates
> that fix a bunch of issues. I don't know if it will fix this but
> it could help. In any case it would get us to a common base line to
> look at debugging this. The patch is attached and follows. It is
> relative to -head. It should be easy to head and this to stable.
>
> Index: mfi_tbolt.c
> ===================================================================
> --- mfi_tbolt.c (revision 242617)
> +++ mfi_tbolt.c (working copy)
> @@ -69,13 +69,10 @@
> mfi_build_mpt_pass_thru(struct mfi_softc *sc, struct mfi_command *mfi_cmd);
> union mfi_mpi2_request_descriptor *mfi_build_and_issue_cmd(struct mfi_softc
> *sc, struct mfi_command *mfi_cmd);
> -int mfi_tbolt_is_ldio(struct mfi_command *mfi_cmd);
> void mfi_tbolt_build_ldio(struct mfi_softc *sc, struct mfi_command *mfi_cmd,
> struct mfi_cmd_tbolt *cmd);
> static int mfi_tbolt_make_sgl(struct mfi_softc *sc, struct mfi_command
> *mfi_cmd, pMpi25IeeeSgeChain64_t sgl_ptr, struct mfi_cmd_tbolt *cmd);
> -static int mfi_tbolt_build_cdb(struct mfi_softc *sc, struct mfi_command
> - *mfi_cmd, uint8_t *cdb);
> void
> map_tbolt_cmd_status(struct mfi_command *mfi_cmd, uint8_t status,
> uint8_t ext_status);
> @@ -502,6 +499,7 @@
> + i * MEGASAS_MAX_SZ_CHAIN_FRAME);
> cmd->sg_frame_phys_addr = sc->sg_frame_busaddr + i
> * MEGASAS_MAX_SZ_CHAIN_FRAME;
> + cmd->sync_cmd_idx = sc->mfi_max_fw_cmds;
>
> TAILQ_INSERT_TAIL(&(sc->mfi_cmd_tbolt_tqh), cmd, next);
> }
> @@ -608,6 +606,8 @@
> }
> }
>
> +int outstanding = 0;
> +
> /*
> * mfi_tbolt_return_cmd - Return a cmd to free command pool
> * @instance: Adapter soft state
> @@ -618,7 +618,9 @@
> {
> mtx_assert(&sc->mfi_io_lock, MA_OWNED);
>
> + cmd->sync_cmd_idx = sc->mfi_max_fw_cmds;
> TAILQ_INSERT_TAIL(&sc->mfi_cmd_tbolt_tqh, cmd, next);
> + outstanding--;
> }
>
> void
> @@ -667,16 +669,27 @@
> extStatus = cmd_mfi->cm_frame->dcmd.header.scsi_status;
> map_tbolt_cmd_status(cmd_mfi, status, extStatus);
>
> - /* remove command from busy queue if not polled */
> - TAILQ_FOREACH(cmd_mfi_check, &sc->mfi_busy, cm_link) {
> - if (cmd_mfi_check == cmd_mfi) {
> - mfi_remove_busy(cmd_mfi);
> - break;
> + if (cmd_mfi->cm_flags & MFI_CMD_SCSI &&
> + (cmd_mfi->cm_flags & MFI_CMD_POLLED) != 0) {
> + /* polled LD/SYSPD IO command */
> + cmd_mfi->cm_error = 0;
> + cmd_mfi->cm_frame->header.cmd_status = 0;
> + mfi_tbolt_return_cmd(sc, cmd_tbolt);
> + } else {
> +
> + /* remove command from busy queue if not polled */
> + TAILQ_FOREACH(cmd_mfi_check, &sc->mfi_busy, cm_link) {
> + if (cmd_mfi_check == cmd_mfi) {
> + mfi_remove_busy(cmd_mfi);
> + break;
> + }
> }
> +
> + /* complete the command */
> + cmd_mfi->cm_error = 0;
> + mfi_complete(sc, cmd_mfi);
> + mfi_tbolt_return_cmd(sc, cmd_tbolt);
> }
> - cmd_mfi->cm_error = 0;
> - mfi_complete(sc, cmd_mfi);
> - mfi_tbolt_return_cmd(sc, cmd_tbolt);
>
> sc->last_reply_idx++;
> if (sc->last_reply_idx >= sc->mfi_max_fw_cmds) {
> @@ -746,6 +759,7 @@
> p = sc->request_desc_pool + sizeof(union mfi_mpi2_request_descriptor)
> * index;
> memset(p, 0, sizeof(union mfi_mpi2_request_descriptor));
> + outstanding++;
> return (union mfi_mpi2_request_descriptor *)p;
> }
>
> @@ -811,13 +825,13 @@
> MFI_FRAME_DIR_READ)
> io_info.isRead = 1;
>
> - io_request->RaidContext.timeoutValue
> - = MFI_FUSION_FP_DEFAULT_TIMEOUT;
> - io_request->Function = MPI2_FUNCTION_LD_IO_REQUEST;
> - io_request->DevHandle = device_id;
> - cmd->request_desc->header.RequestFlags
> - = (MFI_REQ_DESCRIPT_FLAGS_LD_IO
> - << MFI_REQ_DESCRIPT_FLAGS_TYPE_SHIFT);
> + io_request->RaidContext.timeoutValue
> + = MFI_FUSION_FP_DEFAULT_TIMEOUT;
> + io_request->Function = MPI2_FUNCTION_LD_IO_REQUEST;
> + io_request->DevHandle = device_id;
> + cmd->request_desc->header.RequestFlags
> + = (MFI_REQ_DESCRIPT_FLAGS_LD_IO
> + << MFI_REQ_DESCRIPT_FLAGS_TYPE_SHIFT);
> if ((io_request->IoFlags == 6) && (io_info.numBlocks == 0))
> io_request->RaidContext.RegLockLength = 0x100;
> io_request->DataLength = mfi_cmd->cm_frame->io.header.data_len
> @@ -825,41 +839,37 @@
> }
>
> int
> -mfi_tbolt_is_ldio(struct mfi_command *mfi_cmd)
> -{
> - if (mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_READ
> - || mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_WRITE)
> - return 1;
> - else
> - return 0;
> -}
> -
> -int
> mfi_tbolt_build_io(struct mfi_softc *sc, struct mfi_command *mfi_cmd,
> struct mfi_cmd_tbolt *cmd)
> {
> - uint32_t device_id;
> + struct mfi_mpi2_request_raid_scsi_io *io_request;
> uint32_t sge_count;
> - uint8_t cdb[32], cdb_len;
> + uint8_t cdb_len;
> + int readop;
> + u_int64_t lba;
>
> - memset(cdb, 0, 32);
> - struct mfi_mpi2_request_raid_scsi_io *io_request = cmd->io_request;
> -
> - device_id = mfi_cmd->cm_frame->header.target_id;
> -
> - /* Have to build CDB here for TB as BSD don't have a scsi layer */
> - if ((cdb_len = mfi_tbolt_build_cdb(sc, mfi_cmd, cdb)) == 1)
> + io_request = cmd->io_request;
> + if (!(mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_READ
> + || mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_WRITE))
> return 1;
>
> - /* Just the CDB length,rest of the Flags are zero */
> - io_request->IoFlags = cdb_len;
> - memcpy(io_request->CDB.CDB32, cdb, 32);
> + mfi_tbolt_build_ldio(sc, mfi_cmd, cmd);
>
> - if (mfi_tbolt_is_ldio(mfi_cmd))
> - mfi_tbolt_build_ldio(sc, mfi_cmd , cmd);
> + /* Convert to SCSI command CDB */
> + bzero(io_request->CDB.CDB32, sizeof(io_request->CDB.CDB32));
> + if (mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_WRITE)
> + readop = 0;
> else
> - return 1;
> + readop = 1;
>
> + lba = mfi_cmd->cm_frame->io.lba_hi;
> + lba = (lba << 32) + mfi_cmd->cm_frame->io.lba_lo;
> + cdb_len = mfi_build_cdb(readop, 0, lba,
> + mfi_cmd->cm_frame->io.header.data_len, io_request->CDB.CDB32);
> +
> + /* Just the CDB length, rest of the Flags are zero */
> + io_request->IoFlags = cdb_len;
> +
> /*
> * Construct SGL
> */
> @@ -883,85 +893,13 @@
>
> io_request->SenseBufferLowAddress = mfi_cmd->cm_sense_busaddr;
> io_request->SenseBufferLength = MFI_SENSE_LEN;
> + io_request->RaidContext.Status = MFI_STAT_INVALID_STATUS;
> + io_request->RaidContext.exStatus = MFI_STAT_INVALID_STATUS;
> +
> return 0;
> }
>
> -static int
> -mfi_tbolt_build_cdb(struct mfi_softc *sc, struct mfi_command *mfi_cmd,
> - uint8_t *cdb)
> -{
> - uint32_t lba_lo, lba_hi, num_lba;
> - uint8_t cdb_len;
>
> - if (mfi_cmd == NULL || cdb == NULL)
> - return 1;
> - num_lba = mfi_cmd->cm_frame->io.header.data_len;
> - lba_lo = mfi_cmd->cm_frame->io.lba_lo;
> - lba_hi = mfi_cmd->cm_frame->io.lba_hi;
> -
> - if (lba_hi == 0 && (num_lba <= 0xFF) && (lba_lo <= 0x1FFFFF)) {
> - if (mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_WRITE)
> - /* Read 6 or Write 6 */
> - cdb[0] = (uint8_t) (0x0A);
> - else
> - cdb[0] = (uint8_t) (0x08);
> -
> - cdb[4] = (uint8_t) num_lba;
> - cdb[3] = (uint8_t) (lba_lo & 0xFF);
> - cdb[2] = (uint8_t) (lba_lo >> 8);
> - cdb[1] = (uint8_t) ((lba_lo >> 16) & 0x1F);
> - cdb_len = 6;
> - }
> - else if (lba_hi == 0 && (num_lba <= 0xFFFF) && (lba_lo <= 0xFFFFFFFF)) {
> - if (mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_WRITE)
> - /* Read 10 or Write 10 */
> - cdb[0] = (uint8_t) (0x2A);
> - else
> - cdb[0] = (uint8_t) (0x28);
> - cdb[8] = (uint8_t) (num_lba & 0xFF);
> - cdb[7] = (uint8_t) (num_lba >> 8);
> - cdb[5] = (uint8_t) (lba_lo & 0xFF);
> - cdb[4] = (uint8_t) (lba_lo >> 8);
> - cdb[3] = (uint8_t) (lba_lo >> 16);
> - cdb[2] = (uint8_t) (lba_lo >> 24);
> - cdb_len = 10;
> - } else if ((num_lba > 0xFFFF) && (lba_hi == 0)) {
> - if (mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_WRITE)
> - /* Read 12 or Write 12 */
> - cdb[0] = (uint8_t) (0xAA);
> - else
> - cdb[0] = (uint8_t) (0xA8);
> - cdb[9] = (uint8_t) (num_lba & 0xFF);
> - cdb[8] = (uint8_t) (num_lba >> 8);
> - cdb[7] = (uint8_t) (num_lba >> 16);
> - cdb[6] = (uint8_t) (num_lba >> 24);
> - cdb[5] = (uint8_t) (lba_lo & 0xFF);
> - cdb[4] = (uint8_t) (lba_lo >> 8);
> - cdb[3] = (uint8_t) (lba_lo >> 16);
> - cdb[2] = (uint8_t) (lba_lo >> 24);
> - cdb_len = 12;
> - } else {
> - if (mfi_cmd->cm_frame->header.cmd == MFI_CMD_LD_WRITE)
> - cdb[0] = (uint8_t) (0x8A);
> - else
> - cdb[0] = (uint8_t) (0x88);
> - cdb[13] = (uint8_t) (num_lba & 0xFF);
> - cdb[12] = (uint8_t) (num_lba >> 8);
> - cdb[11] = (uint8_t) (num_lba >> 16);
> - cdb[10] = (uint8_t) (num_lba >> 24);
> - cdb[9] = (uint8_t) (lba_lo & 0xFF);
> - cdb[8] = (uint8_t) (lba_lo >> 8);
> - cdb[7] = (uint8_t) (lba_lo >> 16);
> - cdb[6] = (uint8_t) (lba_lo >> 24);
> - cdb[5] = (uint8_t) (lba_hi & 0xFF);
> - cdb[4] = (uint8_t) (lba_hi >> 8);
> - cdb[3] = (uint8_t) (lba_hi >> 16);
> - cdb[2] = (uint8_t) (lba_hi >> 24);
> - cdb_len = 16;
> - }
> - return cdb_len;
> -}
> -
> static int
> mfi_tbolt_make_sgl(struct mfi_softc *sc, struct mfi_command *mfi_cmd,
> pMpi25IeeeSgeChain64_t sgl_ptr, struct mfi_cmd_tbolt *cmd)
> @@ -1100,8 +1038,7 @@
> if ((cm->cm_flags & MFI_CMD_POLLED) == 0) {
> cm->cm_timestamp = time_uptime;
> mfi_enqueue_busy(cm);
> - }
> - else { /* still get interrupts for it */
> + } else { /* still get interrupts for it */
> hdr->cmd_status = MFI_STAT_INVALID_STATUS;
> hdr->flags |= MFI_FRAME_DONT_POST_IN_REPLY_QUEUE;
> }
> @@ -1118,31 +1055,49 @@
> }
> else
> device_printf(sc->mfi_dev, "DJA NA XXX SYSPDIO\n");
> - }
> - else if (hdr->cmd == MFI_CMD_LD_SCSI_IO ||
> + } else if (hdr->cmd == MFI_CMD_LD_SCSI_IO ||
> hdr->cmd == MFI_CMD_LD_READ || hdr->cmd == MFI_CMD_LD_WRITE) {
> + cm->cm_flags |= MFI_CMD_SCSI;
> if ((req_desc = mfi_build_and_issue_cmd(sc, cm)) == NULL) {
> device_printf(sc->mfi_dev, "LDIO Failed \n");
> return 1;
> }
> - } else
> - if ((req_desc = mfi_tbolt_build_mpt_cmd(sc, cm)) == NULL) {
> + } else if ((req_desc = mfi_tbolt_build_mpt_cmd(sc, cm)) == NULL) {
> device_printf(sc->mfi_dev, "Mapping from MFI to MPT "
> "Failed\n");
> return 1;
> - }
> + }
> +
> + if (cm->cm_flags & MFI_CMD_SCSI) {
> + /*
> + * LD IO needs to be posted since it doesn't get
> + * acknowledged via a status update so have the
> + * controller reply via mfi_tbolt_complete_cmd.
> + */
> + hdr->flags &= ~MFI_FRAME_DONT_POST_IN_REPLY_QUEUE;
> + }
> +
> MFI_WRITE4(sc, MFI_ILQP, (req_desc->words & 0xFFFFFFFF));
> MFI_WRITE4(sc, MFI_IHQP, (req_desc->words >>0x20));
>
> if ((cm->cm_flags & MFI_CMD_POLLED) == 0)
> return 0;
>
> + if (cm->cm_flags & MFI_CMD_SCSI) {
> + /* check reply queue */
> + mfi_tbolt_complete_cmd(sc);
> + }
> +
> /* This is a polled command, so busy-wait for it to complete. */
> while (hdr->cmd_status == MFI_STAT_INVALID_STATUS) {
> DELAY(1000);
> tm -= 1;
> if (tm <= 0)
> - break;
> + break;
> + if (cm->cm_flags & MFI_CMD_SCSI) {
> + /* check reply queue */
> + mfi_tbolt_complete_cmd(sc);
> + }
> }
>
> if (hdr->cmd_status == MFI_STAT_INVALID_STATUS) {
> @@ -1375,7 +1330,7 @@
> free(ld_sync, M_MFIBUF);
> goto out;
> }
> -
> +
> context = cmd->cm_frame->header.context;
> bzero(cmd->cm_frame, sizeof(union mfi_frame));
> cmd->cm_frame->header.context = context;
> Index: mfi_disk.c
> ===================================================================
> --- mfi_disk.c (revision 242617)
> +++ mfi_disk.c (working copy)
> @@ -93,6 +93,7 @@
> {
> struct mfi_disk *sc;
> struct mfi_ld_info *ld_info;
> + struct mfi_disk_pending *ld_pend;
> uint64_t sectors;
> uint32_t secsize;
> char *state;
> @@ -111,6 +112,13 @@
> secsize = MFI_SECTOR_LEN;
> mtx_lock(&sc->ld_controller->mfi_io_lock);
> TAILQ_INSERT_TAIL(&sc->ld_controller->mfi_ld_tqh, sc, ld_link);
> + TAILQ_FOREACH(ld_pend, &sc->ld_controller->mfi_ld_pend_tqh,
> + ld_link) {
> + TAILQ_REMOVE(&sc->ld_controller->mfi_ld_pend_tqh,
> + ld_pend, ld_link);
> + free(ld_pend, M_MFIBUF);
> + break;
> + }
> mtx_unlock(&sc->ld_controller->mfi_io_lock);
>
> switch (ld_info->ld_config.params.state) {
> @@ -131,16 +139,16 @@
> break;
> }
>
> - if ( strlen(ld_info->ld_config.properties.name) == 0 ) {
> - device_printf(dev,
> - "%juMB (%ju sectors) RAID volume (no label) is %s\n",
> - sectors / (1024 * 1024 / secsize), sectors, state);
> - } else {
> - device_printf(dev,
> - "%juMB (%ju sectors) RAID volume '%s' is %s\n",
> - sectors / (1024 * 1024 / secsize), sectors,
> - ld_info->ld_config.properties.name, state);
> - }
> + if ( strlen(ld_info->ld_config.properties.name) == 0 ) {
> + device_printf(dev,
> + "%juMB (%ju sectors) RAID volume (no label) is %s\n",
> + sectors / (1024 * 1024 / secsize), sectors, state);
> + } else {
> + device_printf(dev,
> + "%juMB (%ju sectors) RAID volume '%s' is %s\n",
> + sectors / (1024 * 1024 / secsize), sectors,
> + ld_info->ld_config.properties.name, state);
> + }
>
> sc->ld_disk = disk_alloc();
> sc->ld_disk->d_drv1 = sc;
> Index: mfivar.h
> ===================================================================
> --- mfivar.h (revision 242617)
> +++ mfivar.h (working copy)
> @@ -106,6 +106,7 @@
> #define MFI_ON_MFIQ_READY (1<<6)
> #define MFI_ON_MFIQ_BUSY (1<<7)
> #define MFI_ON_MFIQ_MASK ((1<<5)|(1<<6)|(1<<7))
> +#define MFI_CMD_SCSI (1<<8)
> uint8_t retry_for_fw_reset;
> void (* cm_complete)(struct mfi_command *cm);
> void *cm_private;
> @@ -126,6 +127,11 @@
> #define MFI_DISK_FLAGS_DISABLED 0x02
> };
>
> +struct mfi_disk_pending {
> + TAILQ_ENTRY(mfi_disk_pending) ld_link;
> + int ld_id;
> +};
> +
> struct mfi_system_pd {
> TAILQ_ENTRY(mfi_system_pd) pd_link;
> device_t pd_dev;
> @@ -137,6 +143,11 @@
> int pd_flags;
> };
>
> +struct mfi_system_pending {
> + TAILQ_ENTRY(mfi_system_pending) pd_link;
> + int pd_id;
> +};
> +
> struct mfi_evt_queue_elm {
> TAILQ_ENTRY(mfi_evt_queue_elm) link;
> struct mfi_evt_detail detail;
> @@ -285,6 +296,8 @@
>
> TAILQ_HEAD(,mfi_disk) mfi_ld_tqh;
> TAILQ_HEAD(,mfi_system_pd) mfi_syspd_tqh;
> + TAILQ_HEAD(,mfi_disk_pending) mfi_ld_pend_tqh;
> + TAILQ_HEAD(,mfi_system_pending) mfi_syspd_pend_tqh;
> eventhandler_tag mfi_eh;
> struct cdev *mfi_cdev;
>
> @@ -421,7 +434,8 @@
> extern void mfi_tbolt_sync_map_info(struct mfi_softc *sc);
> extern void mfi_handle_map_sync(void *context, int pending);
> extern int mfi_dcmd_command(struct mfi_softc *, struct mfi_command **,
> - uint32_t, void **, size_t);
> + uint32_t, void **, size_t);
> +extern int mfi_build_cdb(int, uint8_t, u_int64_t, u_int32_t, uint8_t *);
>
> #define MFIQ_ADD(sc, qname) \
> do { \
> Index: mfi.c
> ===================================================================
> --- mfi.c (revision 242617)
> +++ mfi.c (working copy)
> @@ -106,11 +106,9 @@
> static struct mfi_command * mfi_bio_command(struct mfi_softc *);
> static void mfi_bio_complete(struct mfi_command *);
> static struct mfi_command *mfi_build_ldio(struct mfi_softc *,struct bio*);
> -static int mfi_build_syspd_cdb(struct mfi_pass_frame *pass, uint32_t block_count,
> - uint64_t lba, uint8_t byte2, int readop);
> static struct mfi_command *mfi_build_syspdio(struct mfi_softc *,struct bio*);
> static int mfi_send_frame(struct mfi_softc *, struct mfi_command *);
> -static int mfi_abort(struct mfi_softc *, struct mfi_command *);
> +static int mfi_abort(struct mfi_softc *, struct mfi_command **);
> static int mfi_linux_ioctl_int(struct cdev *, u_long, caddr_t, int, struct thread *);
> static void mfi_timeout(void *);
> static int mfi_user_command(struct mfi_softc *,
> @@ -376,6 +374,8 @@
> sx_init(&sc->mfi_config_lock, "MFI config");
> TAILQ_INIT(&sc->mfi_ld_tqh);
> TAILQ_INIT(&sc->mfi_syspd_tqh);
> + TAILQ_INIT(&sc->mfi_ld_pend_tqh);
> + TAILQ_INIT(&sc->mfi_syspd_pend_tqh);
> TAILQ_INIT(&sc->mfi_evt_queue);
> TASK_INIT(&sc->mfi_evt_task, 0, mfi_handle_evt, sc);
> TASK_INIT(&sc->mfi_map_sync_task, 0, mfi_handle_map_sync, sc);
> @@ -1281,6 +1281,17 @@
> struct mfi_command *cm;
> int error;
>
> +
> + if (sc->mfi_aen_cm)
> + sc->cm_aen_abort = 1;
> + if (sc->mfi_aen_cm != NULL)
> + mfi_abort(sc, &sc->mfi_aen_cm);
> +
> + if (sc->mfi_map_sync_cm)
> + sc->cm_map_abort = 1;
> + if (sc->mfi_map_sync_cm != NULL)
> + mfi_abort(sc, &sc->mfi_map_sync_cm);
> +
> mtx_lock(&sc->mfi_io_lock);
> error = mfi_dcmd_command(sc, &cm, MFI_DCMD_CTRL_SHUTDOWN, NULL, 0);
> if (error) {
> @@ -1288,12 +1299,6 @@
> return (error);
> }
>
> - if (sc->mfi_aen_cm != NULL)
> - mfi_abort(sc, sc->mfi_aen_cm);
> -
> - if (sc->mfi_map_sync_cm != NULL)
> - mfi_abort(sc, sc->mfi_map_sync_cm);
> -
> dcmd = &cm->cm_frame->dcmd;
> dcmd->header.flags = MFI_FRAME_DIR_NONE;
> cm->cm_flags = MFI_CMD_POLLED;
> @@ -1315,6 +1320,7 @@
> struct mfi_command *cm = NULL;
> struct mfi_pd_list *pdlist = NULL;
> struct mfi_system_pd *syspd, *tmp;
> + struct mfi_system_pending *syspd_pend;
> int error, i, found;
>
> sx_assert(&sc->mfi_config_lock, SA_XLOCKED);
> @@ -1355,6 +1361,10 @@
> if (syspd->pd_id == pdlist->addr[i].device_id)
> found = 1;
> }
> + TAILQ_FOREACH(syspd_pend, &sc->mfi_syspd_pend_tqh, pd_link) {
> + if (syspd_pend->pd_id == pdlist->addr[i].device_id)
> + found = 1;
> + }
> if (found == 0)
> mfi_add_sys_pd(sc, pdlist->addr[i].device_id);
> }
> @@ -1390,6 +1400,7 @@
> struct mfi_command *cm = NULL;
> struct mfi_ld_list *list = NULL;
> struct mfi_disk *ld;
> + struct mfi_disk_pending *ld_pend;
> int error, i;
>
> sx_assert(&sc->mfi_config_lock, SA_XLOCKED);
> @@ -1418,6 +1429,10 @@
> if (ld->ld_id == list->ld_list[i].ld.v.target_id)
> goto skip_add;
> }
> + TAILQ_FOREACH(ld_pend, &sc->mfi_ld_pend_tqh, ld_link) {
> + if (ld_pend->ld_id == list->ld_list[i].ld.v.target_id)
> + goto skip_add;
> + }
> mfi_add_ld(sc, list->ld_list[i].ld.v.target_id);
> skip_add:;
> }
> @@ -1620,9 +1635,7 @@
> < current_aen.members.evt_class)
> current_aen.members.evt_class =
> prior_aen.members.evt_class;
> - mtx_lock(&sc->mfi_io_lock);
> - mfi_abort(sc, sc->mfi_aen_cm);
> - mtx_unlock(&sc->mfi_io_lock);
> + mfi_abort(sc, &sc->mfi_aen_cm);
> }
> }
>
> @@ -1814,10 +1827,17 @@
> struct mfi_command *cm;
> struct mfi_dcmd_frame *dcmd = NULL;
> struct mfi_ld_info *ld_info = NULL;
> + struct mfi_disk_pending *ld_pend;
> int error;
>
> mtx_assert(&sc->mfi_io_lock, MA_OWNED);
>
> + ld_pend = malloc(sizeof(*ld_pend), M_MFIBUF, M_NOWAIT | M_ZERO);
> + if (ld_pend != NULL) {
> + ld_pend->ld_id = id;
> + TAILQ_INSERT_TAIL(&sc->mfi_ld_pend_tqh, ld_pend, ld_link);
> + }
> +
> error = mfi_dcmd_command(sc, &cm, MFI_DCMD_LD_GET_INFO,
> (void **)&ld_info, sizeof(*ld_info));
> if (error) {
> @@ -1858,11 +1878,13 @@
> hdr = &cm->cm_frame->header;
> ld_info = cm->cm_private;
>
> - if (hdr->cmd_status != MFI_STAT_OK) {
> + if (sc->cm_map_abort || hdr->cmd_status != MFI_STAT_OK) {
> free(ld_info, M_MFIBUF);
> + wakeup(&sc->mfi_map_sync_cm);
> mfi_release_command(cm);
> return;
> }
> + wakeup(&sc->mfi_map_sync_cm);
> mfi_release_command(cm);
>
> mtx_unlock(&sc->mfi_io_lock);
> @@ -1887,10 +1909,17 @@
> struct mfi_command *cm;
> struct mfi_dcmd_frame *dcmd = NULL;
> struct mfi_pd_info *pd_info = NULL;
> + struct mfi_system_pending *syspd_pend;
> int error;
>
> mtx_assert(&sc->mfi_io_lock, MA_OWNED);
>
> + syspd_pend = malloc(sizeof(*syspd_pend), M_MFIBUF, M_NOWAIT | M_ZERO);
> + if (syspd_pend != NULL) {
> + syspd_pend->pd_id = id;
> + TAILQ_INSERT_TAIL(&sc->mfi_syspd_pend_tqh, syspd_pend, pd_link);
> + }
> +
> error = mfi_dcmd_command(sc, &cm, MFI_DCMD_PD_GET_INFO,
> (void **)&pd_info, sizeof(*pd_info));
> if (error) {
> @@ -1985,9 +2014,12 @@
> return cm;
> }
>
> -static int
> -mfi_build_syspd_cdb(struct mfi_pass_frame *pass, uint32_t block_count,
> - uint64_t lba, uint8_t byte2, int readop)
> +/*
> + * mostly copied from cam/scsi/scsi_all.c:scsi_read_write
> + */
> +
> +int
> +mfi_build_cdb(int readop, uint8_t byte2, u_int64_t lba, u_int32_t block_count, uint8_t *cdb)
> {
> int cdb_len;
>
> @@ -1997,7 +2029,7 @@
> /* We can fit in a 6 byte cdb */
> struct scsi_rw_6 *scsi_cmd;
>
> - scsi_cmd = (struct scsi_rw_6 *)&pass->cdb;
> + scsi_cmd = (struct scsi_rw_6 *)cdb;
> scsi_cmd->opcode = readop ? READ_6 : WRITE_6;
> scsi_ulto3b(lba, scsi_cmd->addr);
> scsi_cmd->length = block_count & 0xff;
> @@ -2007,7 +2039,7 @@
> /* Need a 10 byte CDB */
> struct scsi_rw_10 *scsi_cmd;
>
> - scsi_cmd = (struct scsi_rw_10 *)&pass->cdb;
> + scsi_cmd = (struct scsi_rw_10 *)cdb;
> scsi_cmd->opcode = readop ? READ_10 : WRITE_10;
> scsi_cmd->byte2 = byte2;
> scsi_ulto4b(lba, scsi_cmd->addr);
> @@ -2020,7 +2052,7 @@
> /* Block count is too big for 10 byte CDB use a 12 byte CDB */
> struct scsi_rw_12 *scsi_cmd;
>
> - scsi_cmd = (struct scsi_rw_12 *)&pass->cdb;
> + scsi_cmd = (struct scsi_rw_12 *)cdb;
> scsi_cmd->opcode = readop ? READ_12 : WRITE_12;
> scsi_cmd->byte2 = byte2;
> scsi_ulto4b(lba, scsi_cmd->addr);
> @@ -2035,7 +2067,7 @@
> */
> struct scsi_rw_16 *scsi_cmd;
>
> - scsi_cmd = (struct scsi_rw_16 *)&pass->cdb;
> + scsi_cmd = (struct scsi_rw_16 *)cdb;
> scsi_cmd->opcode = readop ? READ_16 : WRITE_16;
> scsi_cmd->byte2 = byte2;
> scsi_u64to8b(lba, scsi_cmd->addr);
> @@ -2053,15 +2085,15 @@
> {
> struct mfi_command *cm;
> struct mfi_pass_frame *pass;
> - int flags = 0;
> + uint32_t context = 0;
> + int flags = 0, blkcount = 0, readop;
> uint8_t cdb_len;
> - uint32_t block_count, context = 0;
>
> if ((cm = mfi_dequeue_free(sc)) == NULL)
> return (NULL);
>
> /* Zero out the MFI frame */
> - context = cm->cm_frame->header.context;
> + context = cm->cm_frame->header.context;
> bzero(cm->cm_frame, sizeof(union mfi_frame));
> cm->cm_frame->header.context = context;
> pass = &cm->cm_frame->pass;
> @@ -2070,22 +2102,24 @@
> switch (bio->bio_cmd & 0x03) {
> case BIO_READ:
> flags = MFI_CMD_DATAIN;
> + readop = 1;
> break;
> case BIO_WRITE:
> flags = MFI_CMD_DATAOUT;
> + readop = 0;
> break;
> default:
> /* TODO: what about BIO_DELETE??? */
> - panic("Unsupported bio command");
> + panic("Unsupported bio command %x\n", bio->bio_cmd);
> }
>
> /* Cheat with the sector length to avoid a non-constant division */
> - block_count = (bio->bio_bcount + MFI_SECTOR_LEN - 1) / MFI_SECTOR_LEN;
> + blkcount = (bio->bio_bcount + MFI_SECTOR_LEN - 1) / MFI_SECTOR_LEN;
> /* Fill the LBA and Transfer length in CDB */
> - cdb_len = mfi_build_syspd_cdb(pass, block_count, bio->bio_pblkno, 0,
> - flags == MFI_CMD_DATAIN);
> -
> + cdb_len = mfi_build_cdb(readop, 0, bio->bio_pblkno, blkcount,
> + pass->cdb);
> pass->header.target_id = (uintptr_t)bio->bio_driver1;
> + pass->header.lun_id = 0;
> pass->header.timeout = 0;
> pass->header.flags = 0;
> pass->header.scsi_status = 0;
> @@ -2132,7 +2166,7 @@
> break;
> default:
> /* TODO: what about BIO_DELETE??? */
> - panic("Unsupported bio command");
> + panic("Unsupported bio command %x\n", bio->bio_cmd);
> }
>
> /* Cheat with the sector length to avoid a non-constant division */
> @@ -2422,15 +2456,14 @@
> }
>
> static int
> -mfi_abort(struct mfi_softc *sc, struct mfi_command *cm_abort)
> +mfi_abort(struct mfi_softc *sc, struct mfi_command **cm_abort)
> {
> struct mfi_command *cm;
> struct mfi_abort_frame *abort;
> int i = 0;
> uint32_t context = 0;
>
> - mtx_assert(&sc->mfi_io_lock, MA_OWNED);
> -
> + mtx_lock(&sc->mfi_io_lock);
> if ((cm = mfi_dequeue_free(sc)) == NULL) {
> return (EBUSY);
> }
> @@ -2444,29 +2477,27 @@
> abort->header.cmd = MFI_CMD_ABORT;
> abort->header.flags = 0;
> abort->header.scsi_status = 0;
> - abort->abort_context = cm_abort->cm_frame->header.context;
> - abort->abort_mfi_addr_lo = (uint32_t)cm_abort->cm_frame_busaddr;
> + abort->abort_context = (*cm_abort)->cm_frame->header.context;
> + abort->abort_mfi_addr_lo = (uint32_t)(*cm_abort)->cm_frame_busaddr;
> abort->abort_mfi_addr_hi =
> - (uint32_t)((uint64_t)cm_abort->cm_frame_busaddr >> 32);
> + (uint32_t)((uint64_t)(*cm_abort)->cm_frame_busaddr >> 32);
> cm->cm_data = NULL;
> cm->cm_flags = MFI_CMD_POLLED;
>
> - if (sc->mfi_aen_cm)
> - sc->cm_aen_abort = 1;
> - if (sc->mfi_map_sync_cm)
> - sc->cm_map_abort = 1;
> mfi_mapcmd(sc, cm);
> mfi_release_command(cm);
>
> - while (i < 5 && sc->mfi_aen_cm != NULL) {
> - msleep(&sc->mfi_aen_cm, &sc->mfi_io_lock, 0, "mfiabort",
> + mtx_unlock(&sc->mfi_io_lock);
> + while (i < 5 && *cm_abort != NULL) {
> + tsleep(cm_abort, 0, "mfiabort",
> 5 * hz);
> i++;
> }
> - while (i < 5 && sc->mfi_map_sync_cm != NULL) {
> - msleep(&sc->mfi_map_sync_cm, &sc->mfi_io_lock, 0, "mfiabort",
> - 5 * hz);
> - i++;
> + if (*cm_abort != NULL) {
> + /* Force a complete if command didn't abort */
> + mtx_lock(&sc->mfi_io_lock);
> + (*cm_abort)->cm_complete(*cm_abort);
> + mtx_unlock(&sc->mfi_io_lock);
> }
>
> return (0);
> @@ -2522,7 +2553,7 @@
> {
> struct mfi_command *cm;
> struct mfi_pass_frame *pass;
> - int error;
> + int error, readop, cdb_len;
> uint32_t blkcount;
>
> if ((cm = mfi_dequeue_free(sc)) == NULL)
> @@ -2531,21 +2562,24 @@
> pass = &cm->cm_frame->pass;
> bzero(pass->cdb, 16);
> pass->header.cmd = MFI_CMD_PD_SCSI_IO;
> +
> + readop = 0;
> blkcount = (len + MFI_SECTOR_LEN - 1) / MFI_SECTOR_LEN;
> + cdb_len = mfi_build_cdb(readop, 0, lba, blkcount, pass->cdb);
> pass->header.target_id = id;
> pass->header.timeout = 0;
> pass->header.flags = 0;
> pass->header.scsi_status = 0;
> pass->header.sense_len = MFI_SENSE_LEN;
> pass->header.data_len = len;
> - pass->header.cdb_len = mfi_build_syspd_cdb(pass, blkcount, lba, 0, 0);
> + pass->header.cdb_len = cdb_len;
> pass->sense_addr_lo = (uint32_t)cm->cm_sense_busaddr;
> pass->sense_addr_hi = (uint32_t)((uint64_t)cm->cm_sense_busaddr >> 32);
> cm->cm_data = virt;
> cm->cm_len = len;
> cm->cm_sg = &pass->sgl;
> cm->cm_total_frame_size = MFI_PASS_FRAME_SIZE;
> - cm->cm_flags = MFI_CMD_POLLED | MFI_CMD_DATAOUT;
> + cm->cm_flags = MFI_CMD_POLLED | MFI_CMD_DATAOUT | MFI_CMD_SCSI;
>
> error = mfi_mapcmd(sc, cm);
> bus_dmamap_sync(sc->mfi_buffer_dmat, cm->cm_dmamap,
> @@ -2745,16 +2779,24 @@
> }
> }
>
> -static int mfi_check_for_sscd(struct mfi_softc *sc, struct mfi_command *cm)
> +static int
> +mfi_check_for_sscd(struct mfi_softc *sc, struct mfi_command *cm)
> {
> - struct mfi_config_data *conf_data=(struct mfi_config_data *)cm->cm_data;
> + struct mfi_config_data *conf_data;
> struct mfi_command *ld_cm = NULL;
> struct mfi_ld_info *ld_info = NULL;
> + struct mfi_ld_config *ld;
> + char *p;
> int error = 0;
>
> - if ((cm->cm_frame->dcmd.opcode == MFI_DCMD_CFG_ADD) &&
> - (conf_data->ld[0].params.isSSCD == 1)) {
> - error = 1;
> + conf_data = (struct mfi_config_data *)cm->cm_data;
> +
> + if (cm->cm_frame->dcmd.opcode == MFI_DCMD_CFG_ADD) {
> + p = (char *)conf_data->array;
> + p += conf_data->array_size * conf_data->array_count;
> + ld = (struct mfi_ld_config *)p;
> + if (ld->params.isSSCD == 1)
> + error = 1;
> } else if (cm->cm_frame->dcmd.opcode == MFI_DCMD_LD_DELETE) {
> error = mfi_dcmd_command (sc, &ld_cm, MFI_DCMD_LD_GET_INFO,
> (void **)&ld_info, sizeof(*ld_info));
> Index: mfi_cam.c
> ===================================================================
> --- mfi_cam.c (revision 242617)
> +++ mfi_cam.c (working copy)
> @@ -79,6 +79,11 @@
> static struct mfi_command * mfip_start(void *);
> static void mfip_done(struct mfi_command *cm);
>
> +static int mfi_allow_disks = 0;
> +TUNABLE_INT("hw.mfi.allow_cam_disk_passthrough", &mfi_allow_disks);
> +SYSCTL_INT(_hw_mfi, OID_AUTO, allow_cam_disk_passthrough, CTLFLAG_RD,
> + &mfi_allow_disks, 0, "event message locale");
> +
> static devclass_t mfip_devclass;
> static device_method_t mfip_methods[] = {
> DEVMETHOD(device_probe, mfip_probe),
> @@ -349,7 +354,8 @@
> command = csio->cdb_io.cdb_bytes[0];
> if (command == INQUIRY) {
> device = csio->data_ptr[0] & 0x1f;
> - if ((device == T_DIRECT) || (device == T_PROCESSOR))
> + if ((!mfi_allow_disks && device == T_DIRECT) ||
> + (device == T_PROCESSOR))
> csio->data_ptr[0] =
> (csio->data_ptr[0] & 0xe0) | T_NODEVICE;
> }
> Index: mfi_syspd.c
> ===================================================================
> --- mfi_syspd.c (revision 242617)
> +++ mfi_syspd.c (working copy)
> @@ -89,7 +89,6 @@
> static int
> mfi_syspd_probe(device_t dev)
> {
> -
> return (0);
> }
>
> @@ -98,12 +97,12 @@
> {
> struct mfi_system_pd *sc;
> struct mfi_pd_info *pd_info;
> + struct mfi_system_pending *syspd_pend;
> uint64_t sectors;
> uint32_t secsize;
>
> sc = device_get_softc(dev);
> pd_info = device_get_ivars(dev);
> -
> sc->pd_dev = dev;
> sc->pd_id = pd_info->ref.v.device_id;
> sc->pd_unit = device_get_unit(dev);
> @@ -115,6 +114,13 @@
> secsize = MFI_SECTOR_LEN;
> mtx_lock(&sc->pd_controller->mfi_io_lock);
> TAILQ_INSERT_TAIL(&sc->pd_controller->mfi_syspd_tqh, sc, pd_link);
> + TAILQ_FOREACH(syspd_pend, &sc->pd_controller->mfi_syspd_pend_tqh,
> + pd_link) {
> + TAILQ_REMOVE(&sc->pd_controller->mfi_syspd_pend_tqh,
> + syspd_pend, pd_link);
> + free(syspd_pend, M_MFIBUF);
> + break;
> + }
> mtx_unlock(&sc->pd_controller->mfi_io_lock);
> device_printf(dev, "%juMB (%ju sectors) SYSPD volume\n",
> sectors / (1024 * 1024 / secsize), sectors);
> @@ -139,6 +145,7 @@
> disk_create(sc->pd_disk, DISK_VERSION);
>
> device_printf(dev, " SYSPD volume attached\n");
> +
> return (0);
> }
>
>
> Thanks,
>
> Doug A.
>