mps reinitialization triggered system panic
prateek sethi
prateekrootkey at gmail.com
Sun May 29 09:52:44 UTC 2016
I have fixed this issue by retrying mps_request_sync() several times when
it fails. According to my test results, there is almost a 15% chance of
hitting this issue without the fix, which is certainly not a good ratio.
Please comment if anything is wrong with this.
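The same bounded-retry idea can be sketched outside the driver as a small helper. This is only a user-space illustration, not driver code: retry_sync_request(), struct flaky_request and flaky_request_send() are hypothetical names invented for the example, and the callback merely stands in for a synchronous call such as mps_request_sync().

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a synchronous request such as mps_request_sync(). */
typedef int (*sync_request_fn)(void *ctx);

/*
 * Call fn up to max_attempts times and stop at the first success.
 * Returns 0 on success, otherwise the error from the last attempt.
 */
static int
retry_sync_request(sync_request_fn fn, void *ctx, int max_attempts)
{
	int error = EINVAL;	/* returned when max_attempts <= 0 */
	int attempt;

	for (attempt = 0; attempt < max_attempts; attempt++) {
		error = fn(ctx);
		if (error == 0)
			break;
		fprintf(stderr, "request failed (error %d), attempt %d of %d\n",
		    error, attempt + 1, max_attempts);
	}
	return (error);
}

/* Test double: fails a configurable number of times, then succeeds. */
struct flaky_request {
	int failures_left;	/* failures still to be simulated */
	int calls;		/* how many times the request was issued */
};

static int
flaky_request_send(void *ctx)
{
	struct flaky_request *req = ctx;

	req->calls++;
	if (req->failures_left > 0) {
		req->failures_left--;
		return (ETIMEDOUT);
	}
	return (0);
}
```

A request that fails twice and then succeeds returns 0 after three calls, while one that keeps failing returns the last error after max_attempts calls. If the failures were independent (an assumption; a hardware fault may well fail every attempt), a 15% failure rate per attempt would leave roughly 0.15^5, about 0.008%, of cases failing all five tries.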
Index: mps.c
===================================================================
--- mps.c (revision 2474)
+++ mps.c (working copy)
@@ -986,8 +985,8 @@
{
MPI2_DEFAULT_REPLY *reply;
MPI2_IOC_FACTS_REQUEST request;
- int error, req_sz, reply_sz;
-
+ int error, req_sz, reply_sz, retry = 0;
+
MPS_FUNCTRACE(sc);
req_sz = sizeof(MPI2_IOC_FACTS_REQUEST);
@@ -996,8 +995,13 @@
bzero(&request, req_sz);
request.Function = MPI2_FUNCTION_IOC_FACTS;
- error = mps_request_sync(sc, &request, reply, req_sz, reply_sz, 5);
-
+ while(retry < 5){
+ error = mps_request_sync(sc, &request, reply, req_sz, reply_sz, 5);
+ if(!error)
+ break;
+ mps_dprint(sc, MPS_FAULT, "mps_request_sync is failed retry %d\n",retry);
+ retry++;
+ }
return (error);
}
@@ -1006,7 +1010,7 @@
{
MPI2_IOC_INIT_REQUEST init;
MPI2_DEFAULT_REPLY reply;
- int req_sz, reply_sz, error;
+ int req_sz, reply_sz, error, retry = 0;
struct timeval now;
uint64_t time_in_msec;
@@ -1041,11 +1045,16 @@
time_in_msec = (now.tv_sec * 1000 + now.tv_usec/1000);
init.TimeStamp.High = htole32((time_in_msec >> 32) & 0xFFFFFFFF);
init.TimeStamp.Low = htole32(time_in_msec & 0xFFFFFFFF);
+ while(retry < 5){
+ error = mps_request_sync(sc, &init, &reply, req_sz, reply_sz, 5);
+ if ((reply.IOCStatus & MPI2_IOCSTATUS_MASK) != MPI2_IOCSTATUS_SUCCESS)
+ error = ENXIO;
+ if(!error)
+ break;
+ mps_dprint(sc, MPS_FAULT, " mps_request_sync is failed retry %d\n",retry);
+ retry++;
+ }
- error = mps_request_sync(sc, &init, &reply, req_sz, reply_sz, 5);
- if ((reply.IOCStatus & MPI2_IOCSTATUS_MASK) != MPI2_IOCSTATUS_SUCCESS)
- error = ENXIO;
-
mps_dprint(sc, MPS_INIT, "IOCInit status= 0x%x\n", reply.IOCStatus);
return (error);
}
On Tue, Apr 12, 2016 at 6:43 PM, prateek sethi <prateekrootkey at gmail.com>
wrote:
> I am using an LSI SAS2308 HBA card with FreeBSD 9.2. My system panicked
> during the mps reinitialization process. I debugged the core and found
> that the driver failed to allocate iocfacts, which triggered the panic. I
> also found a "*Doorbell failed to activate*" message in the core file.
>
> Please help me find answers to the following questions:
>
> 1. What can cause a Doorbell activation failure?
> 2. How can I fix this issue?
> 3. If it is a hardware issue, how can it be fixed after a reboot?
> 4. *Why should a driver failure panic the system?*
>
>
> I think a panic is not a good way to recover from an error. There
> should be another way to handle this failure.
>
> I am a beginner with drivers, so if I am assuming or asking something
> wrong, please correct me.
>
>
> Regards
> Prateek
>