[Bug 212841] getting panic during mps reinitialization.
bugzilla-noreply at freebsd.org
Tue Sep 27 19:18:30 UTC 2016
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212841
--- Comment #11 from Stephen McConnell <slm at freebsd.org> ---
The reset timing in the driver looks fine to me. The host is required to wait
a certain amount of time when it first accesses the controller during a reset,
and then to wait a certain amount of time when checking registers, etc.
But it looks fine.
What doesn't make sense is that you're waiting some arbitrary amount of time
after the initial failure and then it works. The time that you're waiting comes
after the reset completes and after some calls to other functions. After
all of that, some access to the DOORBELL fails. Then waiting 2 ms fixes it.
That's strange.
There are two ways that this can fail in Step 4 of mps_request_sync(). The
first is when reading the Interrupt Status register. If this register does not
show an interrupt within 5 seconds, it fails (that's a really long time). The
second is when reading the DOORBELL register. If the DOORBELL_USED bit is not
set, it fails. I can't tell which one of these fails. Either way, when it
fails, your fix just waits 2 ms and retries, and then it's successful (at
least within 10 ms, i.e. 5 retries).
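To make the two failure paths concrete, here is a minimal, self-contained
sketch. It is not the mps(4) source: the register model, the bit values, the
poll-count timeout and every sim_* / SIM_* / STEP4_* name are assumptions made
up for illustration. It only shows that Step 4 can fail either because no
interrupt is ever seen, or because the interrupt is seen but the DOORBELL_USED
bit is clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen for illustration only. */
#define SIM_INT_STATUS_DOORBELL 0x1u
#define SIM_DOORBELL_USED       0x2u

enum step4_result {
    STEP4_OK,                /* interrupt seen and DOORBELL_USED set */
    STEP4_NO_INTERRUPT,      /* first failure path: wait timed out */
    STEP4_DOORBELL_NOT_USED  /* second failure path: bit not set */
};

/* Simulated controller state (an assumption, not a real softc). */
struct sim_ctrl {
    uint32_t int_status;
    uint32_t doorbell;
    int polls_until_interrupt; /* negative means "never interrupts" */
};

/* Simulated register read: raises the interrupt after N earlier polls. */
static uint32_t
sim_read_int_status(struct sim_ctrl *c)
{
    if (c->polls_until_interrupt == 0)
        c->int_status |= SIM_INT_STATUS_DOORBELL;
    else if (c->polls_until_interrupt > 0)
        c->polls_until_interrupt--;
    return (c->int_status);
}

/* max_polls stands in for the 5 second timeout described above. */
enum step4_result
sim_step4(struct sim_ctrl *c, int max_polls)
{
    bool seen = false;

    for (int i = 0; i < max_polls; i++) {
        if (sim_read_int_status(c) & SIM_INT_STATUS_DOORBELL) {
            seen = true;
            break;
        }
    }
    if (!seen)
        return (STEP4_NO_INTERRUPT);
    if ((c->doorbell & SIM_DOORBELL_USED) == 0)
        return (STEP4_DOORBELL_NOT_USED);
    return (STEP4_OK);
}
```

Returning a distinct value for each path, or logging one, would be one way to
tell which of the two checks is actually failing here.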
What I'm wondering is, does it really matter that you have a delay between
mps_request_sync() calls? To me, it looks like something is messed up in FW and
just doing a retry fixes it.
Now, with all of that said, I'm not sure there really is a better fix, except
that the delay may not need to be there. Having the delay there would make
someone think that we're just not waiting long enough, which really is not the
case. It also looks a little scary: someone could think the driver timing
for this is very fragile, when it's really not.
Sean, let me know what you think about removing the delay. If you want the
delay, I would at least say to add a comment that explains the delay and retry,
since none of this is really supposed to happen and I think it's some FW or HW
workaround.
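As a rough sketch of what a commented retry could look like, with the delay
made optional: this is not the proposed patch. The function-pointer interface,
the sim_* names and the reading of "5 retries" as one initial attempt plus up
to five more are all assumptions for illustration.

```c
#include <stddef.h>

#define SIM_MAX_RETRIES 5   /* assumed: one initial attempt + 5 retries */

typedef int (*sim_request_fn)(void *ctx);        /* returns 0 on success */
typedef void (*sim_delay_fn)(unsigned int usec); /* may be NULL */

int
sim_request_with_retry(sim_request_fn request, void *ctx,
    sim_delay_fn delay, unsigned int delay_usec, int *attempts_out)
{
    int error = -1;
    int attempts = 0;

    /*
     * Workaround: the first synchronous request after a reset has been
     * seen to fail even though the reset timing is correct. This is
     * believed to be a FW or HW quirk, not a sign that the driver waits
     * too briefly, so simply retry a bounded number of times. The delay
     * between attempts is optional and may turn out to be unnecessary.
     */
    while (attempts < 1 + SIM_MAX_RETRIES) {
        error = request(ctx);
        attempts++;
        if (error == 0)
            break;
        if (delay != NULL && attempts < 1 + SIM_MAX_RETRIES)
            delay(delay_usec);
    }
    if (attempts_out != NULL)
        *attempts_out = attempts;
    return (error);
}

/* Fake request for the sketch: fails a fixed number of times, then works. */
struct sim_flaky {
    int failures_left;
};

int
sim_flaky_request(void *ctx)
{
    struct sim_flaky *f = ctx;

    if (f->failures_left > 0) {
        f->failures_left--;
        return (-1);
    }
    return (0);
}
```

Passing NULL for the delay hook gives the no-delay variant, which would be an
easy way to check whether the 2 ms wait matters at all or whether the retry
alone is what fixes it.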