Thoughts about CAM SWI
Alexander Motin
mav at FreeBSD.org
Wed Jan 4 15:50:12 UTC 2012
Hi.
The question of the extra context switch in CAM, from the interrupt
thread to the CAM SWI on command completion, has been raised many times.
I've tried to analyze how it can be avoided. The main problem I see is
the reentrancy of CAM itself, the peripheral drivers and the SIMs.
In the general case it looks unsafe to handle command completion directly
from the xpt_action() call, as the caller may not expect state changes at
this point. It also looks unsafe to handle command completion directly
from xpt_done() in the interrupt thread, as the SIM may not expect new
requests coming in at this point, for example if some error recovery is
in progress, or if it needs its state to be consistent to handle other
completed (possibly with errors) requests. The most complicated case I
see is when xpt_done() is called from inside the SIM's sim_action()
method. In that case direct command completion may recursively cause
problems for all sides.
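The recursion described above can be shown with a toy model. This is only
a sketch: none of the names below are real CAM symbols, and it assumes a
SIM that completes every request immediately and a peripheral driver that
submits the next request from its completion callback.

```c
#include <assert.h>

/* Toy model only: hypothetical names, not real CAM structures. */
struct toy_sim {
    int in_action;  /* current nesting level of toy_sim_action() */
    int max_depth;  /* deepest nesting observed */
    int completed;  /* requests completed so far */
};

static void toy_sim_action(struct toy_sim *sim, int remaining);

/*
 * Stand-in for a peripheral driver's completion callback: it submits
 * the next request right away, as a real driver may do.
 */
static void toy_periph_done(struct toy_sim *sim, int remaining)
{
    sim->completed++;
    if (remaining > 0)
        toy_sim_action(sim, remaining - 1);
}

/*
 * Stand-in for a SIM action routine that completes the request
 * directly, i.e. the callback runs before this function returns.
 */
static void toy_sim_action(struct toy_sim *sim, int remaining)
{
    sim->in_action++;
    if (sim->in_action > sim->max_depth)
        sim->max_depth = sim->in_action;
    toy_periph_done(sim, remaining);  /* re-enters toy_sim_action() */
    sim->in_action--;
}

/* Run a chain of `count` requests and return the deepest nesting seen. */
static int toy_run_chain(int count)
{
    struct toy_sim sim = { 0, 0, 0 };

    if (count > 0)
        toy_sim_action(&sim, count - 1);
    assert(sim.completed == count);
    return sim.max_depth;
}
```

In this model a chain of N back-to-back requests re-enters the action
routine N levels deep, each time while the outer call is still in progress.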
I've tried to find places where this call loop can be broken. The first
place I thought about was the xpt_action() call. It looks possible
to turn the code upside down to handle completion directly, but not to
submit new requests to the controller immediately if we are called from
the SIM context (another question is how to identify that). The problem is
that there is a set of non-queued SIM requests that may affect its
state, for example XPT_SET_TRAN_SETTINGS or XPT_RESET_BUS. The positive
side is that we may hope to avoid SIM modifications.
The second idea is to allow SIMs to be more reentrant. It can be done in
two ways, which can coexist, at the SIM author's choice:
- adding another version of xpt_done(), like xpt_done_direct(), that
would allow immediate command completion if the SIM is sure it can handle
reentrancy at this point.
- adding some functions, like xpt_batch_start() and xpt_batch_done(),
instructing CAM to queue xpt_done() calls between them as usual, but not
to run the SWI; instead they are handled directly on the xpt_batch_done()
call, which is supposed to be made at a point where the SIM state is
consistent and permits reentrancy.
This approach is very simple from the CAM point of view, but needs small
SIM modifications, while keeping full compatibility with unmodified SIMs.
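The batching idea can be sketched with a small userland model. This is
not the code from the patch: every name below is hypothetical, the queue
is a plain array, and the model assumes the SWI is scheduled once per
burst, when the queue goes from empty to non-empty.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_QUEUE_MAX 16

/* Toy model only: a stand-in for a completion queue, not a CAM type. */
struct toy_xpt {
    int    queue[TOY_QUEUE_MAX]; /* ids of completed requests */
    size_t queued;               /* number of entries in queue[] */
    int    batching;             /* nonzero between batch start/done */
    int    swi_wakeups;          /* times the simulated SWI was scheduled */
    int    handled_swi;          /* completions processed by the SWI */
    int    handled_direct;       /* completions processed in the caller */
};

static void toy_xpt_init(struct toy_xpt *x)
{
    memset(x, 0, sizeof(*x));
}

/* Empty the queue and return how many completions were processed. */
static int toy_drain(struct toy_xpt *x)
{
    int n = (int)x->queued;

    x->queued = 0;
    return n;
}

/* Queue a completion; schedule the SWI only when not batching. */
static void toy_xpt_done(struct toy_xpt *x, int id)
{
    assert(x->queued < TOY_QUEUE_MAX);
    if (!x->batching && x->queued == 0)
        x->swi_wakeups++;        /* this is the extra context switch */
    x->queue[x->queued++] = id;
}

/* What the SWI thread would do once it gets to run. */
static void toy_swi_run(struct toy_xpt *x)
{
    x->handled_swi += toy_drain(x);
}

static void toy_batch_start(struct toy_xpt *x)
{
    x->batching = 1;
}

/* Called where the SIM state is consistent: process without the SWI. */
static void toy_batch_done(struct toy_xpt *x)
{
    x->batching = 0;
    x->handled_direct += toy_drain(x);
}

/* Simulated interrupt handler completing n requests. */
static void toy_interrupt(struct toy_xpt *x, int n, int use_batch)
{
    int i;

    if (use_batch)
        toy_batch_start(x);
    for (i = 0; i < n; i++)
        toy_xpt_done(x, i);
    if (use_batch)
        toy_batch_done(x);
    else
        toy_swi_run(x);
}
```

An unmodified SIM in this model never calls the batch functions, so its
completions still go through the SWI path unchanged.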
I've made a patch implementing the latter way (batching) for all ATA SIMs:
http://people.freebsd.org/~mav/cam_batch.patch
The results can be illustrated by a simple synthetic test:
dd if=/dev/ada0 of=/dev/null bs=512 count=500000
where ada0 is an Intel SATA SSD:
x before
+ after
+------------------------------------------------------------+
| x + + |
|xxxx xxx +++ + + +|
| |_A__| |_MA___| |
+------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   8      16186671      16270294      16222511      16227889     28506.033
+   8      16802958      16929107      16832671      16843965     43561.511
Difference at 95.0% confidence
        616076 +/- 39480.5
        3.7964% +/- 0.243288%
        (Student's t, pooled s = 36811.7)
The total number of context switches in the system was reduced from 315K
to 260K. The removed context switches gave a 3.8% speedup.
Maybe that is not much, and it affects only the specific case of
sequential workloads with a very high request rate, but it is almost free.
--
Alexander Motin