svn commit: r238907 - projects/calloutng/sys/kern
Bruce Evans
brde at optusnet.com.au
Thu Nov 8 17:26:14 UTC 2012
On Fri, 2 Nov 2012, Attilio Rao wrote:
> On 10/29/12, Bruce Evans <brde at optusnet.com.au> wrote:
>> On Mon, 29 Oct 2012, Attilio Rao wrote:
>>
>>> Now that sched_pin()/sched_unpin() are fixed I would like to introduce
>>> this new patch, making critical_enter()/critical_exit() inline:
>>> http://www.freebsd.org/~attilio/inline_critical.patch
>> ...
>> My version goes the other way and uninlines mtx_lock_spin() and
>> mtx_unlock_spin(). Then it inlines (open codes) critical_enter() and
>> ...
>
> So, I thought more about this and I think that inlining
> critical_exit() is not really going to bring any benefit here but
> bloat.
> This is because nested critical sections are rare rather not, which
Rather rare !not? :-)
> means you will end up in doing the function call most of the time and
> plus we have the possible pessimization introduced by the memory
> clobber (__compiler_membar()) and as you note possible deficiency
> caming from the branch misprediction*.
This seems best.
I see a point about the rareness of the branch in critical_exit().
Not sure if it is the one you are making: since the nested case is
rare, then the branch will normally be correctly predicted. If the
function remains uninlined, then the branch still has a good chance
of being correctly predicted. This depends on the nested case being
so rare across all callers, that the non-nested case doesn't mess
up the prediction by happening often. The branch predictor only
has to maintain history for 1 branch for this. However, if the
call is inlined and there are many callers, there might be too many
to maintain history for them all.
> I feel very unsure style-wise about my add to sys/systm.h because it
> is already very messy that I cannot give a logical sense to that
> chunk.
It is less bad than most places. Some of the inlines in libkern.h
are misplaced. However, the bitcount inlines in systm.h belong
closer to libkern (or /dev/null). The split between systm.h and
kernel.h is bogus. But systemy things like critical* don't
belong near libkern.
> * In theory we could use __predict_false() or similar but I don't
> think this will really help as x86 doesn't have explicit instructions
> to control branch prediction
It can help as follows on x86:
- modern x86 has a default for when the branch predictor is cold. The
compiler should arrange the code so that the default matches the
usual case. Compilers can either follow the code order or guess
which branch is usual (-fpredict-branch-probability IIRC). The
branches in critical_exit() and x86 spinlock_enter() seem to have
a good order for the first, but might be mispredicted by the compiler.
The branch in spinlock_exit() seems to be in a bad order for the first.
This doesn't matter if the CPU's branch predictor stays warm.
- compilers can help the CPU's branch predictor and instruction prefetcher
by moving the unusual case far away. This helps mainly for large code.
The code in critical_exit() is not large, except possibly if it is
inlined,
But __predict_false() never did much for me, perhaps because I test mainly
non-large code in loops where branch predictors stay warm.
Bruce
More information about the svn-src-projects
mailing list