Why is MySQL nearly twice as fast on Linux?

Sun May 23 22:08:36 PDT 2004

On Sun, 23 May 2004, Petri Helenius wrote:

> >There is obviously a bottleneck, but it's very hard to tell what it is..
> >My guess is that the scheduler(s) are not doing a very good job. and the
> >fact that GIANT is not removed from the kernel yet says that generally
> >syscalls will be a bottleneck.
> >
> While watching the top output, I saw a "logjam" appear from time to
> time where all processes/threads were waiting for Giant. However I don't
> feel that causes the large impact, it might contribute 10-20% but it
> does not feel frequent enough to cause 50% difference.

top is a little misleading because it has to acquire Giant in order to
check the status of the other processes.  This increases the chance of
Giant contention.  There are at least a few things going on here.  Among
various results, I saw that switching to a UP kernel improved performance,
but not nearly enough.  This suggests lock contention is not the cause of
the problem.  If you want to investigate lock contention, there are a
couple of things you might try:

(1) Compile the kernel with MUTEX_PROFILING -- it has two contention
    measurement fields that can help track contention.  Note that running
    with mutex profiling will dramatically hurt performance, but might
    still be quite informative. 

(2) It might be interesting to run with the netperf patches, as they
    should greatly reduce contention for local UNIX domain socket I/O.  I
    haven't tried any benchmarking with MySQL, but it might be worth a
    try.  You can find information on the ongoing work at: 

	http://www.watson.org/~robert/freebsd/netperf/

    The work is moving fairly fast, as I'm working on tracking down
    additional socket nits, but it could help.

> >ULE should be able to do a better job at scheduling with
> >multiple CPUs but it is a work in progress. If threads all hit a GIANT 
> >based logjam, there is not a lot the scheduler can do about it..
> >
> I find it hard to believe that the threading stuff would be seriously
> broken since we do large processing with libkse and don't have issues
> with the performance. However I'm observing about 50000 context switches
> but only 5000 syscalls a second. (I know it's a different application
> but also for 1500 queries a second 70000 syscalls sounds excessive).

ULE has some sort of known load balancing problem between multiple CPUs --
I've observed it on some local benchmarking with ubench, at least a month
or so ago.  It seemed to prevent highly busy processes derived from the
same process tree from migrating properly.  SCHED_4BSD did not have this
problem.  Since we've seen results suggesting that changing to SCHED_4BSD
didn't help all that much either, it's still likely not to be the cause.

A few months ago I did some work to optimize system call cost a bit -- we
had some extra mutex operations.  It might be interesting to use ktrace or
truss to generate a profile of the system call mix in use; perhaps that
would give some informative results about things to look at.
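
As a rough illustration of what such a profile could look like, here is a
minimal Python sketch that tallies system call names from kdump text output.
It is not an existing tool: the function name count_syscalls and the script
name syscall_mix.py are made up for this example, and the line shape it
expects ("<pid> <command> CALL  <syscall>(<args>)") is an assumption about
kdump's output that may need adjusting on your system.

```python
import re
import sys
from collections import Counter

# Assumed kdump line shape: "<pid> <command> CALL  <syscall>(<args>)".
# Only CALL records are counted; RET and other record types are skipped.
CALL_RE = re.compile(r"^\s*\d+\s.*?\sCALL\s+([A-Za-z_]\w*)")


def count_syscalls(lines):
    """Return a Counter mapping system call name to number of CALL records."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Read kdump output on stdin and print the 20 most frequent calls.
    for name, n in count_syscalls(sys.stdin).most_common(20):
        print(f"{n:8d}  {name}")
```

One possible way to feed it: attach with "ktrace -p <pid>" while the
benchmark runs, stop tracing with "ktrace -C", then run
"kdump | python syscall_mix.py".  Dividing the totals by the number of
queries issued during the trace gives a syscalls-per-query figure to
compare against the numbers quoted above.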

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert at fledge.watson.org      Senior Research Scientist, McAfee Research