cvs commit: src/sys/kern kern_mutex.c
Kip Macy
kip.macy at gmail.com
Thu Jun 7 06:24:06 UTC 2007
Bruce -
Can you also say how many runs you do and how much variance there
is between runs?
Thanks.
-Kip
On 6/6/07, Bruce Evans <brde at optusnet.com.au> wrote:
> On Wed, 6 Jun 2007, Bruce Evans wrote:
>
> > On Tue, 5 Jun 2007, Jeff Roberson wrote:
>
> >> You should try with kern.sched.pick_pri = 0. I have changed this to be the
> >> default recently. This weakens the preemption and speeds up some
> >> workloads.
> >
> > I haven't tried a new SCHED_ULE kernel yet.
>
> Tried now.  In my makeworld benchmark, SCHED_ULE is now only 4% slower
> than SCHED_4BSD, down from about 7% slower (after SCHED_4BSD itself
> lost 2%).  The difference still comes from CPUs idling too much.
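>
> For anyone reproducing this: flipping Jeff's knob from userland is an
> ordinary sysctl write, equivalent to "sysctl kern.sched.pick_pri=0" as
> root.  A minimal C sketch, assuming only that the running SCHED_ULE
> kernel exports the kern.sched.pick_pri name quoted above:
>
> 	#include <sys/types.h>
> 	#include <sys/sysctl.h>
> 	#include <err.h>
>
> 	int
> 	main(void)
> 	{
> 		int newval = 0;	/* 0 = the new default; weakens preemption */
>
> 		/* Write-only sysctl access: no old-value buffer. */
> 		if (sysctlbyname("kern.sched.pick_pri", NULL, NULL,
> 		    &newval, sizeof(newval)) == -1)
> 			err(1, "sysctlbyname(kern.sched.pick_pri)");
> 		return (0);
> 	}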
>
> Best result ever (SCHED_4BSD, June 4 kernel, no PREEMPTION):
> ---
> 827.48 real 1309.26 user 186.86 sys
> 1332122 voluntary context switches
> 1535129 involuntary context switches
> pagezero time 6 seconds
> ---
>
> After thread lock changes (SCHED_4BSD, no PREEMPTION):
> ---
> 847.70 real 1309.83 user 169.39 sys
> 2933415 voluntary context switches
> 1501808 involuntary context switches
> pagezero time 30 seconds.
>
> Unlike what I wrote before, there is a scheduling bug that affects
> pagezero directly. The bug from last month involving pagezero losing
> its priority of PRI_MAX_IDLE and running at priority PUSER is back.
> This bug seemed to be gone in the June 4 kernel, but it actually just
> happens less often there.  It seems to cost 0.5-1.0% of the real time.
> ---
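>
> For context, pagezero is supposed to park itself at the very bottom of
> the idle class and stay there.  A minimal sketch of that setup in C,
> modelled loosely on vm_zeroidle.c and the post-thread-lock scheduler
> interfaces (the exact calls here are an assumption, not a quote of the
> real code):
>
> 	#include <sys/param.h>
> 	#include <sys/proc.h>
> 	#include <sys/sched.h>
>
> 	static void
> 	pagezero_set_idle_prio(struct thread *td)
> 	{
> 		thread_lock(td);
> 		sched_class(td, PRI_IDLE);	/* idle scheduling class */
> 		sched_prio(td, PRI_MAX_IDLE);	/* lowest priority in it */
> 		thread_unlock(td);
> 	}
>
> The bug amounts to the scheduler later clobbering this priority up to
> PUSER, so pagezero competes with user threads instead of only soaking
> up otherwise-idle cycles.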
>
> After thread lock changes (SCHED_4BSD, now with PREEMPTION):
> ---
> 843.34 real 1304.00 user 168.87 sys
> 1651011 voluntary context switches
> 1630988 involuntary context switches
> pagezero time 27 seconds
>
> The problem with the extra context switches is gone (these context switch
> counts are like the ones in old kernels with PREEMPTION). This result is
> affected by pagezero getting its priority clobbered. The best result for
> an old kernel with PREEMPTION was about 840 seconds, before various
> optimizations reduced this to 827 seconds (-0+4 seconds).
> ---
>
> Old run with SCHED_ULE (Mar 18):
> 899.50 real 1311.00 user 187.47 sys
> 1566366 voluntary context switches
> 1959436 involuntary context switches
> pagezero time 19 seconds
> ---
>
> Today with SCHED_ULE:
> ---
> 883.65 real 1290.92 user 188.21 sys
> 1658109 voluntary context switches
> 1708148 involuntary context switches
> pagezero time 7 seconds.
> ---
>
> In all of these, the user + sys decomposition is very inaccurate, but the
> (user + sys + pagezero_time) total is fairly accurate. It is 1500+-2 for
> SCHED_4BSD and 1500+-17 for SCHED_ULE (old ULE larger, current ULE smaller).
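> For example, the best SCHED_4BSD run works out to 1309.26 + 186.86 +
> 6 = 1502.12, and today's SCHED_ULE run to 1290.92 + 188.21 + 7 =
> 1486.13.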
>
> SCHED_ULE now shows interesting behaviour for non-parallel kernel
> builds on a 2-way SMP machine.  It is now slightly faster than
> SCHED_4BSD for this, but still much slower for parallel kernel builds.
> This might be because it likes to leave 1 CPU idle while waiting to
> find a better CPU to run on, which is actually an optimization when
> there is >= 1 CPU to spare:
>
> RELENG_4 kernel build on nfs, non-parallel make.
> Best ever with SCHED_ULE (~June 4 kernel):
> 62.55 real 55.30 user 3.65 sys
> Current with SCHED_ULE:
> 62.18 real 54.91 user 3.51 sys
>
> RELENG_4 kernel build on nfs, make -j4.
> Best ever for SCHED_ULE (~June 4 kernel):
> 32.00 real 56.98 user 3.90 sys
> Current with SCHED_ULE:
> 33.11 real 56.01 user 4.12 sys
> ULE has been about 1 second slower for this since at least last November.
> It presumably reduces user+sys time by running pagezero more.
>
> The slowdown is much larger for a build on ffs:
>
> Non-parallel results not shown (little difference from above).
>
> RELENG_4 kernel build on ffs, make -j4.
> Best ever for SCHED_ULE (~June 4 kernel):
> 29.94 real 56.03 user 3.12 sys
> Current with SCHED_ULE:
> 32.63 real 55.13 user 3.53 sys
> Now 9% of the real time (= 18% of the cycles on one CPU = almost the
> sys overhead; arithmetic below) is apparently wasted by leaving one
> CPU idle.  This
> benchmark is of course dominated by many instances of 2 gcc hogs which
> should be scheduled to run in parallel with no idle cycles. (In all
> these kernel benchmarks, everything except disk writes is cached before
> starting. In other makeworld benchmarks, everything is cached before
> starting on the nfs server, while on the client nothing is cached.)
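>
> The arithmetic for the 9%: (32.63 - 29.94) / 29.94 = 0.09.  Each
> second of extra real time on the 2-way machine corresponds to about
> two seconds of one CPU sitting idle, so the waste is roughly 5.4
> CPU-seconds, i.e. 5.38 / 29.94 = 18% of one CPU over the ideal run.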
>
> I don't have context switch counts or pagezero times for the kernel builds.
> stathz is 100, the same as hz.  Maybe SCHED_ULE doesn't like this.  hz = 100 is
> about 1% faster than hz = 1000 for the makeworld benchmark.
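>
> The running kernel's clock rates can be read back via kern.clockrate
> to confirm the hz/stathz configuration; a minimal C sketch using the
> standard sysctl interface:
>
> 	#include <sys/types.h>
> 	#include <sys/sysctl.h>
> 	#include <sys/time.h>	/* struct clockinfo */
> 	#include <stdio.h>
>
> 	int
> 	main(void)
> 	{
> 		struct clockinfo ci;
> 		size_t len = sizeof(ci);
>
> 		if (sysctlbyname("kern.clockrate", &ci, &len, NULL, 0) == -1)
> 			return (1);
> 		printf("hz=%d stathz=%d profhz=%d\n",
> 		    ci.hz, ci.stathz, ci.profhz);
> 		return (0);
> 	}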
>
> Bruce
>