My wish list for 6.1
Martin Cracauer
cracauer at cons.org
Sat Dec 31 08:25:18 PST 2005
Robert Watson wrote on Sat, Dec 31, 2005 at 07:12:23AM +0000:
>
> On Fri, 16 Dec 2005, Avleen Vig wrote:
>
> > On Fri, Dec 16, 2005 at 10:40:22AM -0500, Martin Cracauer wrote:
> >>> 2. SMP kernels for install. Right now we only install a UP kernel, for
> >>> performance reasons. We should be able to package both a UP and SMP
> >>> kernel into the release bits, and have sysinstall install both. It
> >>> should also select the correct one for the target system and make that
> >>> the default on boot.
> >>
> >> If people are concerned about performance, I benchmarked a 6-beta kernel
> >> SMP versus UP on a socket 939 Opteron.
> >
> > If those results are accurate, there's no real reason not to just use an SMP
> > kernel on default install?
>
> This is an old thread that I'm just catching up on, but I figured I'd chime in
> anyway: you have to be really careful benchmarking across CPU types and
> configurations, as the performance characteristics of important instructions
> differ a lot across hardware variations. For example, the performance of
> atomic operations, used to synchronize between CPUs, varies significantly by
> CPU, bus configuration, etc. On modern Opteron hardware, the performance of
> inter-CPU synchronization instructions is blindingly fast. On modern Xeon P4
> hardware, it is incredibly slow.
Well, my runs included P4s and P4-based Xeons, and hyperthreading,
too.
The core of the problem here is that while my parallel benchmarks
partly exercise system calls, I use Apache over localhost and
zero-spaced files to get the disk and network out of the equation.
I think I have a solid framework in place to run parallel benchmarks
and see the tradeoffs involved, but I need to fill it with activity
that exercises what we want to see.
Still, I bet that my measurements are good enough to label the SMP
kernel "defaultable" for FreeBSD installations, from a performance
standpoint. After all, I *do* test parallel activity, CPU-intensive
and systemcall-intensive and various mixes thereof.
Remember that those people who do a lot of parallel activity and hence
would suffer from the additional locks in the SMP kernel are very
likely to have an SMP system, dual cores, or at least hyperthreading in
the first place. On the other hand, people who use very low-end hardware
to do demanding tasks are very likely to build their own kernel
anyway.
> Software optimized for the Opteron will
> often perform much slower on Xeon P4 hardware as a result. P3 hardware tends
> to behave a lot more like Opteron in terms of speed of instructions relating
> to disabling interrupts, where on P4 Xeon they are proportionally much slower.
> The critical section optimizations made by John Baldwin, and the movement to
> critical sections in UMA and kernel malloc that I made, made a big performance
> difference on Xeon P4 hardware, but relatively little difference on
> Opteron.
One thing I noticed is that anything P4-based is very sensitive to
spinlocks being placed on the same cache line as the data they protect.
Putting a lock into a struct without padding that pushes the data onto
another cache line means doom for the P4-based/NetBurst CPUs (I'm sure
it's not a good thing for Opterons either, but they don't seem to mind
that much).
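
To make the layout issue concrete, here is a minimal sketch in C11.
It is not taken from the FreeBSD tree: the spinlock type, the struct
names and the 64-byte line size are all assumptions made up for
illustration, and the right line size depends on the target CPU.

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. Adjust for the CPU being targeted. */
#define EXAMPLE_CACHE_LINE 64

/* Hypothetical stand-in for a real kernel spinlock. */
typedef struct {
    volatile int locked;
} example_spinlock_t;

/*
 * Lock and data share one cache line. CPUs spinning on the lock keep
 * pulling the line away from the lock holder that is writing the counters.
 */
struct counters_shared {
    example_spinlock_t lock;
    unsigned long hits;
    unsigned long misses;
};

/*
 * The lock sits alone on its own cache line, and the protected data
 * starts on the next one, so spinning does not disturb the data line.
 */
struct counters_padded {
    alignas(EXAMPLE_CACHE_LINE) example_spinlock_t lock;
    alignas(EXAMPLE_CACHE_LINE) unsigned long hits;
    unsigned long misses;
};
```

The price is memory: with these assumptions the padded struct takes
two full cache lines instead of a handful of bytes, so it probably
only pays off for locks that are actually contended.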
Martin
--
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <cracauer at cons.org> http://www.cons.org/cracauer/
FreeBSD - where you want to go, today. http://www.freebsd.org/