cvs commit: src/sys/amd64/amd64 cpu_switch.S machdep.c
Andrew Gallatin
gallatin at cs.duke.edu
Tue Oct 18 07:25:21 PDT 2005
Scott Long writes:
> Andrew Gallatin wrote:
> > David Xu [davidxu at FreeBSD.org] wrote:
> >
> >>davidxu 2005-10-17 23:10:31 UTC
> >>
> >> FreeBSD src repository
> >>
> >> Modified files:
> >> sys/amd64/amd64 cpu_switch.S machdep.c
> >> Log:
> >> Micro optimization for context switch. Eliminate code for saving gs.base
> >> and fs.base. We always update pcb.pcb_gsbase and pcb.pcb_fsbase
> >> when user wants to set them, in context switch routine, we only need to
> >> write them into registers, we never have to read them out from registers
> >> when thread is switched away. Since rdmsr is a serialization instruction,
> >> micro benchmark shows it is worthy to do.
> >
> >
> > Nice. This reduces lmbench context switch latency by about 0.4us (7.2
> > -> 6.8us), and reduces TCP loopback latency by about 0.9us (36.1 ->
> > 35.2) on my dual core 3800+
> >
> > It is a shame we can't find a way to use the TSC as a timecounter on
> > SMP systems. It seems that about 40% of the context switch time is
> > spent just waiting for the PIO read of the ACPI-fast or i8254 to
> > return.
> >
> >
> > Drew
> >
> >
> >
>
> The TSC represents the clock rate of the CPU, and thus can vary wildly
> when thermal and power management controls kick in, and there is no way
> to know when it changes. Because of this, I think that it's
> practically useless on Pentium-Mobile and Pentium-M chips, among many
> others. There is also the issue of multiple CPUs having to keep their
> TSC's somewhat in sync in order to get consistent counting in the
> system. The best that you can do is to periodically read a stable
> counter and try to recalibrate, but then you'll likely start getting
> wild operational variances.

As I pointed out in another thread, both Linux and Solaris do it.
Solaris seems to have a nice algorithm for keeping things in sync, and
accounting for the TSC getting cleared after suspend/resume etc. At
my level of understanding, this argument is nothing more than "but
Mom, all the other kids are doing it". I was just hoping that
somebody with real understanding could pick up on it.

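The periodic recalibration idea quoted above could be sketched roughly
as below. This is only an illustration of the general technique, not the
Solaris or Linux algorithm: struct tsc_cal and both function names are
hypothetical, the reference counter is assumed to be something stable
such as the ACPI timer, and the cross-CPU sync problem is ignored
entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical calibration state; not taken from any real kernel. */
struct tsc_cal {
	uint64_t last_tsc;	/* TSC value at the previous calibration point */
	uint64_t last_ref;	/* reference counter value at that point */
	uint64_t ref_hz;	/* frequency of the stable reference counter */
	uint64_t tsc_hz;	/* current estimate of the TSC frequency */
};

/*
 * Re-estimate tsc_hz from how far both counters advanced since the
 * previous call.  The multiplication is assumed not to overflow, which
 * holds for calibration intervals of a few seconds.
 */
static void
tsc_recalibrate(struct tsc_cal *c, uint64_t tsc_now, uint64_t ref_now)
{
	uint64_t dtsc = tsc_now - c->last_tsc;
	uint64_t dref = ref_now - c->last_ref;

	if (dref != 0)
		c->tsc_hz = dtsc * c->ref_hz / dref;
	c->last_tsc = tsc_now;
	c->last_ref = ref_now;
}

/*
 * Convert the TSC ticks elapsed since the last calibration point into
 * nanoseconds, using the current frequency estimate.  Only valid for
 * short deltas (the product must fit in 64 bits).
 */
static uint64_t
tsc_delta_to_ns(const struct tsc_cal *c, uint64_t tsc_now)
{
	assert(c->tsc_hz != 0);
	return ((tsc_now - c->last_tsc) * 1000000000ULL / c->tsc_hz);
}
```

Any frequency change between two calibration points is still invisible
to a scheme like this, which is the operational variance Scott is
worried about.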
> It's a shame that a PIO read is still so
> expensive. I'd hate to see just how bad your benchmark becomes when
> ACPI-slow is used instead of ACPI-fast.

It seems like reading ACPI-fast is "only" 3us or so, but when the ctx
switch is otherwise 4us, it adds up. i8254 is much worse on this
system (6.5us).

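To make the arithmetic explicit, here is a tiny helper using the numbers
above. It assumes exactly one timecounter read per context switch and
that the remaining 4us stays the same whichever timecounter is used;
neither is measured here, and tc_share() is a made-up name.

```c
#include <assert.h>

/*
 * Fraction of one context switch spent in the timecounter read,
 * assuming a single read per switch (an assumption, not a measurement).
 */
static double
tc_share(double other_us, double tc_read_us)
{
	assert(other_us >= 0.0 && tc_read_us >= 0.0);
	assert(other_us + tc_read_us > 0.0);
	return (tc_read_us / (other_us + tc_read_us));
}
```

tc_share(4.0, 3.0) comes out to roughly 0.43 for ACPI-fast, in line with
the "about 40%" figure, and tc_share(4.0, 6.5) to roughly 0.62 for the
i8254.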
> I wonder if moving to HZ=1000 on amd64 and i386 was really all that good
> of an idea. Having preemption in the kernel means that ithreads can run
> right away instead of having to wait for a tick, and various fixes to
> 4BSD in the past year have eliminated bugs that would make the CPU wait
> for up to a tick to schedule a thread. So all we're getting now is a
> 10x increase in scheduler overhead, including reading the timecounters.

Yeah. I moved mine back to hz=1000 when I noticed 4000 interrupts/sec
on an idle system.

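As a back-of-the-envelope check on the 10x figure, the helper below
estimates how much time per second goes into tick-driven timecounter
reads. The reads_per_tick parameter and the function name are
hypothetical, and the 3us cost is just the ACPI-fast figure from above
reused as an assumption.

```c
#include <assert.h>

/*
 * Microseconds per second spent reading the timecounter from the clock
 * tick, assuming reads_per_tick reads on every tick (an assumption).
 */
static double
tick_read_us_per_sec(int hz, double tc_read_us, int reads_per_tick)
{
	assert(hz > 0 && reads_per_tick >= 0 && tc_read_us >= 0.0);
	return ((double)hz * reads_per_tick * tc_read_us);
}
```

With one 3us read per tick, hz=1000 costs about 3000us per second (0.3%
of a CPU) against 300us (0.03%) at hz=100, so the cost scales linearly
with hz as Scott says.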
Drew