cvs commit: src/sys/amd64/amd64 cpu_switch.S machdep.c
David Xu
davidxu at freebsd.org
Thu Oct 20 00:39:24 PDT 2005
Bruce Evans wrote:
> On Tue, 18 Oct 2005, Scott Long wrote:
>
> [Excessive quoting retained since I want to comment on separate points.]
>
>> Andrew Gallatin wrote:
>>
>>> Scott Long writes:
>>> > Andrew Gallatin wrote:
>>> > > David Xu [davidxu at FreeBSD.org] wrote:
>>> > >> davidxu 2005-10-17 23:10:31 UTC
>>> > >>
>>> > >> FreeBSD src repository
>>> > >>
>>> > >> Modified files:
>>> > >> sys/amd64/amd64 cpu_switch.S machdep.c
>>> > >> Log:
>>> > >> Micro optimization for context switch. Eliminate code for saving
>>> > >> gs.base and fs.base. We always update pcb.pcb_gsbase and
>>> > >> pcb.pcb_fsbase when user wants to set them, in context switch
>>> > >> routine, we only need to write them into registers, we never have
>>> > >> to read them out from registers when thread is switched away.
>>> > >> Since rdmsr is a serialization instruction, micro benchmark shows
>>> > >> it is worthy to do.
>
>
>>> > > Nice. This reduces lmbench context switch latency by about 0.4us
>>> > > (7.2 -> 6.8us), and reduces TCP loopback latency by about 0.9us
>>> > > (36.1 -> 35.2) on my dual core 3800+
>
>
> I wonder if this reduces the context switch latency from about 1.320
> usec to 0.900 usec on my A64-3000. The latency is only .520 usec in
> i386 mode. I use a TSC timecounter of course.
>
> The fastest loopback latency that I've seen is 5.638 usec under
> Linux-2.2.9 on the same machine. In Linux-2.6.10, it has regressed
> to 17.1 usec. In FreeBSD last year, it was 10.8 usec on the same
> machine in i386 mode and 19.0 in amd64 mode. So the A64 can almost
> keep up with an AXP-1400 running a pre-SMPng version of FreeBSD where
> it was 9.94 usec.
We can also avoid reloading the userland GS.base and FS.base MSRs for
system threads, but I am not sure whether that would reduce interrupt
thread latency.
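
A rough sketch of the idea in C, for illustration only. This is a
user-space model, not the code in cpu_switch.S: the MSRs are plain
variables, and every name here (sim_wrmsr, set_user_fsbase,
context_switch, is_system_thread, the counters) is hypothetical. It
only shows that the pcb copy stays current, so the switch path never
needs rdmsr, and that a system thread could skip the wrmsr pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the per-thread saved bases, as in pcb_fsbase/pcb_gsbase. */
struct pcb {
	uint64_t pcb_fsbase;
	uint64_t pcb_gsbase;
};

struct thread {
	struct pcb td_pcb;
	bool is_system_thread;	/* assumption: kernel-only thread, no userland */
};

/* Simulated MSRs and access counters (stand-ins for the real registers). */
uint64_t msr_user_fsbase, msr_user_gsbase;
unsigned wrmsr_count, rdmsr_count;

void
sim_wrmsr(uint64_t *msr, uint64_t val)
{
	*msr = val;
	wrmsr_count++;
}

/* When the user sets a base, update the pcb and the register together. */
void
set_user_fsbase(struct thread *td, uint64_t base)
{
	td->td_pcb.pcb_fsbase = base;
	sim_wrmsr(&msr_user_fsbase, base);
}

void
context_switch(struct thread *old, struct thread *next)
{
	assert(next != NULL);
	(void)old;	/* nothing to save: old's pcb is already up to date */

	/* Proposed extra step: a system thread never uses the user bases. */
	if (next->is_system_thread)
		return;

	sim_wrmsr(&msr_user_fsbase, next->td_pcb.pcb_fsbase);
	sim_wrmsr(&msr_user_gsbase, next->td_pcb.pcb_gsbase);
}
```

In this model a switch from a user thread to a system thread and back
costs two simulated wrmsr calls instead of four, and no rdmsr at all.
Whether the real saving is measurable for interrupt threads is the open
question.
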
David Xu