[tjr@FreeBSD.org: cvs commit: src/lib/libpthread/arch/amd64/amd64 context.S]
Tim Robbins
tjr at freebsd.org
Tue Jun 8 02:07:55 GMT 2004
On Tue, Jun 08, 2004 at 09:17:41AM +0800, David Xu wrote:
> Peter Wemm wrote:
>
> >On Monday 07 June 2004 06:10 pm, David Xu wrote:
> >
> >>Is there any reason to use a memory-indirect jump? Did you
> >>benchmark context switch speed before and after this commit?
> >>I wouldn't use such an indirect jump in speed-sensitive code; it is
> >>not friendly to the CPU's branch trace cache, and it is better to use
> >>ret to match the call in the upper level.
> >
> >
> >Because the return address is already on the higher level stack frame,
> >and copying it (read/write/ret) is more awkward than the read+indirect
> >jump. Unfortunately, we can't indirectly access the flags register.
> >
> I would like someone to test it:
> http://people.freebsd.org/~davidxu/kse/test/ctxswitch.c
> Please tell me the results before and after this commit.

System: AMD Athlon 64 3000+, ASUS K8V, FreeBSD 5.2-tjr_perf with kernel
config tuned for performance (no INVARIANTS, no WITNESS), multiuser mode,
XFree86 + GNOME running. Test program compiled with gcc -O2 -pthread -static.

ctxold = old (broken) code using ret
ctxnew = new correct code using indirect jump
ctxopt = same as ctxnew but does not save scratch registers or flags,
         redundant checks removed, jumps aligned to dword boundary

$ time ./ctxold; time ./ctxnew; time ./ctxopt
testing scope process context switch speed...
context switches:1779631/s
testing scope system context switch speed...
context switches:386696/s
21.01s real 10.40s user 9.49s system

testing scope process context switch speed...
context switches:1823471/s
testing scope system context switch speed...
context switches:383949/s
21.00s real 10.34s user 9.55s system

testing scope process context switch speed...
context switches:1864775/s
testing scope system context switch speed...
context switches:386127/s
21.01s real 10.42s user 9.48s system

Tim