cvs commit: src/lib/msun/src e_rem_pio2f.c s_cosf.c s_sinf.c s_tanf.c
Bruce Evans
bde at FreeBSD.org
Fri Nov 18 18:38:28 PST 2005
bde 2005-11-19 02:38:27 UTC
FreeBSD src repository
Modified files:
lib/msun/src e_rem_pio2f.c s_cosf.c s_sinf.c s_tanf.c
Log:
Moved all the optimizations for |x| <= 9pi/4 from
__ieee754_rem_pio2f() to its 3 callers and manually inlined them.
On Athlons, with favourable compiler flags and optimizations and
favourable pipeline conditions, this gives a speedup of 30-40 cycles
for cosf(), sinf() and tanf() on the range pi/4 < |x| <= 9pi/4, so
these functions are now significantly faster than the hardware trig
functions in many cases. E.g., in a benchmark with uniformly distributed
x in [-2pi, 2pi], A64 hardware fcos took 72-129 cycles and cosf() took
37-55 cycles. Out-of-order execution is needed to get both of these
times. The optimizations in this commit apparently work more by
removing 1 serialization point than by reducing latency.
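For illustration only, here is a minimal sketch of what an inlined small-range fast path for cosf() could look like. It is not the committed code: the names cosf_fast_path, kernel_sin and kernel_cos are hypothetical, the thresholds are compared as doubles rather than as bit patterns, and plain Taylor polynomials stand in for msun's tuned kernels. The idea it shows is that for |x| <= 9pi/4 the multiple of pi/2 to subtract is known from a few comparisons, so the caller can reduce the argument itself instead of calling the general reduction routine.

```c
/* Multiples of pi/2 in double precision, used for the inline reduction. */
static const double PIO2_1 = 1.5707963267948966;  /* 1 * pi/2 */
static const double PIO2_2 = 3.1415926535897932;  /* 2 * pi/2 */
static const double PIO2_3 = 4.7123889803846899;  /* 3 * pi/2 */
static const double PIO2_4 = 6.2831853071795865;  /* 4 * pi/2 */

/* Stand-in kernel (assumption): Taylor series for sin(y), |y| <= pi/4. */
static double kernel_sin(double y)
{
    double z = y * y;
    return y * (1.0 + z * (-1.0 / 6 + z * (1.0 / 120 +
        z * (-1.0 / 5040 + z * (1.0 / 362880)))));
}

/* Stand-in kernel (assumption): Taylor series for cos(y), |y| <= pi/4. */
static double kernel_cos(double y)
{
    double z = y * y;
    return 1.0 + z * (-1.0 / 2 + z * (1.0 / 24 + z * (-1.0 / 720 +
        z * (1.0 / 40320 + z * (-1.0 / 3628800)))));
}

/*
 * Hypothetical fast path.  Returns 1 and stores cos(x) in *result when
 * |x| <= 9pi/4.  Returns 0 for larger |x| (and for NaN), in which case
 * the caller would fall back to the general argument reduction.
 */
static int cosf_fast_path(float x, float *result)
{
    /* cos is even, so only |x| matters. */
    double ax = x < 0 ? -(double)x : (double)x;

    if (ax <= 0.5 * PIO2_1)                 /* |x| <= pi/4: no reduction */
        *result = (float)kernel_cos(ax);
    else if (ax <= 1.5 * PIO2_1)            /* |x| <= 3pi/4: n = 1 */
        *result = (float)kernel_sin(PIO2_1 - ax);
    else if (ax <= 2.5 * PIO2_1)            /* |x| <= 5pi/4: n = 2 */
        *result = (float)-kernel_cos(ax - PIO2_2);
    else if (ax <= 3.5 * PIO2_1)            /* |x| <= 7pi/4: n = 3 */
        *result = (float)kernel_sin(ax - PIO2_3);
    else if (ax <= 4.5 * PIO2_1)            /* |x| <= 9pi/4: n = 4 */
        *result = (float)kernel_cos(ax - PIO2_4);
    else
        return 0;                           /* needs general reduction */
    return 1;
}
```

A caller would try cosf_fast_path(x, &r) first and only invoke the general reduction when it returns 0. Keeping the comparisons and subtractions in the caller avoids a function call and the pass of results through memory on the common small-argument path, which fits the log's remark about removing a serialization point.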
Revision  Changes    Path
1.17      +0 -55     src/lib/msun/src/e_rem_pio2f.c
1.10      +33 -2     src/lib/msun/src/s_cosf.c
1.10      +41 -4     src/lib/msun/src/s_sinf.c
1.10      +31 -6     src/lib/msun/src/s_tanf.c