Compiler performance tests on FreeBSD 10.0-CURRENT
Steve Kargl
sgk at troutmask.apl.washington.edu
Wed Sep 5 22:13:23 UTC 2012
On Wed, Sep 05, 2012 at 11:31:26AM +0200, Dimitry Andric wrote:
> On 2012-09-05 01:40, Garrett Cooper wrote:
> ...
> > Steve does have a point. Posting the results of
> >CFLAGS/CPPFLAGS/LDFLAGS/etc for config.log (and maybe poking through
> >the code to figure out what *FLAGS were used elsewhere) is more
> >valuable than the data is in its current state (unfortunately..
> >autoconf makes things more complicated).
>
> 1) For building the FreeBSD in-tree version of clang 3.2:
>
> -O2 -pipe -fno-strict-aliasing
>
> 2) For building the FreeBSD in-tree version of gcc 4.2.1:
>
> -O2 -pipe
>
> 3) For building Boost 1.50.0:
>
> -ftemplate-depth-128 -O3 -finline-functions
>
Dimitry, thanks for the follow-up. I performed an unscientific
(micro)benchmark of /usr/bin/cc vs /usr/bin/clang, where cc is
the base system's gcc 4.2.1. Here's what I found/feared.
Compiling libm on
CPU: AMD Opteron(tm) Processor 248 (2192.01-MHz K8-class CPU)
Origin = "AuthenticAMD" Id = 0xf5a Family = f Model = 5 Stepping = 10
Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,\
MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
AMD Features=0xe0500800<SYSCALL,NX,MMX+,LM,3DNow!+,3DNow!>
with default CFLAGS (i.e., -O2 -pipe) and -march=opteron.
Using 'setenv CC /usr/bin/cc' with 3 runs of
make clean
time make -DNO_MAN
yields
69.39 real 52.00 user 38.55 sys
69.57 real 52.35 user 38.37 sys
69.48 real 52.25 user 38.38 sys
Now, repeating with 'setenv CC /usr/bin/clang' yields
39.65 real 21.86 user 17.37 sys
40.91 real 21.48 user 17.91 sys
39.77 real 21.65 user 17.64 sys
So, clang does appear to be faster in this particular
compile-speed benchmark: roughly 40 seconds of wall-clock
time per build versus roughly 69 for gcc.
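For a rough sense of the gap, the wall-clock ("real") times of the three runs can be averaged and compared. The following is only a sketch in Python; the variable names are made up for illustration, and three runs are far too few for a statistically meaningful claim.

```python
from statistics import mean

# Wall-clock ("real") seconds from the three runs quoted above.
gcc_real = [69.39, 69.57, 69.48]
clang_real = [39.65, 40.91, 39.77]

gcc_mean = mean(gcc_real)
clang_mean = mean(clang_real)

# Ratio > 1 means the clang build of libm finished sooner.
speedup = gcc_mean / clang_mean

print(f"gcc   mean real: {gcc_mean:.2f} s")
print(f"clang mean real: {clang_mean:.2f} s")
print(f"clang build is about {speedup:.2f}x faster (wall clock)")
```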
However, if I now build my test program for libm's j0f()
function, where the only difference is whether libm was
built with /usr/bin/cc or /usr/bin/clang, I observe the
following results.
1234567 x values in the interval [0:25]

                     gcc libm        | clang libm
                     ----------------|-----------------
      ULP <= 0.6 --> 565515 (45.81%) | 513763 (41.61%)
0.6 < ULP <= 0.7 -->  74148 ( 6.01%) |  67221 ( 5.44%)
0.7 < ULP <= 0.8 -->  69112 ( 5.60%) |  62846 ( 5.09%)
0.8 < ULP <= 0.9 -->  63798 ( 5.17%) |  58217 ( 4.72%)
0.9 < ULP <= 1.0 -->  58679 ( 4.75%) |  53834 ( 4.36%)
1.0 < ULP <= 2.0 --> 328221 (26.59%) | 306728 (24.84%)
2.0 < ULP <= 3.0 -->  65323 ( 5.29%) |  63452 ( 5.14%)
3.0 < ULP        -->   9771 ( 0.79%) | 108506 ( 8.79%)

                gcc libm               | clang libm
                -----------------------|--------------------
MAX ULP:        12152.27637            | 1129606938624.00000
x at MAX ULP:   5.520077 0x1.6148f2p+2 | 2.404833 0x1.33d19p+1
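The test program itself is not shown in this message, so the following is only a sketch of how a ULP histogram of this shape could be produced. It assumes the error is measured in float32 ULPs of a higher-precision reference value; the function names (to_f32, ulp_f32, ulp_error, histogram) are hypothetical. Both worst-case x values lie close to zeros of J0 (about 2.4048 and 5.5201), where the reference value is tiny and its ULP correspondingly small, so even a small absolute error can show up as a very large ULP count.

```python
import bisect
import math
import struct

# Upper edges of the error bins in the table above; the last bin is "> 3.0".
EDGES = [0.6, 0.7, 0.8, 0.9, 1.0, 2.0, 3.0]

def to_f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def ulp_f32(x):
    """Spacing of float32 values at the magnitude of x (assumed ULP definition)."""
    x = abs(x)
    if x < 2.0 ** -126:               # zero or subnormal range
        return 2.0 ** -149
    exponent = math.frexp(x)[1] - 1   # x = m * 2**exponent with 1 <= m < 2
    return 2.0 ** (exponent - 23)     # float32 has 23 explicit fraction bits

def ulp_error(computed, reference):
    """Error of a float32 result, in ULPs of the higher-precision reference."""
    return abs(to_f32(computed) - reference) / ulp_f32(reference)

def histogram(errors):
    """Count errors per bin; bin i holds EDGES[i-1] < err <= EDGES[i]."""
    counts = [0] * (len(EDGES) + 1)
    for err in errors:
        counts[bisect.bisect_left(EDGES, err)] += 1
    return counts

# Illustrative error values only, not measurements from this thread.
sample_errors = [0.25, 0.65, 1.5, 4.0]
sample_counts = histogram(sample_errors)
print(sample_counts)
```

In a real run, each error would come from ulp_error(j0f(x), reference), with the reference computed in higher precision than float32.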
Speed test with gcc libm.
1234567 j0f calls in 0.193427 seconds.
1234567 j0f calls in 0.193410 seconds.
1234567 j0f calls in 0.194158 seconds.
Speed test with clang libm.
1234567 j0f calls in 0.180260 seconds.
1234567 j0f calls in 0.180130 seconds.
1234567 j0f calls in 0.179739 seconds.
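To put the run-time difference in perspective, the timings above can be converted to a per-call cost. This is a sketch only; the names are illustrative, and three runs per build give no measure of noise.

```python
from statistics import mean

CALLS = 1234567  # j0f calls per timed run, as reported above

# Elapsed seconds for each run, copied from the speed tests above.
gcc_runs = [0.193427, 0.193410, 0.194158]
clang_runs = [0.180260, 0.180130, 0.179739]

def ns_per_call(times):
    """Mean cost of a single call in nanoseconds."""
    return mean(times) / CALLS * 1e9

gcc_ns = ns_per_call(gcc_runs)
clang_ns = ns_per_call(clang_runs)

# Fraction of per-call time saved by the clang-built libm.
saving = 1.0 - clang_ns / gcc_ns

print(f"gcc   libm: {gcc_ns:.1f} ns per call")
print(f"clang libm: {clang_ns:.1f} ns per call")
print(f"clang-built j0f() is about {saving:.1%} faster per call")
```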
So, although the clang-built j0f() appears to be faster than
the gcc-built j0f(), the clang-built j0f() has much worse
accuracy: 8.79% of its results are off by more than 3 ULP,
versus 0.79% for the gcc-built version.
--
Steve