Fwd: [cfe-dev] More on atlas and clang
David Chisnall
theraven at FreeBSD.org
Mon Mar 11 08:37:35 UTC 2013
Recent benchmarks of Atlas built with clang, posted to the clang list, are forwarded below. Note that the -fvectorize and -fslp-vectorize flags enable the new autovectorisation code in clang, which will be enabled by default in 3.3.
David
Begin forwarded message:
> Hi there,
>
> I have recently done another experimental build of Atlas (http://math-atlas.sourceforge.net; in brief, Atlas provides a highly complete BLAS/LAPACK implementation optimized for the native architecture of the computer on which it is running) on an AVX machine (MacMini 2011). The build used a snapshot of clang 3.3 (r173279) provided by MacPorts (http://macports.org), with the -O3, -fPIC, -fvectorize and -fslp-vectorize flags.
>
> I am pleased to say that:
>
> 1. The generated AVX code seems fine: a full test session run under an Atlas-based SciPy didn’t raise any error;
> 2. The performance now seems on par with, or even (sometimes surprisingly) better than, the ‘reference GCC’, whatever that means (I was unable to get in touch with the Atlas developer at that time), as evidenced by the table below:
>
> Reference clock rate=3292Mhz, new rate=2300Mhz
> Refrenc : % of clock rate achieved by reference install
> Present : % of clock rate achieved by present ATLAS install
>
>                     single precision                  double precision
>             ********************************  ********************************
>                  real            complex           real            complex
>             ---------------  ---------------  ---------------  ---------------
> Benchmark   Refrenc Present  Refrenc Present  Refrenc Present  Refrenc Present
> =========   ======= =======  ======= =======  ======= =======  ======= =======
> kSelMM       1289.9  1407.4   1188.7  1229.8    686.7   826.8    647.4   682.1
> kGenMM        198.2   239.7    198.5   237.8    193.9   231.8    196.0   233.8
> kMM_NT        193.7   266.4    195.2   192.9    184.2   187.4    188.5   197.5
> kMM_TN        198.5   211.1    197.9   226.2    189.8   227.6    189.5   223.2
> BIG_MM       1213.8  1346.7   1241.3  1366.5    652.0   789.5    661.4   795.8
> kMV_N         224.3   308.1    438.8   617.3    115.9   152.1    205.8   283.5
> kMV_T         224.6   313.5    460.3   642.9    123.2   159.6    211.3   288.2
> kGER          148.3   192.4    290.2   381.2     73.3    95.6    144.3   184.3
>
> This is in stark contrast with the previous test, where clang was lagging about 20% behind the GCC-based ‘reference implementation’ on lines 2, 3 and 4 of the table (kGenMM, kMM_NT and kMM_TN), where compiler performance matters most.
>
> So – to summarize in two words: kudos folks!
>
> I will build another version on a Core2Duo machine tonight and see if the results are consistent.
>
> Cheers!
> Vincent
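
To put a number on "on par or better" for the three compiler-sensitive rows, here is a minimal sketch that computes the relative gain of Present over Refrenc from the figures in the table above. The names RESULTS, relative_gain and gains are made up for illustration and are not part of Atlas. Since both columns are percentages of each machine's own clock rate (3292 MHz for the reference, 2300 MHz here), the result compares per-clock efficiency rather than absolute throughput.

```python
# Refrenc/Present pairs copied from the table above, in column order:
# single real, single complex, double real, double complex.
# RESULTS, relative_gain and gains are illustrative names, not Atlas APIs.
RESULTS = {
    "kGenMM": [(198.2, 239.7), (198.5, 237.8), (193.9, 231.8), (196.0, 233.8)],
    "kMM_NT": [(193.7, 266.4), (195.2, 192.9), (184.2, 187.4), (188.5, 197.5)],
    "kMM_TN": [(198.5, 211.1), (197.9, 226.2), (189.8, 227.6), (189.5, 223.2)],
}


def relative_gain(reference, present):
    """Gain of the present build over the reference, in percent."""
    return (present / reference - 1.0) * 100.0


def gains(results):
    """Map each benchmark name to its four per-column gains in percent."""
    return {
        name: [relative_gain(ref, cur) for ref, cur in pairs]
        for name, pairs in results.items()
    }


if __name__ == "__main__":
    for name, values in gains(RESULTS).items():
        print(name, " ".join(f"{v:+6.1f}%" for v in values))
```

A positive value means the clang build achieved a higher fraction of its clock rate than the reference install did; a negative value means it achieved less.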