amd64/156464: fpsetprec does not work

Mon Apr 18 22:20:10 UTC 2011

The following reply was made to PR amd64/156464; it has been noted by GNATS.

From: Bruce Evans <brde at optusnet.com.au>
To: Michirou & <yabuki at fuchan.myaw.ei.meisei-u.ac.jp>
Cc: FreeBSD-gnats-submit at FreeBSD.org, freebsd-amd64 at FreeBSD.org
Subject: Re: amd64/156464: fpsetprec does not work
Date: Tue, 19 Apr 2011 06:02:02 +1000 (EST)

 On Mon, 18 Apr 2011, Michirou & wrote:

 >> Description:
 >
 > In default, fpgetprec() returns FP_PE, but results show FP_PD.
 > if fpsetprec(FP_PE) is called, results are never changed.

 amd64 uses SSE except for long doubles, so fpsetprec() has no effect
 on the results except for long doubles.  Since the precision defaults to
 FP_PE on amd64, fpsetprec() can only be used to break long doubles
 on amd64, while on i386 the precision defaults to FP_PD and fpsetprec()
 is needed to unbreak this.  fpsetprec() on i386 can also be used to:
 - break doubles by setting the precision to FP_PS
 - reduce the precision for floats by setting the precision to FP_PS.
    This is sometimes useful for getting the same precision for floats
    as on other arches like amd64, to test that nothing depends on the
    extra precision without being ifdefed for this.
 - give increased precision for floats and doubles by setting the
    precision to FP_PE.  This may be useful, but is difficult to
    program.  It requires almost never actually using floats or
    doubles, except for converting them to and from long double on
    input and output.

 > This is not happen on FreeBSD8.2-RELEASE i386 version.

 amd64 behaviour in this area hasn't changed.

 >> How-To-Repeat:
 >
 > #include <stdio.h>
 > #include <stdlib.h>
 > #include <machine/ieeefp.h>
 > int main()
 > {
 >    double a, b, c, d;

 This only uses doubles, so fpsetprec() has no effect on it.

 >
 >    printf("fpgetprec %d\n",   fpgetprec()); // 3 on amd64, 2 on i386
 >
 >    a = 10.0;
 >    b = 2.718281810;
 >    c = a / (b * b);
 >    printf("%20.16e\n",   c);  // 1.3533528507465618e+00 on both
 >
 >    fpsetprec(FP_PE);

 It is still 3 on amd64, but is not used for doubles.  It was changed
 from 2 to 3 on i386.

 >    a = 10.0;
 >    b = 2.718281810;
 >    c = a / (b * b);
 >    printf("%20.16e\n",   c);
 >              // 1.3533528507465618e+00 on amd64
 > 	      // 1.3533528507465620e+00 on i386

 So the result is more accurate on i386, but this behaviour is fragile and
 requires more care to program than the above in general.  With FP_PE
 on i386, b*b is evaluated in extra precision, but there is nothing
 to prevent it being stored to memory, which would lose its extra
 precision, especially since gcc doesn't understand precision stuff.
 In practice, gcc won't store to memory in the middle of a simple
 expression like the above, even with -O0, so the above works like
 you want.  The careful version is:

  	a = 10.0;
  	b = 2.718281810;

  	long double la, lb;

  	la = a;
  	lb = b;
  	c = la / (lb * lb);	/* compiler bugs -- extra precision not lost
  				 * yet unless there is an accidental or
  				 * forced store (-ffloat-store) */
  	printf("%20.16e\n",   c);  /* ABI gives a store which loses the bugs
  				    * so we see only double precision for
  				    * the result */

 An even more careful version to avoid the compiler bugs by forcing a store
 for this variable only is:

  	...
  	volatile double vc;

  	vc = la / (lb * lb);
  	c = vc;			/* c reduced to double prec -- now ready for
  				 * output, but probably not useful for
  				 * further calculations */

 -ffloat-store should never be used since it pessimizes speed and precision
 globally.
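
 The two versions above can be put side by side as a pair of small
 helpers.  This is only a sketch in standard C: the function names
 div_sq_plain and div_sq_careful are made up for illustration and do
 not come from the PR.  Whether the two results differ depends on the
 platform's long double format and on how it evaluates double
 expressions.

```c
/*
 * Plain version: the expression is written entirely in double.  Any
 * extra precision comes only from how the platform happens to evaluate
 * double expressions (for example, in i387 registers).
 */
double
div_sq_plain(double a, double b)
{
	return (a / (b * b));
}

/*
 * Careful version: evaluate explicitly in long double, then round once
 * to double by storing through a volatile variable.  Only this one
 * variable is forced to memory, unlike with -ffloat-store.
 */
double
div_sq_careful(double a, double b)
{
	long double la = a, lb = b;
	volatile double vc;

	vc = la / (lb * lb);	/* the store rounds to double precision */
	return (vc);
}
```

 Calling both with a = 10.0 and b = 2.718281810 and printing the
 results with "%20.16e" repeats the comparison from the test program.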

 >
 >    exit(0);
 > }

 fpsetprec() is very unportable due to its only affecting the i387 register
 set.  Even on i386, you can break its effect on doubles by using '-msse2
 -mfpmath=sse'.  This bug is the default for clang.
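
 A program can at least ask the compiler how it claims to evaluate
 floating-point expressions.  The sketch below uses only the standard
 C99 macros FLT_EVAL_METHOD, DBL_MANT_DIG and LDBL_MANT_DIG from
 <float.h>; the helper names are hypothetical.  These are compile-time
 values, so they cannot reflect a later fpsetprec() call, and a system
 <float.h> may report a different value than the compiler's default.

```c
#include <float.h>

/*
 * Describe the evaluation method the implementation advertises for
 * float and double expressions.
 */
const char *
eval_method_name(void)
{
#ifdef FLT_EVAL_METHOD
	switch (FLT_EVAL_METHOD) {
	case 0:
		return ("each type evaluated in its own precision");
	case 1:
		return ("float and double evaluated as double");
	case 2:
		return ("everything evaluated as long double");
	default:
		return ("indeterminate or implementation-defined");
	}
#else
	return ("FLT_EVAL_METHOD not defined");
#endif
}

/*
 * Number of extra mantissa bits long double has over double.  Zero
 * means converting to long double cannot add any precision.
 */
int
long_double_extra_bits(void)
{
	return (LDBL_MANT_DIG - DBL_MANT_DIG);
}
```

 If long_double_extra_bits() returns 0, or the evaluation method is 0
 and the code only uses doubles, no precision setting of the kind
 discussed here can change the results.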

 Bruce