cvs commit: src/lib/msun/src e_rem_pio2f.c

Sat Oct 8 15:43:56 PDT 2005

bde         2005-10-08 22:43:55 UTC

  FreeBSD src repository

  Modified files:
    lib/msun/src         e_rem_pio2f.c 
  Log:
  Fixed range reduction near (but not very near) +-pi/2.  A bug caused
  a maximum error of 2.905 ulps for cosf(), but the algorithm for cosf()
  is good for < 1 ulps and happens to give perfect rounding (< 0.5 ulps)
  near +-pi/2 except for the bug.  The extra relative errors for tanf()
  were similar (slightly larger).  The bug didn't affect sinf() since
  the derivative of sinf() at +-pi/2 is 0.

  For range reduction in ~[-3pi/4, -pi/4] and ~[pi/4, 3pi/4] we must
  subtract +-pi/2 and the only complication is that this must be done
  in extra precision.  We have handy 17+24-bit and 17+17+24-bit
  approximations to pi/2.  If we always used the former then we would
  lose up to 24 bits of accuracy due to cancelation of leading bits, but
  we need to keep at least 24 bits plus a guard bit or 2, and should
  keep as many guard bits as efficiency permits.  So we used the
  less-precise pi/2 not very near +-pi/2 and switched to using the
  more-precise pi/2 very near +-pi/2.  However, we got the threshold for
  the switch wrong by allowing 19 bits to cancel, so we ended up with
  only 21 or 22 bits of accuracy in some cases, which is even worse than
  naively subtracting pi/2 would have done.
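
  As a rough illustration of the extra-precision subtraction described
  above, here is a minimal sketch (not the committed code).  It assumes
  strict float evaluation, derives the 17-bit head and 24-bit tail from
  a double pi/2 at run time instead of using the file's constants, and
  all names (trunc17, reduce_naive, reduce_17_24, reduction_error) are
  hypothetical.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Double-precision pi/2: source of the pieces and reference value. */
static const double PIO2_D = 1.57079632679489661923;

/* Clear the low 7 bits of a float, leaving 17 significant bits. */
static float trunc17(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    u &= UINT32_C(0xffffff80);
    memcpy(&f, &u, sizeof u);
    return f;
}

/* Naive reduction: subtract pi/2 rounded to a single float. */
static float reduce_naive(float x)
{
    return x - (float)PIO2_D;
}

/*
 * 17+24-bit reduction.  x - hi is exact when x is within a factor of
 * 2 of hi; the tail is then subtracted with a single rounding, and
 * *tail_out receives an estimate of what that rounding lost.
 */
static float reduce_17_24(float x, float *tail_out)
{
    float hi = trunc17((float)PIO2_D);        /* 17-bit head */
    float lo = (float)(PIO2_D - (double)hi);  /* 24-bit tail */
    float z = x - hi;
    float y0 = z - lo;

    *tail_out = (z - y0) - lo;
    return y0;
}

/* Absolute error of a reduced value y against a double reference. */
static double reduction_error(float y, float x)
{
    return fabs((double)y - ((double)x - PIO2_D));
}
```

  For an arg such as 1.5, which is not near pi/2, the split version
  should land closer to the double reference than the naive one, since
  the naive version inherits the full rounding error of a float pi/2.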

  Exhaustive checking shows that allowing only 17 bits to cancel (min.
  accuracy ~24 bits) is sufficient to reduce the maximum error for cosf()
  near +-pi/2 to 0.726 ulps, but allowing only 6 bits to cancel (min.
  accuracy ~35 bits) happens to give perfect rounding for cosf() at
  little extra cost, so we prefer that.

  We actually (in effect) allow 0 bits to cancel and always use the
  17+17+24-bit pi/2 (min. accuracy ~41 bits).  This is simpler and
  probably always more efficient too.  Classifying args to avoid using
  this pi/2 when it is not needed takes several extra integer operations
  and a branch, but just using it takes only 1 FP operation.
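
  The "always use the 17+17+24-bit pi/2" approach might look roughly
  like the following sketch.  It is not the committed code: the three
  pieces are derived from a double pi/2 at run time (so the last piece
  is only as good as the double), strict float evaluation is assumed,
  and the names split_pio2, reduce_17_17_24 and reduce_17_24 are
  hypothetical.  The comparison is meant for args very near pi/2, where
  cancelation is worst.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Double-precision pi/2: source of the pieces and reference value. */
static const double PIO2_D = 1.57079632679489661923;

/* Clear the low 7 bits of a float, leaving 17 significant bits. */
static float trunc17(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    u &= UINT32_C(0xffffff80);
    memcpy(&f, &u, sizeof u);
    return f;
}

/* Split pi/2 into a 17-bit head, a 17-bit middle and a float tail. */
static void split_pio2(float *p1, float *p2, float *p2t)
{
    double r;

    *p1 = trunc17((float)PIO2_D);
    r = PIO2_D - (double)*p1;          /* exact in double */
    *p2 = trunc17((float)r);
    *p2t = (float)(r - (double)*p2);
}

/* Reduction with all three pieces and no classification of the arg. */
static float reduce_17_17_24(float x, float *tail_out)
{
    float p1, p2, p2t, z, y0;

    split_pio2(&p1, &p2, &p2t);
    z = x - p1;
    z -= p2;    /* the 1 extra FP operation; may round if z is not small */
    y0 = z - p2t;
    *tail_out = (z - y0) - p2t;
    return y0;
}

/* Two-piece (17+24-bit) reduction, for comparison only. */
static float reduce_17_24(float x)
{
    float p1, p2, p2t, p1t;

    split_pio2(&p1, &p2, &p2t);
    p1t = (float)(PIO2_D - (double)p1);
    return (x - p1) - p1t;
}

/* Absolute error of a reduced value y against a double reference. */
static double reduction_error(float y, float x)
{
    return fabs((double)y - ((double)x - PIO2_D));
}
```

  At the float nearest pi/2 almost all leading bits cancel, so the
  two-piece result is limited by the rounding error of its tail, while
  the three-piece result should agree with the double reference to
  about float precision.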

  The patch also fixes misspelling of 17 as 24 in many comments.

  For the double-precision version, the magic numbers include 33+53 bits
  for the less-precise pi/2 and (53-32-1 = 20) bits being allowed to
  cancel, so there are ~33-20 = 13 guard bits.  This is sufficient, except
  probably for perfect rounding.  The more-precise pi/2 has 33+33+53
  bits and we still waste time classifying args to avoid using it.

  The bug is apparently from mistranslation of the magic 32 in 53-32-1.
  The number of bits allowed to cancel is not critical and we use 32 for
  double precision because it allows efficient classification using a
  32-bit comparison.  For float precision, we must use an explicit mask,
  and there are fewer bits so there is less margin for error in their
  allocation.  The 32 got reduced to 4 but should have been reduced
  almost in proportion to the reduction of mantissa bits.
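
  To make the role of the mask concrete, here is a hedged sketch of
  classifying a float arg by comparing its leading bits with those of
  pi/2.  The function near_pio2f and its drop_bits parameter are
  hypothetical and only one reading of the description above: dropping
  the low 4 bits compares 19 explicit mantissa bits, while dropping 17
  compares only 6.

```c
#include <stdint.h>
#include <string.h>

/* Bit pattern of the float nearest pi/2. */
#define PIO2F_BITS UINT32_C(0x3fc90fdb)

/* Return the raw IEEE 754 bits of a float. */
static uint32_t float_bits(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    return u;
}

/*
 * Nonzero if |x| matches pi/2 in everything except the low drop_bits
 * bits of the mantissa.  drop_bits must be less than 32.
 */
static int near_pio2f(float x, unsigned drop_bits)
{
    uint32_t mask = ~((UINT32_C(1) << drop_bits) - 1);
    uint32_t ix = float_bits(x) & UINT32_C(0x7fffffff);  /* clear sign */

    return (ix & mask) == (PIO2F_BITS & mask);
}
```

  With drop_bits = 4 only args within a few ulps of +-pi/2 count as
  "very near", so everything else takes the less-precise path; with
  drop_bits = 17 the "very near" class is much wider, e.g. it includes
  1.57 but not 1.5.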

  Revision  Changes    Path
  1.8       +7 -19     src/lib/msun/src/e_rem_pio2f.c