Re: [Mingw-users] [Mingw-w64-public] Math library discrepancies that surprised me.

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

2011/4/30 K. Frank <kfr...@gm...>:
> Hello Kai!
>
> On Sat, Apr 30, 2011 at 5:51 AM, Kai Tietz <kti...@go...> wrote:
>> 2011/4/30 Kai Tietz <kti...@go...>:
>>> 2011/4/30 Greg Chicares <gch...@sb...>:
>>>> On 2011-04-29 19:55Z, K. Frank wrote:
>>>>>
>>>>> By the way, could someone confirm that mingw does use msvcrt for
>>>>> sqrt and pow?
>>>>
>>>> http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/mingw/mingwex/math/?cvsroot=src
>>>>
>>>> Looks to me like sqrt(double) calls into msvcrt. But pow() is a
>>>> different story. Danny implemented it in libmingwex because of
>>>> some quality-of-implementation problem with msvcrt; then people
>>>> complained that the libmingwex version was slower, and I can't
>>>> remember where it wound up, but it's all in the archives.
>>>>
>>>>> sqrt (x) and pow (x, 0.5) ought to give the same result (even if not required
>>>>> to by IEEE-754).
>>>> ...
>>>> Not so many years ago (perhaps when msvcrt was written), it was
>>>> thought that correctly-rounded transcendental functions weren't
>>>> feasible in practice, so library authors had a looser attitude
>>>> toward exactness. If you have an ideal implementation of sqrt()
>>>> but a pow() that's only approximate, should you special-case
>>>> the power of exactly 0.5? If you do, then pow() and sqrt() will
>>>> be consistent, but then the powers
>>>>  x^(0.5*(1-epsilon))
>>>>  x^(0.5            )
>>>>  x^(0.5*(1+epsilon))
>>>> might not be monotone. Some would call that a bad tradeoff.
>>>
>>> Well, I spent some efforts into log, exp, and pow implementation. Also
>>> some other base-math functions I improved in mingw-w64's math library
>>> in two terms. a) Make those functions behave as ISO-C99 specification
>>> tells in corner-cases. and b) Improve accuracy in terms of gcc's
>>> internal used gmp. As we noticed here result differences.
>>>
>>> The following example illustrates the issues pretty well:
>>>
>>> #include <stdio.h>
>>> #include <math.h>
>>>
>>> double abc[3][2] = { { 2.2, 3.1 }, { 2.0, 3.0 }, { 4.0, 0.5 } };
>>> long double ab[3][2] = { { 2.2L, 3.1L }, { 2.0L, 3.0L }, { 4.0L, 0.5L } };
>>> int main()
>>> {
>>>  double r[3];
>>>  long double rl[3];
>>>  int i;
>>>  for (i = 0; i < 3; i++)
>>>  __mingw_printf ("%.20F^%.20F = %.20F\n", abc[i][0], abc[i][1],
>>>         (r[i] = pow (abc[i][0], abc[i][1])));
>>>  __mingw_printf ("%.20F %.20F %.20F\n", pow (2.2, 3.1), pow (2.0,
>>> 3.0), pow (4.0, 0.5));
>>>  r[0] -= pow (2.2, 3.1);
>>>  r[1] -= pow (2.0, 3.0);
>>>  r[2] -= pow (4.0, 0.5);
>>>  __mingw_printf ("%.20F %.20F %.20F\n", r[0], r[1], r[2]);
>>>
>>>  for (i = 0; i < 3; i++)
>>>  __mingw_printf ("%.20LF^%.20LF = %.20LF\n", ab[i][0], ab[i][1],
>>>         (rl[i] = powl (ab[i][0], ab[i][1])));
>>>  __mingw_printf ("%.20LF %.20LF %.20LF\n",
>>>                         powl (2.2L, 3.1L), powl (2.0L, 3.0L) , powl
>>> (4.0L, 0.5L));
>>>  rl[0] -= powl (2.2L, 3.1L);
>>>  rl[1] -= powl (2.0L, 3.0L);
>>>  rl[2] -= powl (4.0L, 0.5L);
>>>  __mingw_printf ("%.20LF %.20LF %.20LF\n", rl[0], rl[1], rl[2]);
>>>  return 1;
>>> }
>
> Do I understand your point correctly that you are comparing gcc's
> compile-time and run-time values for pow?
>
> For the record, when I run this on a mingw64 build:
>
>   gcc (GCC) 4.5.2 20101002 (prerelease) [svn/rev.164902 - mingw-w64/oz]
>
> I get:
>
>   C:\>gcc -o pow_test pow_test.c
>
>   C:\>pow_test
>   2.20000000000000017764^3.10000000000000008882 = 11.52153412678571875460
>   2.00000000000000000000^3.00000000000000000000 = 8.00000000000000000000
>   4.00000000000000000000^0.50000000000000000000 = 2.00000000000000000000
>   11.52153412678571697825 8.00000000000000000000 2.00000000000000000000
>   0.00000000000000177636 0.00000000000000000000 0.00000000000000000000
>   2.20000000000000000004^3.09999999999999999991 = 11.52153412678571414718
>   2.00000000000000000000^3.00000000000000000000 = 8.00000000000000000000
>   4.00000000000000000000^0.50000000000000000000 = 2.00000000000000000000
>   11.52153412678571414718 8.00000000000000000000 2.00000000000000000000
>   0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>
> As I understand it, this means I am seeing a small difference between
> compile and run-time evaluation only for powl (2.2, 3.1), i.e., the
> long-double case.
>
>>> The interesting issue is here that gcc uses for constant
>>> math-calculations gmp-library to get result. For more complex
>>> variants, it uses crt's math.  By this it is important for a gcc based
>>> runtime to be in the IEEE 754 floating point compatible to gmp's
>>> results.
>
> Yes, that's a good point.  I hadn't considered the issue of compile-time
> evaluation, and, ideally, it should agree with run-time evaluation, adding
> yet another constraint to our wish list.
>
> Furthermore, if we take the idea of a cross-compiler seriously, we
> would also like the compile-time evaluations of a cross-compiler and
> native compiler to agree.
>
> Hmm...
>
> So, we would like sqrt (x) and pow (x, 0.5) to agree.  We would like
> compile-time and run-time evaluations to agree.  We would like
> cross-compilers and native compilers to agree.
>
> Taken at face value, this would seem to preclude using the platform
> library and platform hardware to compute things like sqrt and pow
> (unless we know they comply with some agreed-upon standard,
> such as IEEE-754).
>
> Just to make the reasoning explicit, if I perform my run-time evaluation
> on platform A, e.g., using platform A's FPU, then a B-->A cross-compiler
> running on platform B won't have platform A's FPU available to perform
> consistent compile-time evaluations.
>
> So you either give up some of the above-wished-for consistency, you
> forgo the potential performance benefits of using the platform's library
> and / or hardware, or you get everybody to use a common standard.
>
>>> ...
>>> I admit that this might be not that obvious to users, why results here
>>> are different, but for a gcc based toolchain we need to play nice with
>>> gcc's internal assumptions.
>>>
>>> Regards,
>>> Kai
>>
>> On closer look for long double, we can have rouding issues for long
>> double precission. I see that powl (0.01L, 0.5L) we have a rounding
>> difference to sqrtl (0.01L). Interestingly not for float or double
>> types.  So we we might need to special-case in pow the case for y ==
>> 0.5 to solve this.
>
> Be, however, cognizant of Greg's concern.  Forcing pow to agree with
> sqrt just for y = 0.5 has the potential to make pow, in a sense, internally
> inconsistent,  It's a tricky trade-off.  (Ideally, both pow and sqrt would
> be "exact" and therefore agree with one another automatically.)
>
> By the way, I also see a problem with run-time powl for powl (2.2L, 0.5L).
> It disagrees slightly from sqrtl (2.2L) and from the compile-time value
> for powl (2.2L, 0.5L).
>
> I modified your test program to compare sqrt with pow (x, 0.5) (and sqrtl
> with powl).  For the record:
>
>   C:\>gcc --version
>   gcc (GCC) 4.5.2 20101002 (prerelease) [svn/rev.164902 - mingw-w64/oz]
>
>   C:\>gcc -o sqrt_pow sqrt_pow.c
>
>   C:\>sqrt_pow
>   1.04880884817015163080, 1.04880884817015163080, 0.00000000000000000000
>   1.48323969741913264109, 1.48323969741913264109, 0.00000000000000000000
>   1.81659021245849494619, 1.81659021245849494619, 0.00000000000000000000
>   1.04880884817015163080 1.04880884817015163080
>   1.48323969741913264109 1.48323969741913264109
>   1.81659021245849494619 1.81659021245849494619
>   0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>   0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>   1.04880884817015154699, 1.04880884817015154699, 0.00000000000000000000
>   1.48323969741913258980, 1.48323969741913258970, -0.00000000000000000011
>   1.81659021245849499921, 1.81659021245849499921, 0.00000000000000000000
>   1.04880884817015154699 1.04880884817015154699
>   1.48323969741913258980 1.48323969741913258980
>   1.81659021245849499921 1.81659021245849499921
>   0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>   0.00000000000000000000 -0.00000000000000000011 0.00000000000000000000
>
> where sqrt_pow.c is:
>
>   #include <stdio.h>
>   #include <math.h>
>
>   double abc[3] = { 1.1, 2.2, 3.3 };
>   long double abcl[3] = { 1.1L, 2.2L, 3.3L };
>   int main()
>   {
>    double r1[3], r2[3], r3[3];
>    long double rl1[3], rl2[3], rl3[3];
>    int i;
>    for (i = 0; i < 3; i++)
>    __mingw_printf ("%.20F, %.20F, %.20F\n",
>    (r1[i] = sqrt (abc[i])),
>    (r2[i] = pow (abc[i], 0.5)),
>    (r3[i] = pow (abc[i], 0.5) - sqrt (abc[i])));
>    __mingw_printf ("%.20F %.20F\n", sqrt (1.1), pow (1.1, 0.5));
>    __mingw_printf ("%.20F %.20F\n", sqrt (2.2), pow (2.2, 0.5));
>    __mingw_printf ("%.20F %.20F\n", sqrt (3.3), pow (3.3, 0.5));
>    r1[0] -= sqrt (1.1);
>    r1[1] -= sqrt (2.2);
>    r1[2] -= sqrt (3.3);
>    __mingw_printf ("%.20F %.20F %.20F\n", r1[0], r1[1], r1[2]);
>    r2[0] -= pow (1.1, 0.5);
>    r2[1] -= pow (2.2, 0.5);
>    r2[2] -= pow (3.3, 0.5);
>    __mingw_printf ("%.20F %.20F %.20F\n", r2[0], r2[1], r2[2]);
>
>    for (i = 0; i < 3; i++)
>    __mingw_printf ("%.20LF, %.20LF, %.20LF\n",
>    (rl1[i] = sqrtl (abcl[i])),
>    (rl2[i] = powl (abcl[i], 0.5L)),
>    (rl3[i] = powl (abcl[i], 0.5L) - sqrtl (abcl[i])));
>    __mingw_printf ("%.20LF %.20LF\n", sqrtl (1.1L), powl (1.1L, 0.5L));
>    __mingw_printf ("%.20LF %.20LF\n", sqrtl (2.2L), powl (2.2L, 0.5L));
>    __mingw_printf ("%.20LF %.20LF\n", sqrtl (3.3L), powl (3.3L, 0.5L));
>    rl1[0] -= sqrtl (1.1L);
>    rl1[1] -= sqrtl (2.2L);
>    rl1[2] -= sqrtl (3.3L);
>    __mingw_printf ("%.20LF %.20LF %.20LF\n", rl1[0], rl1[1], rl1[2]);
>    rl2[0] -= powl (1.1L, 0.5L);
>    rl2[1] -= powl (2.2L, 0.5L);
>    rl2[2] -= powl (3.3L, 0.5L);
>    __mingw_printf ("%.20LF %.20LF %.20LF\n", rl2[0], rl2[1], rl2[2]);
>
>    return 1;
>   }
>
>>
>> Kai
>
> Thanks, Kai, for your insight and explanations.
>
>
> K. Frank

I add a special case to mingw-w64's pow(f|l) function about y=0.5 and
get now for your adjusted testcase the following results, which are
looking much better IMHO.

$ ./tst.exe
1.04880884817015163080, 1.04880884817015163080, 0.00000000000000004142
1.48323969741913264109, 1.48323969741913264109, -0.00000000000000000857
1.81659021245849494619, 1.81659021245849494619, -0.00000000000000000412
1.04880884817015163080 1.04880884817015163080
1.48323969741913264109 1.48323969741913264109
1.81659021245849494619 1.81659021245849494619
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
1.04880884817015154699, 1.04880884817015154699, 0.00000000000000000000
1.48323969741913258980, 1.48323969741913258980, 0.00000000000000000000
1.81659021245849499921, 1.81659021245849499921, 0.00000000000000000000
1.04880884817015154699 1.04880884817015154699
1.48323969741913258980 1.48323969741913258980
1.81659021245849499921 1.81659021245849499921
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000

I used for this a 32-bit variant and 64-bit variant, which made no
differences in output.
The issue here is that 1.0 branch still uses the inaccurate variant
from Danny, but nowaday's trunk uses for pow, sqrt, log, and exp
functions new implementations, which are treating rounding and
short-path calculation in an ISO-C variant (as gcc expects internally
such a math). Also complex-math part is completely rewritten on
mingw-w64's trunk version.

IMHO it might be a good idea to replace step by step the classical
runtime functions, so that they are not using here msvcrt in first
place. Of course the import-symbols of msvcrt (for compatibility with
VC generated objects need to remain in .def exports, but the actual
functions should come directly from mingwex.

Re: [Mingw-users] [Mingw-w64-public] Math library discrepancies that surprised me.

A native Windows port of the GNU Compiler Collection (GCC)

Re: [Mingw-users] [Mingw-w64-public] Math library discrepancies that surprised me.