From: Arthur Norman <acn1@ca...> - 2011-04-29 13:54:40

The following short program compares the value of pow(x, 0.5) to that
returned by sqrt(x) for a range of values of x, and prints out the first
case where the two do not agree. With the mingw family of compilers I had
expected the Microsoft C library to be in use.

What I appear to see is
   linux (ubuntu 11.04 64-bit tested)   no output
   Mac (snow leopard)                   no output
   gcc (cygwin)                         no output
   gcc-3 -mno-cygwin      )
   i686-w64-mingw32-gcc   )  all different from each other
   x86_64-w64-mingw32-gcc )  but all find discrepancies.

It is a totally fair cop if the Microsoft C library gives different
results in the last bit for floating-point elementary functions than the
gnu libraries do, but is there an obvious reason why the three mingw
variants all give different answers? This showed up in regression testing
something much larger across platforms.

Arthur
====================
#include <stdio.h>
#include <math.h>

int main(int argc, char *argv[])
{   double a, b, c;
    int i;
    for (i=0, a=1.0; i<10000000; i++)
    {   a *= 1.00000001;
        b = sqrt(a); c = pow(a, 0.5);
        if (b == c) continue;
        printf("%.17g %.17g %.17g\n", a, b, c);
        return 0;
    }
    return 0;
}
====================

acn1@... temp
$ gcc expt.c -o expt
acn1@... temp
$ ./expt
acn1@... temp
$ gcc-3 -mno-cygwin expt.c -o expt
acn1@... temp
$ ./expt
1.0241210164238141 1.011988644414459 1.0119886444144588
acn1@... temp
$ i686-w64-mingw32-gcc expt.c -o expt
acn1@... temp
$ ./expt
1.0004603659307831 1.000230156479389 1.0002301564793892
acn1@... temp
$ x86_64-w64-mingw32-gcc expt.c -o expt
acn1@... temp
$ ./expt
1.0000179101601867 1.0000089550399969 1.0000089550399971
acn1@... temp
==============================================================
From: K. Frank <kfrank29.c@gm...> - 2011-04-29 19:56:00

Hello Arthur!

(I've taken the liberty of copying this to the mingw list, as well.)

On Fri, Apr 29, 2011 at 9:54 AM, Arthur Norman wrote:
> The following short program compares the value of pow(x, 0.5) to that
> returned by sqrt(x) for a range of values of x, and prints out any first
> case where the two do not agree.

Well, this is unfortunate. Ideally they should agree, and as a
quality-of-implementation issue, I think we should be able to expect
them to.

> With the mingw family of compilers I had
> expected the Microsoft C library to be in use.
>
> What I appear to see is
>    linux (ubuntu 11.04 64-bit tested)   no output
>    Mac (snow leopard)                   no output
>    gcc (cygwin)                         no output
>    gcc-3 -mno-cygwin      )
>    i686-w64-mingw32-gcc   )  all different from each other
>    x86_64-w64-mingw32-gcc )  but all find discrepancies.

Very interesting.

> It is a totally fair cop if the Microsoft C library gives different
> results in the last bit for floating-point elementary functions to gnu
> libraries,

Well, from the mingw perspective, if the Microsoft runtime library is
flawed in this way, then it's a legitimate cop for mingw not to fix it.

By the way, could someone confirm that mingw does use msvcrt for
sqrt and pow?

> but is there an obvious reason why the three mingw variants
> all give different answers.

If I understand your point correctly, if the problem is with msvcrt,
then the three mingw variants should all give the same flawed results.

> This showed up in regression testing something
> much larger across platforms.
>
> Arthur
> ...

As I understand it, IEEE 754 compliance requires sqrt to be calculated
"correctly", i.e., to return the floating-point number closest to the
exact mathematical value. (The C++ standard, I believe, does not,
however, require IEEE 754 compliance.) I think, however, that IEEE 754
does not require all mathematical functions, and, in particular, pow,
to be "correct" in the above sense.
It would be interesting to print out in binary (or hex) the results for
a few of the values that lead to discrepancies with both mingw and a
linux-based gcc, and see whether the problem is in sqrt (for which
IEEE 754 specifies the result) or in pow (for which I think IEEE 754
permits some leeway).

In my opinion:

   mingw / msvcrt should be IEEE 754 compliant. Of course, if msvcrt
   isn't, then it's not mingw's fault.

   sqrt (x) and pow (x, 0.5) ought to give the same result (even if not
   required to by IEEE 754).

Browsing some gcc lists, I did see some comments that suggest that
gcc (on linux) does try to transform pow (x, 0.5) to sqrt (x). This
would make sqrt and pow consistent (whether or not technically correct).
Maybe gcc / linux pow and msvcrt pow are both not quite exact, but this
is hidden in your test, because gcc / linux replaces pow (x, 0.5) with
sqrt (x), and msvcrt doesn't. (Just speculation.)

It is odd, however, that the various mingw's don't agree, and this
speaks to a possible bug in mingw.

Thanks.

K. Frank

> ...
From: Greg Chicares <gchicares@sb...> - 2011-04-29 23:38:10

On 2011-04-29 19:55Z, K. Frank wrote:
>
> By the way, could someone confirm that mingw does use msvcrt for
> sqrt and pow?

http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/mingw/mingwex/math/?cvsroot=src

Looks to me like sqrt(double) calls into msvcrt. But pow() is a
different story. Danny implemented it in libmingwex because of some
quality-of-implementation problem with msvcrt; then people complained
that the libmingwex version was slower, and I can't remember where it
wound up, but it's all in the archives.

> sqrt (x) and pow (x, 0.5) ought to give the same result (even if not required
> to by IEEE 754).

If they should give exactly the same result always, then the first
thing that needs to be fixed is any standard that allows them to
differ. The language standards are so much looser than the numerical
standard, even now, that I'd rather see C and C++ catch up to
IEEE 754-1985 before asking them to go beyond it.

> Browsing some gcc lists, I did see some comments that suggest that
> gcc (on linux) does try to transform pow (x, 0.5) to sqrt (x). This would
> make sqrt and pow consistent (whether or not technically correct).

Not so many years ago (perhaps when msvcrt was written), it was
thought that correctly-rounded transcendental functions weren't
feasible in practice, so library authors had a looser attitude toward
exactness. If you have an ideal implementation of sqrt() but a pow()
that's only approximate, should you special-case the power of exactly
0.5? If you do, then pow() and sqrt() will be consistent, but then the
powers
   x^(0.5*(1-epsilon))
   x^(0.5            )
   x^(0.5*(1+epsilon))
might not be monotone. Some would call that a bad trade-off.
From: K. Frank <kfrank29.c@gm...> - 2011-04-30 00:51:07

Hi Greg!

On Fri, Apr 29, 2011 at 7:38 PM, Greg Chicares <gchicares@...> wrote:
> On 2011-04-29 19:55Z, K. Frank wrote:
>>
>> By the way, could someone confirm that mingw does use msvcrt for
>> sqrt and pow?
>
> http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/mingw/mingwex/math/?cvsroot=src
>
> Looks to me like sqrt(double) calls into msvcrt. But pow() is a
> different story. Danny implemented it in libmingwex because of
> some quality-of-implementation problem with msvcrt; then people
> complained that the libmingwex version was slower, and I can't
> remember where it wound up, but it's all in the archives.

Thanks for that - good to know.

From the source link you gave:

http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/mingw/mingwex/math/pow.c?annotate=1.1.10.1&cvsroot=src

the code for pow (the regular double pow (double, double)) has these
comments about accuracy:

 * ACCURACY:
 *
 * Relative error:
 *      arithmetic   domain     # trials      peak         rms
 *         IEEE     -26,26       30000      4.2e-16      7.7e-17
 *         DEC      -26,26       60000      4.8e-17      9.1e-18
 *      1/26 < x < 26, with log(x) uniformly distributed.
 *      -26 < y < 26, y uniformly distributed.
 *         IEEE     0,8700       30000      1.5e-14      2.1e-15
 *      0.99 < x < 1.01, 0 < y < 8700, uniformly distributed.

This commentary suggests that pow is not "exact." That would reasonably
explain Arthur seeing pow not match sqrt. But it still leaves the puzzle
of why the different mingw versions don't behave the same.

One minor puzzle: I don't see anything in the pow code that would let
the compiler / library take advantage of hardware support for pow,
which I think has been available in the x86 architecture for quite
some time now.

>> sqrt (x) and pow (x, 0.5) ought to give the same result (even if not required
>> to by IEEE 754).
>
> If they should give exactly the same result always, then the
> first thing that needs to be fixed is any standard that allows
> them to differ.

Well, it's a question of engineering compromise whether they should
always agree.
In an ideal world, it would be nice if pow were "exact" (in the sense
of returning the closest floating-point number). Then, if pow were
"exact," pow and sqrt would agree. But there is expense in getting pow
to be "exact," so it's arguable that the language / IEEE standards
should give implementers some flexibility. (And, as you point out
below, if pow isn't exact, then forcing pow and sqrt to agree could
actually cause other problems.)

> The language standards are so much looser than
> the numerical standard, even now, that I'd rather see C and C++
> catch up to IEEE 754-1985 before asking them to go beyond.
>
>> Browsing some gcc lists, I did see some comments that suggest that
>> gcc (on linux) does try to transform pow (x, 0.5) to sqrt (x). This would
>> make sqrt and pow consistent (whether or not technically correct).
>
> Not so many years ago (perhaps when msvcrt was written), it was
> thought that correctly-rounded transcendental functions weren't
> feasible in practice, so library authors had a looser attitude
> toward exactness. If you have an ideal implementation of sqrt()
> but a pow() that's only approximate, should you special-case
> the power of exactly 0.5? If you do, then pow() and sqrt() will
> be consistent, but then the powers
>    x^(0.5*(1-epsilon))
>    x^(0.5            )
>    x^(0.5*(1+epsilon))
> might not be monotone. Some would call that a bad trade-off.

I actually agree with your concern. The transformation of pow (x, 0.5)
to sqrt (x) that it sounded like gcc / linux was doing sounds like a
bad idea if it's being done this simply and naively.

Any speculation on why Arthur gets different results with different
mingw's?

Thanks for your perspective.

K. Frank
From: Kai Tietz <ktietz70@go...> - 2011-04-30 08:42:15

2011/4/30 Greg Chicares <gchicares@...>:
> On 2011-04-29 19:55Z, K. Frank wrote:
>>
>> By the way, could someone confirm that mingw does use msvcrt for
>> sqrt and pow?
>
> ...
>
> Not so many years ago (perhaps when msvcrt was written), it was
> thought that correctly-rounded transcendental functions weren't
> feasible in practice, so library authors had a looser attitude
> toward exactness. If you have an ideal implementation of sqrt()
> but a pow() that's only approximate, should you special-case
> the power of exactly 0.5? If you do, then pow() and sqrt() will
> be consistent, but then the powers
>    x^(0.5*(1-epsilon))
>    x^(0.5            )
>    x^(0.5*(1+epsilon))
> might not be monotone. Some would call that a bad trade-off.

Well, I spent some effort on the log, exp, and pow implementations.
I also improved some other base math functions in mingw-w64's math
library, in two respects: a) make those functions behave as the ISO C99
specification tells in corner cases, and b) improve accuracy relative to
the gmp that gcc uses internally. As we noticed here, result
differences.

The following example illustrates the issues pretty well:

#include <stdio.h>
#include <math.h>

double abc[3][2] = { { 2.2, 3.1 }, { 2.0, 3.0 }, { 4.0, 0.5 } };
long double ab[3][2] = { { 2.2L, 3.1L }, { 2.0L, 3.0L }, { 4.0L, 0.5L } };

int main()
{
  double r[3];
  long double rl[3];
  int i;
  for (i = 0; i < 3; i++)
    __mingw_printf ("%.20F^%.20F = %.20F\n", abc[i][0], abc[i][1],
                    (r[i] = pow (abc[i][0], abc[i][1])));
  __mingw_printf ("%.20F %.20F %.20F\n", pow (2.2, 3.1), pow (2.0, 3.0),
                  pow (4.0, 0.5));
  r[0] = pow (2.2, 3.1);
  r[1] = pow (2.0, 3.0);
  r[2] = pow (4.0, 0.5);
  __mingw_printf ("%.20F %.20F %.20F\n", r[0], r[1], r[2]);

  for (i = 0; i < 3; i++)
    __mingw_printf ("%.20LF^%.20LF = %.20LF\n", ab[i][0], ab[i][1],
                    (rl[i] = powl (ab[i][0], ab[i][1])));
  __mingw_printf ("%.20LF %.20LF %.20LF\n",
                  powl (2.2L, 3.1L), powl (2.0L, 3.0L), powl (4.0L, 0.5L));
  rl[0] = powl (2.2L, 3.1L);
  rl[1] = powl (2.0L, 3.0L);
  rl[2] = powl (4.0L, 0.5L);
  __mingw_printf ("%.20LF %.20LF %.20LF\n", rl[0], rl[1], rl[2]);
  return 1;
}

The interesting issue here is that gcc uses the gmp library for
constant math calculations to get the result. For more complex
variants, it uses the crt's math. Because of this, it is important for
a gcc-based runtime to be IEEE 754 floating-point compatible with gmp's
results.

Btw, the new pow, log, and exp variants are slightly faster than the
variants in msvcrt, but this is more or less just a side-effect.

I admit that it might not be that obvious to users why results here
are different, but for a gcc-based toolchain we need to play nice with
gcc's internal assumptions.

Regards,
Kai
From: Kai Tietz <ktietz70@go...> - 2011-04-30 09:51:39

2011/4/30 Kai Tietz <ktietz70@...>:
> 2011/4/30 Greg Chicares <gchicares@...>:
>> ...
>
> Well, I spent some efforts into log, exp, and pow implementation.
> ...
>
> I admit that this might be not that obvious to users, why results here
> are different, but for a gcc based toolchain we need to play nice with
> gcc's internal assumptions.
>
> Regards,
> Kai

On closer look, for long double we can have rounding issues at long
double precision. I see that for powl (0.01L, 0.5L) we have a rounding
difference from sqrtl (0.01L). Interestingly, not for the float or
double types. So we might need to special-case in pow the case y == 0.5
to solve this.

Kai
From: K. Frank <kfrank29.c@gm...> - 2011-04-30 15:15:19

Hello Kai!

On Sat, Apr 30, 2011 at 5:51 AM, Kai Tietz <ktietz70@...> wrote:
> 2011/4/30 Kai Tietz <ktietz70@...>:
>> 2011/4/30 Greg Chicares <gchicares@...>:
>>> ...
>>> Not so many years ago (perhaps when msvcrt was written), it was
>>> thought that correctly-rounded transcendental functions weren't
>>> feasible in practice, so library authors had a looser attitude
>>> toward exactness. ...
>>
>> Well, I spent some efforts into log, exp, and pow implementation. Also
>> some other base math functions I improved in mingw-w64's math library
>> in two terms: a) make those functions behave as the ISO C99
>> specification tells in corner cases, and b) improve accuracy in terms
>> of gcc's internally used gmp. As we noticed here, result differences.
>>
>> The following example illustrates the issues pretty well:
>> ...

Do I understand your point correctly that you are comparing gcc's
compile-time and run-time values for pow?
For the record, when I run this on a mingw64 build:

gcc (GCC) 4.5.2 20101002 (prerelease) [svn/rev.164902 - mingw-w64/oz]

I get:

C:\>gcc -o pow_test pow_test.c

C:\>pow_test
2.20000000000000017764^3.10000000000000008882 = 11.52153412678571875460
2.00000000000000000000^3.00000000000000000000 = 8.00000000000000000000
4.00000000000000000000^0.50000000000000000000 = 2.00000000000000000000
11.52153412678571697825 8.00000000000000000000 2.00000000000000000000
0.00000000000000177636 0.00000000000000000000 0.00000000000000000000
2.20000000000000000004^3.09999999999999999991 = 11.52153412678571414718
2.00000000000000000000^3.00000000000000000000 = 8.00000000000000000000
4.00000000000000000000^0.50000000000000000000 = 2.00000000000000000000
11.52153412678571414718 8.00000000000000000000 2.00000000000000000000
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000

As I understand it, this means I am seeing a small difference between
compile-time and run-time evaluation only for powl (2.2, 3.1), i.e.,
the long-double case.

>> The interesting issue here is that gcc uses the gmp library for
>> constant math calculations to get the result. For more complex
>> variants, it uses the crt's math. Because of this, it is important
>> for a gcc-based runtime to be IEEE 754 floating-point compatible
>> with gmp's results.

Yes, that's a good point. I hadn't considered the issue of
compile-time evaluation, and, ideally, it should agree with run-time
evaluation, adding yet another constraint to our wish list.

Furthermore, if we take the idea of a cross-compiler seriously, we
would also like the compile-time evaluations of a cross-compiler and a
native compiler to agree. Hmm...

So, we would like sqrt (x) and pow (x, 0.5) to agree. We would like
compile-time and run-time evaluations to agree. We would like
cross-compilers and native compilers to agree.
Taken at face value, this would seem to preclude using the platform
library and platform hardware to compute things like sqrt and pow
(unless we know they comply with some agreed-upon standard, such as
IEEE 754). Just to make the reasoning explicit: if I perform my
run-time evaluation on platform A, e.g., using platform A's FPU, then
a B->A cross-compiler running on platform B won't have platform A's
FPU available to perform consistent compile-time evaluations.

So you either give up some of the above wished-for consistency, you
forgo the potential performance benefits of using the platform's
library and/or hardware, or you get everybody to use a common standard.

>> ...
>> I admit that this might be not that obvious to users, why results here
>> are different, but for a gcc based toolchain we need to play nice with
>> gcc's internal assumptions.
>>
>> Regards,
>> Kai
>
> On closer look, for long double we can have rounding issues at long
> double precision. I see that for powl (0.01L, 0.5L) we have a rounding
> difference from sqrtl (0.01L). Interestingly, not for the float or
> double types. So we might need to special-case in pow the case y ==
> 0.5 to solve this.

Be, however, cognizant of Greg's concern. Forcing pow to agree with
sqrt just for y = 0.5 has the potential to make pow, in a sense,
internally inconsistent. It's a tricky trade-off. (Ideally, both pow
and sqrt would be "exact" and therefore agree with one another
automatically.)

By the way, I also see a problem with run-time powl for powl (2.2L,
0.5L). It disagrees slightly with sqrtl (2.2L) and with the
compile-time value for powl (2.2L, 0.5L).

I modified your test program to compare sqrt with pow (x, 0.5) (and
sqrtl with powl).
For the record:

C:\>gcc --version
gcc (GCC) 4.5.2 20101002 (prerelease) [svn/rev.164902 - mingw-w64/oz]

C:\>gcc -o sqrt_pow sqrt_pow.c

C:\>sqrt_pow
1.04880884817015163080, 1.04880884817015163080, 0.00000000000000000000
1.48323969741913264109, 1.48323969741913264109, 0.00000000000000000000
1.81659021245849494619, 1.81659021245849494619, 0.00000000000000000000
1.04880884817015163080 1.04880884817015163080
1.48323969741913264109 1.48323969741913264109
1.81659021245849494619 1.81659021245849494619
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
1.04880884817015154699, 1.04880884817015154699, 0.00000000000000000000
1.48323969741913258980, 1.48323969741913258970, -0.00000000000000000011
1.81659021245849499921, 1.81659021245849499921, 0.00000000000000000000
1.04880884817015154699 1.04880884817015154699
1.48323969741913258980 1.48323969741913258980
1.81659021245849499921 1.81659021245849499921
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
0.00000000000000000000 -0.00000000000000000011 0.00000000000000000000

where sqrt_pow.c is:

#include <stdio.h>
#include <math.h>

double abc[3] = { 1.1, 2.2, 3.3 };
long double abcl[3] = { 1.1L, 2.2L, 3.3L };

int main()
{
  double r1[3], r2[3], r3[3];
  long double rl1[3], rl2[3], rl3[3];
  int i;
  for (i = 0; i < 3; i++)
    __mingw_printf ("%.20F, %.20F, %.20F\n",
                    (r1[i] = sqrt (abc[i])),
                    (r2[i] = pow (abc[i], 0.5)),
                    (r3[i] = pow (abc[i], 0.5) - sqrt (abc[i])));
  __mingw_printf ("%.20F %.20F\n", sqrt (1.1), pow (1.1, 0.5));
  __mingw_printf ("%.20F %.20F\n", sqrt (2.2), pow (2.2, 0.5));
  __mingw_printf ("%.20F %.20F\n", sqrt (3.3), pow (3.3, 0.5));
  r1[0] = sqrt (1.1);
  r1[1] = sqrt (2.2);
  r1[2] = sqrt (3.3);
  __mingw_printf ("%.20F %.20F %.20F\n", r1[0], r1[1], r1[2]);
  r2[0] = pow (1.1, 0.5);
  r2[1] = pow (2.2, 0.5);
  r2[2] = pow (3.3, 0.5);
  __mingw_printf ("%.20F %.20F %.20F\n", r2[0], r2[1], r2[2]);

  for (i = 0; i < 3; i++)
    __mingw_printf ("%.20LF, %.20LF, %.20LF\n",
                    (rl1[i] = sqrtl (abcl[i])),
                    (rl2[i] = powl (abcl[i], 0.5L)),
                    (rl3[i] = powl (abcl[i], 0.5L) - sqrtl (abcl[i])));
  __mingw_printf ("%.20LF %.20LF\n", sqrtl (1.1L), powl (1.1L, 0.5L));
  __mingw_printf ("%.20LF %.20LF\n", sqrtl (2.2L), powl (2.2L, 0.5L));
  __mingw_printf ("%.20LF %.20LF\n", sqrtl (3.3L), powl (3.3L, 0.5L));
  rl1[0] = sqrtl (1.1L);
  rl1[1] = sqrtl (2.2L);
  rl1[2] = sqrtl (3.3L);
  __mingw_printf ("%.20LF %.20LF %.20LF\n", rl1[0], rl1[1], rl1[2]);
  rl2[0] = powl (1.1L, 0.5L);
  rl2[1] = powl (2.2L, 0.5L);
  rl2[2] = powl (3.3L, 0.5L);
  __mingw_printf ("%.20LF %.20LF %.20LF\n", rl2[0], rl2[1], rl2[2]);
  return 1;
}

> Kai

Thanks, Kai, for your insight and explanations.

K. Frank
From: K. Frank <kfrank29.c@gm...> - 2011-04-30 22:13:15

Hello Bob and Kai!

On Sat, Apr 30, 2011 at 5:41 PM, Kai Tietz <ktietz70@...> wrote:
> 2011/4/30 Bob Delaney <delaneyrm@...>:
>>
>> On Apr 30, 2011, at 10:15 AM, K. Frank wrote:
>>
>>> I get:
>>>
>>> C:\>gcc -o pow_test pow_test.c
>>>
>>> C:\>pow_test
>>> 2.20000000000000017764^3.10000000000000008882 = 11.52153412678571875460
>>
>> Victor Shoup's NTL and my own fp, to 22-digit accuracy, give:
>>
>> 2.20000000000000017764^3.10000000000000008882 = 11.52153412678571783832
>>
>> so your result is accurate to only 16 digits.

This is exactly what I would expect (short of looking at the actual
binary representation). An IEEE double is 64 bits, with a 52-bit
mantissa, which, with the "implicit" bit, gives 53 bits of precision.
This gives just under 16 decimal digits of precision, which is what
you see.

>> Your
>> 2.20000000000000000004^3.09999999999999999991 = 11.52153412678571414718
>>
>> should be:
>>
>> 2.20000000000000000004^3.09999999999999999991 = 11.52153412678571414732
>>
>> so your result is accurate to about 20 digits.

As I understand it, gcc defines long double as the x87 80-bit floating
point number. This has a 64-bit mantissa, and I believe that it does
not use an implicit bit, so this gives slightly more than 19 decimal
digits of precision, again consistent with your result.

>> Bob
>
> Well, that is to be expected, as we used %.20F in the test
> application. By specifying a %.40F instead, its calculated result is
> 11.5215341267857141471786519559827866032720, and it has binary
> identity with what gmp calculates for 80-bit IEEE 754 floating point.
> Which means that for our purposes in gcc, it has the right accuracy.

(Kai's comment seems to confirm that gcc's long double is, indeed, the
80-bit floating-point number.)

> Kai

Thanks to all for the insight.

K. Frank
From: JonY <jon_y@us...> - 2011-05-01 13:48:59
Attachments:
Message as HTML
signature.asc

From: James K Beard <beardjamesk@ve...> - 2011-04-29 21:58:10

The standard process for computation of sqrt(x) has been the use of Newton's method. This means that an approximation to SQRT(x) is found by whatever means and the iteration result = (result_old^2+x)/(2*result_old) Repeated until result = result_old. A check isn't necessary because a maximum iteration count can be obtained from the worstcase accuracy of the initial approximation. Thus there isn't any reason that the librarys should vary in their results. A bit or so may result from the nonalgebraicallyreduced form of the iteration, result = result_old + (x  result_old^2)/(2*result_old) which some of little faith use because the second term can be tested for zero  or near zero. Don't go there. In any case this nonreduced form is less accurate. All this works for complex numbers, too. Watch out for the sign convention for complex numbers, though. James K Beard Original Message From: K. Frank [mailto:kfrank29.c@...] Sent: Friday, April 29, 2011 3:56 PM To: mingw64; MinGW Users List Subject: Re: [Mingww64public] Math library discrepancies that surprised me. Hello Arthur! (I've take the liberty of copying this to the mingw list, as well.) On Fri, Apr 29, 2011 at 9:54 AM, Arthur Norman wrote: > The following short program compares the value of pow(x, 0.5) to that > returned by sqrt(x) for a range of values of x, and prints out any first > case where the two do not agree. Well, this is unfortunate. Ideally they should, and as a qualityofimplementation issue, I think we should be able to expect them to agree. > With the mingw family of compilers I had > expected the Microsoft C library to be in use. > > What I appear to see is > linux (ubuntu 11.04 64bit tested) no output > Mac (snow leopard) no output > gcc (cygwin) no output > gcc3 mnocygwin ) > i686w64mingw32gcc ) all different from each other > x86_64w64mingw32gcc ) but all find discrepancies. Very interesting. 
> It is a totally fair cop if the Microsoft C library gives different
> results in the last bit for floating point elementary functions to gnu
> libraries,

Well, from the mingw perspective, if the Microsoft runtime library is
flawed in this way, then it's a legitimate cop for mingw not to fix it.

By the way, could someone confirm that mingw does use msvcrt for sqrt
and pow?

> but is there an obvious reason why the three mingw variants
> all give different answers.

If I understand your point correctly, if the problem is with msvcrt, then
the three mingw variants should all give the same flawed results.

> This showed up in regression testing something
> much larger across platforms.
>
> Arthur
> ...

As I understand it, IEEE 754 compliance requires sqrt to be calculated
"correctly", i.e., to return the floating-point number closest to the
exact mathematical value. (The C++ standard, I believe, does not, however,
require IEEE 754 compliance.) I think, however, that IEEE 754 does not
require all mathematical functions, and, in particular, pow, to be
"correct" in the above sense.

It would be interesting to print out in binary (or hex) the results for a
few of the values that lead to discrepancies with both mingw and a
linux-based gcc, and see whether the problem is in sqrt (for which
IEEE 754 specifies the result) or in pow (for which I think IEEE 754
permits some leeway).

In my opinion:

mingw / msvcrt should be IEEE 754 compliant. Of course, if msvcrt isn't,
then it's not mingw's fault.

sqrt (x) and pow (x, 0.5) ought to give the same result (even if not
required to by IEEE 754).

Browsing some gcc lists, I did see some comments that suggest that gcc
(on linux) does try to transform pow (x, 0.5) to sqrt (x). This would
make sqrt and pow consistent (whether or not technically correct).

Maybe gcc / linux pow and msvcrt pow are both not quite exact, but this
is hidden in your test, because gcc / linux replaces pow (x, 0.5) with
sqrt (x), and msvcrt doesn't. (Just speculation.)
It is odd, however, that the various mingw's don't agree, and this speaks
to a possible bug in mingw.

Thanks.

K. Frank

> ...
> ====================
> #include <stdio.h>
> #include <math.h>
>
> int main(int argc, char *argv[])
> {
>     double a, b, c;
>     int i;
>     for (i=0, a=1.0; i<10000000; i++)
>     {   a *= 1.00000001;
>         b = sqrt(a); c = pow(a, 0.5);
>         if (b == c) continue;
>         printf("%.17g %.17g %.17g\n", a, b, c);
>         return 0;
>     }
>     return 0;
> }
> ====================
>
> acn1@... temp
> $ gcc expt.c -o expt
> acn1@... temp
> $ ./expt
> acn1@... temp
> $ gcc-3 -mno-cygwin expt.c -o expt
> acn1@... temp
> $ ./expt
> 1.0241210164238141 1.011988644414459 1.0119886444144588
> acn1@... temp
> $ i686-w64-mingw32-gcc expt.c -o expt
> acn1@... temp
> $ ./expt
> 1.0004603659307831 1.000230156479389 1.0002301564793892
> acn1@... temp
> $ x86_64-w64-mingw32-gcc expt.c -o expt
> acn1@... temp
> $ ./expt
> 1.0000179101601867 1.0000089550399969 1.0000089550399971
> acn1@... temp
> ==============================================================

_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@...
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
From: James K Beard <beardjamesk@ve...> - 2011-04-30 01:38:21

The standard process for computation of sqrt(x) has been the use of
Newton's method. This means that an approximation to sqrt(x) is found by
whatever means (but it is important to get good accuracy) and the
iteration

    result = (result_old^2 + x) / (2*result_old)

is repeated until result == result_old. A check for equality isn't
necessary, because a maximum iteration count can be obtained from the
worst-case accuracy of the initial approximation. Thus there isn't any
reason that the libraries should vary in their results. A bit or so of
difference may result from the non-algebraically-reduced form of the
iteration,

    result = result_old + (x - result_old^2) / (2*result_old)

which some of little faith use because the second term can be tested for
zero, or near zero. Don't go there. This form is less accurate than the
shorter form.

All this works for complex numbers, too. Watch out for the sign convention
for complex numbers, though.

Regarding the log-and-exponentiation approach, x^a = exp(a*log(x)): you
have rounding error in two operations, as well as inherent accuracy
limits, meaning that if you don't have a few guard bits you aren't going
to get full mantissa accuracy. Newton's method will get you within a count
or two of the LSB, and with any guard bits at all will get you full
mantissa accuracy every time. Why? K digits/bits/whatever in x define
about 2K digits/bits/whatever in sqrt(x), while this isn't true for log(*)
and exp(*).

James K Beard

-----Original Message-----
From: K. Frank [mailto:kfrank29.c@...]
Sent: Friday, April 29, 2011 8:51 PM
To: mingw64; MinGW Users List
Subject: Re: [Mingw-w64-public] [Mingw-users] Math library discrepancies that surprised me.

Hi Greg!

On Fri, Apr 29, 2011 at 7:38 PM, Greg Chicares <gchicares@...> wrote:
> On 2011-04-29 19:55Z, K. Frank wrote:
>>
>> By the way, could someone confirm that mingw does use msvcrt for
>> sqrt and pow?
>
> http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/mingw/mingwex/math/?cvsroot=src
>
> Looks to me like sqrt(double) calls into msvcrt. But pow() is a
> different story. Danny implemented it in libmingwex because of
> some quality-of-implementation problem with msvcrt; then people
> complained that the libmingwex version was slower, and I can't
> remember where it wound up, but it's all in the archives.

Thanks for that -- good to know.

From the source link you gave:

http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/mingw/mingwex/math/pow.c?annotate=1.1.10.1&cvsroot=src

the code for pow (the regular double pow (double, double)) has these
comments about accuracy:

28: * ACCURACY:
29: *
30: * Relative error:
31: *    arithmetic  domain    # trials  peak     rms
32: *    IEEE        -26,26    30000     4.2e-16  7.7e-17
33: *    DEC         -26,26    60000     4.8e-17  9.1e-18
34: *    1/26 < x < 26, with log(x) uniformly distributed.
35: *    -26 < y < 26, y uniformly distributed.
36: *    IEEE        0,8700    30000     1.5e-14  2.1e-15
37: *    0.99 < x < 1.01, 0 < y < 8700, uniformly distributed.

This commentary suggests that pow is not "exact." This would reasonably
explain Arthur seeing pow not match sqrt. But it still leaves the puzzle
of why the different mingw versions don't behave the same.

One minor puzzle: I don't see anything in the pow code that would let the
compiler / library take advantage of hardware support for pow, which I
think has been available in the x86 architecture for quite some time now.

>
>> sqrt (x) and pow (x, 0.5) ought to give the same result (even if not
>> required to by IEEE 754).
>
> If they should give exactly the same result always, then the
> first thing that needs to be fixed is any standard that allows
> them to differ.

Well, it's a question of engineering compromise whether they should always
agree. In an ideal world, it would be nice if pow were "exact" (in the
sense of returning the closest floating-point number). Then, if pow were
"exact," pow and sqrt would agree.
But there is expense in getting pow to be "exact," so it's arguable that
the language / IEEE standards should give implementers some flexibility.
(And, as you point out below, if pow isn't exact, then forcing pow and
sqrt to agree could actually cause other problems.)

> The language standards are so much looser than
> the numerical standard, even now, that I'd rather see C and C++
> catch up to IEEE 754-1985 before asking them to go beyond.
>
>> Browsing some gcc lists, I did see some comments that suggest that
>> gcc (on linux) does try to transform pow (x, 0.5) to sqrt (x). This
>> would make sqrt and pow consistent (whether or not technically correct).
>
> Not so many years ago (perhaps when msvcrt was written), it was
> thought that correctly-rounded transcendental functions weren't
> feasible in practice, so library authors had a looser attitude
> toward exactness. If you have an ideal implementation of sqrt()
> but a pow() that's only approximate, should you special-case
> the power of exactly 0.5? If you do, then pow() and sqrt() will
> be consistent, but then the powers
>     x^(0.5*(1-epsilon))
>     x^(0.5)
>     x^(0.5*(1+epsilon))
> might not be monotone. Some would call that a bad tradeoff.

I actually agree with your concern. The transformation of pow (x, 0.5)
to sqrt (x) that it sounded like gcc / linux was doing sounds like a bad
idea if it's being done this simply and naively.

Any speculation on why Arthur gets different results with different
mingw's?

Thanks for your perspective.

K. Frank
From: James K Beard <beardjamesk@ve...> - 2011-04-30 13:36:33

My comments don't seem to be making it into the thread this week, possibly
because of a switch between my reply-to email address and another. So, if
this gets lost, perhaps nightstrike can intervene, or echo the
information. Perhaps not.

The sqrt() function can be implemented with excellent speed, to within a
bit of full mantissa accuracy, using iterations of Newton's method, as is
classically done in libraries. Not so with ln() or exp(), which are
classically done by extracting the exponent and using the reduced-range
mantissa in a modified Chebyshev approximation. Arithmetic coprocessors do
sqrt(), ln(), and exp() internally with 80-bit floating point, then store
the result in 32-, 64-, or 80-bit floating point format. So the
coprocessors can get full mantissa accuracy for 32-bit and 64-bit sqrt()
and pow(x, 0.5), and the problem is nonexistent for Intel/PPC and most
other common workstation processors. Yet we are looking at software
transcendental function libraries.

James K Beard

-----Original Message-----
From: Greg Chicares [mailto:gchicares@...]
Sent: Friday, April 29, 2011 7:38 PM
To: MinGW Users List
Cc: mingw64
Subject: Re: [Mingw-w64-public] [Mingw-users] Math library discrepancies that surprised me.

On 2011-04-29 19:55Z, K. Frank wrote:
>
> By the way, could someone confirm that mingw does use msvcrt for
> sqrt and pow?

http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/mingw/mingwex/math/?cvsroot=src

Looks to me like sqrt(double) calls into msvcrt. But pow() is a
different story. Danny implemented it in libmingwex because of
some quality-of-implementation problem with msvcrt; then people
complained that the libmingwex version was slower, and I can't
remember where it wound up, but it's all in the archives.

> sqrt (x) and pow (x, 0.5) ought to give the same result (even if not
> required to by IEEE 754).

If they should give exactly the same result always, then the
first thing that needs to be fixed is any standard that allows
them to differ.
The language standards are so much looser than the numerical standard,
even now, that I'd rather see C and C++ catch up to IEEE 754-1985 before
asking them to go beyond.

> Browsing some gcc lists, I did see some comments that suggest that
> gcc (on linux) does try to transform pow (x, 0.5) to sqrt (x). This
> would make sqrt and pow consistent (whether or not technically correct).

Not so many years ago (perhaps when msvcrt was written), it was thought
that correctly-rounded transcendental functions weren't feasible in
practice, so library authors had a looser attitude toward exactness. If
you have an ideal implementation of sqrt() but a pow() that's only
approximate, should you special-case the power of exactly 0.5? If you do,
then pow() and sqrt() will be consistent, but then the powers

    x^(0.5*(1-epsilon))
    x^(0.5)
    x^(0.5*(1+epsilon))

might not be monotone. Some would call that a bad tradeoff.
From: Kai Tietz <ktietz70@go...> - 2011-04-30 15:33:59

2011/4/30 K. Frank <kfrank29.c@...>:
> Hello Kai!
>
> On Sat, Apr 30, 2011 at 5:51 AM, Kai Tietz <ktietz70@...> wrote:
>> 2011/4/30 Kai Tietz <ktietz70@...>:
>>> 2011/4/30 Greg Chicares <gchicares@...>:
>>>> On 2011-04-29 19:55Z, K. Frank wrote:
>>>>>
>>>>> By the way, could someone confirm that mingw does use msvcrt for
>>>>> sqrt and pow?
>>>>
>>>> [...]
>>>> If you have an ideal implementation of sqrt()
>>>> but a pow() that's only approximate, should you special-case
>>>> the power of exactly 0.5? If you do, then pow() and sqrt() will
>>>> be consistent, but then the powers
>>>>     x^(0.5*(1-epsilon))
>>>>     x^(0.5)
>>>>     x^(0.5*(1+epsilon))
>>>> might not be monotone. Some would call that a bad tradeoff.
>>>
>>> Well, I spent some effort on the log, exp, and pow implementations. I
>>> also improved some other base math functions in mingw-w64's math
>>> library, in two respects: a) make those functions behave as the ISO
>>> C99 specification says in corner cases, and b) improve accuracy
>>> relative to gcc's internally used gmp. As we noticed here, there are
>>> result differences.
>>>
>>> The following example illustrates the issues pretty well:
>>>
>>> #include <stdio.h>
>>> #include <math.h>
>>>
>>> double abc[3][2] = { { 2.2, 3.1 }, { 2.0, 3.0 }, { 4.0, 0.5 } };
>>> long double ab[3][2] = { { 2.2L, 3.1L }, { 2.0L, 3.0L }, { 4.0L, 0.5L } };
>>> int main()
>>> {
>>>   double r[3];
>>>   long double rl[3];
>>>   int i;
>>>   for (i = 0; i < 3; i++)
>>>     __mingw_printf ("%.20F^%.20F = %.20F\n", abc[i][0], abc[i][1],
>>>                     (r[i] = pow (abc[i][0], abc[i][1])));
>>>   __mingw_printf ("%.20F %.20F %.20F\n",
>>>                   pow (2.2, 3.1), pow (2.0, 3.0), pow (4.0, 0.5));
>>>   r[0] = pow (2.2, 3.1);
>>>   r[1] = pow (2.0, 3.0);
>>>   r[2] = pow (4.0, 0.5);
>>>   __mingw_printf ("%.20F %.20F %.20F\n", r[0], r[1], r[2]);
>>>
>>>   for (i = 0; i < 3; i++)
>>>     __mingw_printf ("%.20LF^%.20LF = %.20LF\n", ab[i][0], ab[i][1],
>>>                     (rl[i] = powl (ab[i][0], ab[i][1])));
>>>   __mingw_printf ("%.20LF %.20LF %.20LF\n",
>>>                   powl (2.2L, 3.1L), powl (2.0L, 3.0L), powl (4.0L, 0.5L));
>>>   rl[0] = powl (2.2L, 3.1L);
>>>   rl[1] = powl (2.0L, 3.0L);
>>>   rl[2] = powl (4.0L, 0.5L);
>>>   __mingw_printf ("%.20LF %.20LF %.20LF\n", rl[0], rl[1], rl[2]);
>>>   return 1;
>>> }
>
> Do I understand your point correctly that you are comparing gcc's
> compile-time and run-time values for pow?
>
> For the record, when I run this on a mingw64 build:
>
> gcc (GCC) 4.5.2 20101002 (prerelease) [svn/rev.164902 - mingw-w64/oz]
>
> I get:
>
> C:\>gcc -o pow_test pow_test.c
>
> C:\>pow_test
> 2.20000000000000017764^3.10000000000000008882 = 11.52153412678571875460
> 2.00000000000000000000^3.00000000000000000000 = 8.00000000000000000000
> 4.00000000000000000000^0.50000000000000000000 = 2.00000000000000000000
> 11.52153412678571697825 8.00000000000000000000 2.00000000000000000000
> 0.00000000000000177636 0.00000000000000000000 0.00000000000000000000
> 2.20000000000000000004^3.09999999999999999991 = 11.52153412678571414718
> 2.00000000000000000000^3.00000000000000000000 = 8.00000000000000000000
> 4.00000000000000000000^0.50000000000000000000 = 2.00000000000000000000
> 11.52153412678571414718 8.00000000000000000000 2.00000000000000000000
> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>
> As I understand it, this means I am seeing a small difference between
> compile-time and run-time evaluation only for powl (2.2, 3.1), i.e.,
> the long-double case.
>
>>> The interesting issue here is that gcc uses the gmp library to get
>>> results for constant math calculations. For more complex variants, it
>>> uses the crt's math. By this it is important for a gcc-based runtime
>>> to be IEEE 754 floating point compatible with gmp's results.
>
> Yes, that's a good point. I hadn't considered the issue of compile-time
> evaluation, and, ideally, it should agree with run-time evaluation,
> adding yet another constraint to our wish list.
>
> Furthermore, if we take the idea of a cross-compiler seriously, we
> would also like the compile-time evaluations of a cross-compiler and a
> native compiler to agree.
>
> Hmm...
>
> So, we would like sqrt (x) and pow (x, 0.5) to agree. We would like
> compile-time and run-time evaluations to agree. We would like
> cross-compilers and native compilers to agree.
>
> Taken at face value, this would seem to preclude using the platform
> library and platform hardware to compute things like sqrt and pow
> (unless we know they comply with some agreed-upon standard,
> such as IEEE 754).
>
> Just to make the reasoning explicit: if I perform my run-time
> evaluation on platform A, e.g., using platform A's FPU, then a B->A
> cross-compiler running on platform B won't have platform A's FPU
> available to perform consistent compile-time evaluations.
>
> So you either give up some of the above-wished-for consistency, you
> forgo the potential performance benefits of using the platform's
> library and / or hardware, or you get everybody to use a common
> standard.
>
>>> ...
>>> I admit that it might not be that obvious to users why results here
>>> are different, but for a gcc-based toolchain we need to play nice
>>> with gcc's internal assumptions.
>>>
>>> Regards,
>>> Kai
>>
>> On closer look, for long double we can have rounding issues at long
>> double precision. I see that for powl (0.01L, 0.5L) we have a rounding
>> difference from sqrtl (0.01L). Interestingly, not for the float or
>> double types. So we might need to special-case y == 0.5 in pow to
>> solve this.
>
> Be, however, cognizant of Greg's concern. Forcing pow to agree with
> sqrt just for y = 0.5 has the potential to make pow, in a sense,
> internally inconsistent. It's a tricky trade-off. (Ideally, both pow
> and sqrt would be "exact" and therefore agree with one another
> automatically.)
>
> By the way, I also see a problem with run-time powl for powl (2.2L, 0.5L).
> It disagrees slightly with sqrtl (2.2L) and with the compile-time value
> for powl (2.2L, 0.5L).
>
> I modified your test program to compare sqrt with pow (x, 0.5) (and
> sqrtl with powl). For the record:
>
> C:\>gcc --version
> gcc (GCC) 4.5.2 20101002 (prerelease) [svn/rev.164902 - mingw-w64/oz]
>
> C:\>gcc -o sqrt_pow sqrt_pow.c
>
> C:\>sqrt_pow
> 1.04880884817015163080, 1.04880884817015163080, 0.00000000000000000000
> 1.48323969741913264109, 1.48323969741913264109, 0.00000000000000000000
> 1.81659021245849494619, 1.81659021245849494619, 0.00000000000000000000
> 1.04880884817015163080 1.04880884817015163080
> 1.48323969741913264109 1.48323969741913264109
> 1.81659021245849494619 1.81659021245849494619
> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
> 1.04880884817015154699, 1.04880884817015154699, 0.00000000000000000000
> 1.48323969741913258980, 1.48323969741913258970, 0.00000000000000000011
> 1.81659021245849499921, 1.81659021245849499921, 0.00000000000000000000
> 1.04880884817015154699 1.04880884817015154699
> 1.48323969741913258980 1.48323969741913258980
> 1.81659021245849499921 1.81659021245849499921
> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
> 0.00000000000000000000 0.00000000000000000011 0.00000000000000000000
>
> where sqrt_pow.c is:
>
> #include <stdio.h>
> #include <math.h>
>
> double abc[3] = { 1.1, 2.2, 3.3 };
> long double abcl[3] = { 1.1L, 2.2L, 3.3L };
> int main()
> {
>   double r1[3], r2[3], r3[3];
>   long double rl1[3], rl2[3], rl3[3];
>   int i;
>   for (i = 0; i < 3; i++)
>     __mingw_printf ("%.20F, %.20F, %.20F\n",
>                     (r1[i] = sqrt (abc[i])),
>                     (r2[i] = pow (abc[i], 0.5)),
>                     (r3[i] = pow (abc[i], 0.5) - sqrt (abc[i])));
>   __mingw_printf ("%.20F %.20F\n", sqrt (1.1), pow (1.1, 0.5));
>   __mingw_printf ("%.20F %.20F\n", sqrt (2.2), pow (2.2, 0.5));
>   __mingw_printf ("%.20F %.20F\n", sqrt (3.3), pow (3.3, 0.5));
>   r1[0] = sqrt (1.1);
>   r1[1] = sqrt (2.2);
>   r1[2] = sqrt (3.3);
>   __mingw_printf ("%.20F %.20F %.20F\n", r1[0], r1[1], r1[2]);
>   r2[0] = pow (1.1, 0.5);
>   r2[1] = pow (2.2, 0.5);
>   r2[2] = pow (3.3, 0.5);
>   __mingw_printf ("%.20F %.20F %.20F\n", r2[0], r2[1], r2[2]);
>
>   for (i = 0; i < 3; i++)
>     __mingw_printf ("%.20LF, %.20LF, %.20LF\n",
>                     (rl1[i] = sqrtl (abcl[i])),
>                     (rl2[i] = powl (abcl[i], 0.5L)),
>                     (rl3[i] = powl (abcl[i], 0.5L) - sqrtl (abcl[i])));
>   __mingw_printf ("%.20LF %.20LF\n", sqrtl (1.1L), powl (1.1L, 0.5L));
>   __mingw_printf ("%.20LF %.20LF\n", sqrtl (2.2L), powl (2.2L, 0.5L));
>   __mingw_printf ("%.20LF %.20LF\n", sqrtl (3.3L), powl (3.3L, 0.5L));
>   rl1[0] = sqrtl (1.1L);
>   rl1[1] = sqrtl (2.2L);
>   rl1[2] = sqrtl (3.3L);
>   __mingw_printf ("%.20LF %.20LF %.20LF\n", rl1[0], rl1[1], rl1[2]);
>   rl2[0] = powl (1.1L, 0.5L);
>   rl2[1] = powl (2.2L, 0.5L);
>   rl2[2] = powl (3.3L, 0.5L);
>   __mingw_printf ("%.20LF %.20LF %.20LF\n", rl2[0], rl2[1], rl2[2]);
>
>   return 1;
> }
>
>>
>> Kai
>
> Thanks, Kai, for your insight and explanations.
>
>
> K. Frank

I added a special case for y == 0.5 to mingw-w64's pow(f/l) functions and
now get, for your adjusted testcase, the following results, which look
much better IMHO:

$ ./tst.exe
1.04880884817015163080, 1.04880884817015163080, 0.00000000000000004142
1.48323969741913264109, 1.48323969741913264109, 0.00000000000000000857
1.81659021245849494619, 1.81659021245849494619, 0.00000000000000000412
1.04880884817015163080 1.04880884817015163080
1.48323969741913264109 1.48323969741913264109
1.81659021245849494619 1.81659021245849494619
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
1.04880884817015154699, 1.04880884817015154699, 0.00000000000000000000
1.48323969741913258980, 1.48323969741913258980, 0.00000000000000000000
1.81659021245849499921, 1.81659021245849499921, 0.00000000000000000000
1.04880884817015154699 1.04880884817015154699
1.48323969741913258980 1.48323969741913258980
1.81659021245849499921 1.81659021245849499921
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000

I used a 32-bit variant and a 64-bit variant for this, which made no
difference in the output.

The issue here is that the 1.0 branch still uses the inaccurate variant
from Danny, but nowadays trunk uses new implementations for the pow, sqrt,
log, and exp functions, which treat rounding and short-path calculation in
an ISO C fashion (as gcc internally expects of such math). The complex
math part is also completely rewritten on mingw-w64's trunk version.

IMHO it might be a good idea to replace the classical runtime functions
step by step, so that they do not use msvcrt in the first place. Of course
the import symbols of msvcrt need to remain in the .def exports (for
compatibility with VC-generated objects), but the actual functions should
come directly from mingwex.
From: Kai Tietz <ktietz70@go...> - 2011-04-30 15:48:06

2011/4/30 Kai Tietz <ktietz70@...>: > 2011/4/30 K. Frank <kfrank29.c@...>: >> Hello Kai! >> >> On Sat, Apr 30, 2011 at 5:51 AM, Kai Tietz <ktietz70@...> wrote: >>> 2011/4/30 Kai Tietz <ktietz70@...>: >>>> 2011/4/30 Greg Chicares <gchicares@...>: >>>>> On 20110429 19:55Z, K. Frank wrote: >>>>>> >>>>>> By the way, could someone confirm that mingw does use msvcrt for >>>>>> sqrt and pow? >>>>> >>>>> http://cygwin.com/cgibin/cvsweb.cgi/src/winsup/mingw/mingwex/math/?cvsroot=src >>>>> >>>>> Looks to me like sqrt(double) calls into msvcrt. But pow() is a >>>>> different story. Danny implemented it in libmingwex because of >>>>> some qualityofimplementation problem with msvcrt; then people >>>>> complained that the libmingwex version was slower, and I can't >>>>> remember where it wound up, but it's all in the archives. >>>>> >>>>>> sqrt (x) and pow (x, 0.5) ought to give the same result (even if not required >>>>>> to by IEEE754). >>>>> ... >>>>> Not so many years ago (perhaps when msvcrt was written), it was >>>>> thought that correctlyrounded transcendental functions weren't >>>>> feasible in practice, so library authors had a looser attitude >>>>> toward exactness. If you have an ideal implementation of sqrt() >>>>> but a pow() that's only approximate, should you specialcase >>>>> the power of exactly 0.5? If you do, then pow() and sqrt() will >>>>> be consistent, but then the powers >>>>> x^(0.5*(1epsilon)) >>>>> x^(0.5 ) >>>>> x^(0.5*(1+epsilon)) >>>>> might not be monotone. Some would call that a bad tradeoff. >>>> >>>> Well, I spent some efforts into log, exp, and pow implementation. Also >>>> some other basemath functions I improved in mingww64's math library >>>> in two terms. a) Make those functions behave as ISOC99 specification >>>> tells in cornercases. and b) Improve accuracy in terms of gcc's >>>> internal used gmp. As we noticed here result differences. 
>>>> >>>> The following example illustrates the issues pretty well: >>>> >>>> #include <stdio.h> >>>> #include <math.h> >>>> >>>> double abc[3][2] = { { 2.2, 3.1 }, { 2.0, 3.0 }, { 4.0, 0.5 } }; >>>> long double ab[3][2] = { { 2.2L, 3.1L }, { 2.0L, 3.0L }, { 4.0L, 0.5L } }; >>>> int main() >>>> { >>>> double r[3]; >>>> long double rl[3]; >>>> int i; >>>> for (i = 0; i < 3; i++) >>>> __mingw_printf ("%.20F^%.20F = %.20F\n", abc[i][0], abc[i][1], >>>> (r[i] = pow (abc[i][0], abc[i][1]))); >>>> __mingw_printf ("%.20F %.20F %.20F\n", pow (2.2, 3.1), pow (2.0, >>>> 3.0), pow (4.0, 0.5)); >>>> r[0] = pow (2.2, 3.1); >>>> r[1] = pow (2.0, 3.0); >>>> r[2] = pow (4.0, 0.5); >>>> __mingw_printf ("%.20F %.20F %.20F\n", r[0], r[1], r[2]); >>>> >>>> for (i = 0; i < 3; i++) >>>> __mingw_printf ("%.20LF^%.20LF = %.20LF\n", ab[i][0], ab[i][1], >>>> (rl[i] = powl (ab[i][0], ab[i][1]))); >>>> __mingw_printf ("%.20LF %.20LF %.20LF\n", >>>> powl (2.2L, 3.1L), powl (2.0L, 3.0L) , powl >>>> (4.0L, 0.5L)); >>>> rl[0] = powl (2.2L, 3.1L); >>>> rl[1] = powl (2.0L, 3.0L); >>>> rl[2] = powl (4.0L, 0.5L); >>>> __mingw_printf ("%.20LF %.20LF %.20LF\n", rl[0], rl[1], rl[2]); >>>> return 1; >>>> } >> >> Do I understand your point correctly that you are comparing gcc's >> compiletime and runtime values for pow? 
>> >> For the record, when I run this on a mingw64 build: >> >> gcc (GCC) 4.5.2 20101002 (prerelease) [svn/rev.164902  mingww64/oz] >> >> I get: >> >> C:\>gcc o pow_test pow_test.c >> >> C:\>pow_test >> 2.20000000000000017764^3.10000000000000008882 = 11.52153412678571875460 >> 2.00000000000000000000^3.00000000000000000000 = 8.00000000000000000000 >> 4.00000000000000000000^0.50000000000000000000 = 2.00000000000000000000 >> 11.52153412678571697825 8.00000000000000000000 2.00000000000000000000 >> 0.00000000000000177636 0.00000000000000000000 0.00000000000000000000 >> 2.20000000000000000004^3.09999999999999999991 = 11.52153412678571414718 >> 2.00000000000000000000^3.00000000000000000000 = 8.00000000000000000000 >> 4.00000000000000000000^0.50000000000000000000 = 2.00000000000000000000 >> 11.52153412678571414718 8.00000000000000000000 2.00000000000000000000 >> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000 >> >> As I understand it, this means I am seeing a small difference between >> compile and runtime evaluation only for powl (2.2, 3.1), i.e., the >> longdouble case. >> >>>> The interesting issue is here that gcc uses for constant >>>> mathcalculations gmplibrary to get result. For more complex >>>> variants, it uses crt's math. By this it is important for a gcc based >>>> runtime to be in the IEEE 754 floating point compatible to gmp's >>>> results. >> >> Yes, that's a good point. I hadn't considered the issue of compiletime >> evaluation, and, ideally, it should agree with runtime evaluation, adding >> yet another constraint to our wish list. >> >> Furthermore, if we take the idea of a crosscompiler seriously, we >> would also like the compiletime evaluations of a crosscompiler and >> native compiler to agree. >> >> Hmm... >> >> So, we would like sqrt (x) and pow (x, 0.5) to agree. We would like >> compiletime and runtime evaluations to agree. We would like >> crosscompilers and native compilers to agree. 
>> >> Taken at face value, this would seem to preclude using the platform >> library and platform hardware to compute things like sqrt and pow >> (unless we know they comply with some agreedupon standard, >> such as IEEE754). >> >> Just to make the reasoning explicit, if I perform my runtime evaluation >> on platform A, e.g., using platform A's FPU, then a B>A crosscompiler >> running on platform B won't have platform A's FPU available to perform >> consistent compiletime evaluations. >> >> So you either give up some of the abovewishedfor consistency, you >> forgo the potential performance benefits of using the platform's library >> and / or hardware, or you get everybody to use a common standard. >> >>>> ... >>>> I admit that this might be not that obvious to users, why results here >>>> are different, but for a gcc based toolchain we need to play nice with >>>> gcc's internal assumptions. >>>> >>>> Regards, >>>> Kai >>> >>> On closer look for long double, we can have rouding issues for long >>> double precission. I see that powl (0.01L, 0.5L) we have a rounding >>> difference to sqrtl (0.01L). Interestingly not for float or double >>> types. So we we might need to specialcase in pow the case for y == >>> 0.5 to solve this. >> >> Be, however, cognizant of Greg's concern. Forcing pow to agree with >> sqrt just for y = 0.5 has the potential to make pow, in a sense, internally >> inconsistent, It's a tricky tradeoff. (Ideally, both pow and sqrt would >> be "exact" and therefore agree with one another automatically.) >> >> By the way, I also see a problem with runtime powl for powl (2.2L, 0.5L). >> It disagrees slightly from sqrtl (2.2L) and from the compiletime value >> for powl (2.2L, 0.5L). >> >> I modified your test program to compare sqrt with pow (x, 0.5) (and sqrtl >> with powl). 
For the record:

>> C:\>gcc --version
>> gcc (GCC) 4.5.2 20101002 (prerelease) [svn/rev.164902 - mingw-w64/oz]
>>
>> C:\>gcc -o sqrt_pow sqrt_pow.c
>>
>> C:\>sqrt_pow
>> 1.04880884817015163080, 1.04880884817015163080, 0.00000000000000000000
>> 1.48323969741913264109, 1.48323969741913264109, 0.00000000000000000000
>> 1.81659021245849494619, 1.81659021245849494619, 0.00000000000000000000
>> 1.04880884817015163080 1.04880884817015163080
>> 1.48323969741913264109 1.48323969741913264109
>> 1.81659021245849494619 1.81659021245849494619
>> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>> 1.04880884817015154699, 1.04880884817015154699, 0.00000000000000000000
>> 1.48323969741913258980, 1.48323969741913258970, 0.00000000000000000011
>> 1.81659021245849499921, 1.81659021245849499921, 0.00000000000000000000
>> 1.04880884817015154699 1.04880884817015154699
>> 1.48323969741913258980 1.48323969741913258980
>> 1.81659021245849499921 1.81659021245849499921
>> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>> 0.00000000000000000000 0.00000000000000000011 0.00000000000000000000
>>
>> where sqrt_pow.c is:
>>
>> #include <stdio.h>
>> #include <math.h>
>>
>> double abc[3] = { 1.1, 2.2, 3.3 };
>> long double abcl[3] = { 1.1L, 2.2L, 3.3L };
>> int main()
>> {
>>   double r1[3], r2[3], r3[3];
>>   long double rl1[3], rl2[3], rl3[3];
>>   int i;
>>   for (i = 0; i < 3; i++)
>>     __mingw_printf ("%.20F, %.20F, %.20F\n",
>>                     (r1[i] = sqrt (abc[i])),
>>                     (r2[i] = pow (abc[i], 0.5)),
>>                     (r3[i] = pow (abc[i], 0.5) - sqrt (abc[i])));
>>   __mingw_printf ("%.20F %.20F\n", sqrt (1.1), pow (1.1, 0.5));
>>   __mingw_printf ("%.20F %.20F\n", sqrt (2.2), pow (2.2, 0.5));
>>   __mingw_printf ("%.20F %.20F\n", sqrt (3.3), pow (3.3, 0.5));
>>   r1[0] = sqrt (1.1);
>>   r1[1] = sqrt (2.2);
>>   r1[2] = sqrt (3.3);
>>   __mingw_printf ("%.20F %.20F %.20F\n", r1[0], r1[1], r1[2]);
>>   r2[0] = pow (1.1, 0.5);
>>   r2[1] = pow (2.2, 0.5);
>>   r2[2] = pow (3.3, 0.5);
>>   __mingw_printf ("%.20F %.20F %.20F\n", r2[0], r2[1], r2[2]);
>>
>>   for (i = 0; i < 3; i++)
>>     __mingw_printf ("%.20LF, %.20LF, %.20LF\n",
>>                     (rl1[i] = sqrtl (abcl[i])),
>>                     (rl2[i] = powl (abcl[i], 0.5L)),
>>                     (rl3[i] = powl (abcl[i], 0.5L) - sqrtl (abcl[i])));
>>   __mingw_printf ("%.20LF %.20LF\n", sqrtl (1.1L), powl (1.1L, 0.5L));
>>   __mingw_printf ("%.20LF %.20LF\n", sqrtl (2.2L), powl (2.2L, 0.5L));
>>   __mingw_printf ("%.20LF %.20LF\n", sqrtl (3.3L), powl (3.3L, 0.5L));
>>   rl1[0] = sqrtl (1.1L);
>>   rl1[1] = sqrtl (2.2L);
>>   rl1[2] = sqrtl (3.3L);
>>   __mingw_printf ("%.20LF %.20LF %.20LF\n", rl1[0], rl1[1], rl1[2]);
>>   rl2[0] = powl (1.1L, 0.5L);
>>   rl2[1] = powl (2.2L, 0.5L);
>>   rl2[2] = powl (3.3L, 0.5L);
>>   __mingw_printf ("%.20LF %.20LF %.20LF\n", rl2[0], rl2[1], rl2[2]);
>>
>>   return 1;
>> }
>>
>>> Kai
>>
>> Thanks, Kai, for your insight and explanations.
>>
>> K. Frank

> I added a special case for y = 0.5 to mingw-w64's pow(fl) function, and
> I now get the following results for your adjusted testcase, which look
> much better IMHO.
>
> $ ./tst.exe
> 1.04880884817015163080, 1.04880884817015163080, 0.00000000000000004142
> 1.48323969741913264109, 1.48323969741913264109, 0.00000000000000000857
> 1.81659021245849494619, 1.81659021245849494619, 0.00000000000000000412
> 1.04880884817015163080 1.04880884817015163080
> 1.48323969741913264109 1.48323969741913264109
> 1.81659021245849494619 1.81659021245849494619
> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
> 1.04880884817015154699, 1.04880884817015154699, 0.00000000000000000000
> 1.48323969741913258980, 1.48323969741913258980, 0.00000000000000000000
> 1.81659021245849499921, 1.81659021245849499921, 0.00000000000000000000
> 1.04880884817015154699 1.04880884817015154699
> 1.48323969741913258980 1.48323969741913258980
> 1.81659021245849499921 1.81659021245849499921
> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
> 0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
>
> I used a 32-bit variant and a 64-bit variant for this, which made no
> difference in the output.
>
> The issue here is that the 1.0 branch still uses the inaccurate variant
> from Danny, but today's trunk uses new implementations of the pow, sqrt,
> log, and exp functions, which treat rounding and short-path calculation
> in an ISO-C fashion (as gcc internally expects such math). The
> complex-math part has also been completely rewritten on mingw-w64's
> trunk.
>
> IMHO it might be a good idea to replace the classical runtime functions
> step by step, so that they don't use msvcrt in the first place. Of
> course the import symbols of msvcrt need to remain in the .def exports
> (for compatibility with VC-generated objects), but the actual functions
> should come directly from mingwex.
>
> PS: The differences shown in the 32-bit results aren't directly caused
> by inaccuracy of the math used.
Instead, they are caused by the fact that 32-bit code uses the FPU
registers (which are always 80-bit wide). Because of this, the inline
differences, computed without a store to memory, show small deltas for
double. By first storing the values and then printing the stored values,
you see a delta of zero for 32-bit as well. Interestingly, this behavior
differs on 64-bit x86_64, which always stores results in registers at
the declared precision. There you really get just a 64-bit result for
the double type, and not indirectly an 80-bit one (as in the first
case).

I altered your testcase a bit in the following lines

...
  for (i = 0; i < 3; i++)
    {
      r1[i] = sqrt (abc[i]);
      r2[i] = pow (abc[i], 0.5);
      r3[i] = pow (abc[i], 0.5) - sqrt (abc[i]);
      __mingw_printf ("%.20F, %.20F, %.20F\n", r1[i], r2[i], r3[i]);
    }
...

and now get

$ ./tst.exe
1.04880884817015163080, 1.04880884817015163080, 0.00000000000000000000
1.48323969741913264109, 1.48323969741913264109, 0.00000000000000000000
1.81659021245849494619, 1.81659021245849494619, 0.00000000000000000000
1.04880884817015163080 1.04880884817015163080
1.48323969741913264109 1.48323969741913264109
1.81659021245849494619 1.81659021245849494619
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
1.04880884817015154699, 1.04880884817015154699, 0.00000000000000000000
1.48323969741913258980, 1.48323969741913258980, 0.00000000000000000000
1.81659021245849499921, 1.81659021245849499921, 0.00000000000000000000
1.04880884817015154699 1.04880884817015154699
1.48323969741913258980 1.48323969741913258980
1.81659021245849499921 1.81659021245849499921
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000
0.00000000000000000000 0.00000000000000000000 0.00000000000000000000

as the result. It is an interesting fact that this happens.

Regards,
Kai
From: JonY <jon_y@us...>  20110501 14:38:07

From: K. Frank <kfrank29.c@gm...>  20110501 15:28:53

Hello Jon!

On Sun, May 1, 2011 at 10:17 AM, JonY <jon_y@...> wrote:
> On 5/1/2011 22:03, James K Beard wrote:
>> Now, about that math library problem...
>> ...
>> AFAIK, all currently produced FPUs are IEEE compliant. Are there some out
>> there that are still bucking the inevitable?
>
> -- Putting mingw lists back --
>
> I heard older AMDs (late 90s to 200Xs era) don't have 80-bit FPUs, but
> they do have 64-bit FPUs, so it wasn't strange that results differed
> slightly between CPUs. Maybe that has changed for newer CPUs.
>
> Does the long double epsilon 1.08420217248550443401e-19L accuracy
> absolutely require an 80-bit FPU?

If you're willing to do your floating-point math in software, then no.
But if you want 19-decimal-digit precision from your FPU, it pretty much
needs to be 80 bits. (You can always trade off exponent bits for
mantissa bits, but it's much better to stick with the IEEE choice.)

The 80-bit floating-point format has a 64-bit mantissa (and, I believe,
no implicit bit): log_base_10 (2^64) = 19.27, so 64 bits gets you a
little over 19 decimal digits of precision. To get 19 decimal digits you
therefore need 64 bits plus exponent plus sign, which gives you
something very similar to the 80-bit format, and it would be hard to
justify making a nonstandard choice.

> IMHO, we don't need to produce results more accurate than that, but
> it's a bonus if it doesn't impact performance too much.

Historically the approach (within a compiler / platform family) has been
to support whatever floating-point you support in software, and use
floating-point hardware when available. If you want to do heavy
floating-point computation, it's up to you to make sure you have the
necessary floating-point hardware, or eat the (significant) performance
hit.

Nowadays I would think that floating-point hardware is pretty much a
given on desktops and servers. Maybe this isn't true for some embedded
systems (that gcc targets); I just don't know.

Best.

K. Frank
From: James K Beard <beardjamesk@ve...>  20110501 15:17:54

I would hope that, in the interests of performance and maintainable
code, we simply allow rounding bits on any targets using a 64-bit FPU
that may still be in service somewhere. I would not expect such
processors to be doing work so critical that this would be a real
problem.

One thing that is real is the consequences of the alternative. If gcc
supports full mantissa accuracy for the entire FP library, this
requirement will generate a full software scientific function library
with extended precision that is loaded with every program compiled with
<math.h> (or the equivalent in other languages), i.e. every program that
uses floating-point exponentiation or a sine, cosine, or logarithm. That
amounts to bloat for 99.99% of users and is likely frivolous for the two
or three users who actually get different results. And this appendage
will likely have to be enthusiastically supported for a generation by
volunteers, whose time might be better spent fleshing out the C++
feature set to catch up with the standard, etc. Think of the Pareto
Principle. Climbing the curve too far past the knee...

James K Beard

-----Original Message-----
From: JonY [mailto:jon_y@...]
Sent: Sunday, May 01, 2011 10:18 AM
To: mingw-w64-public@...
Cc: MinGW Users List
Subject: Re: [Mingw-w64-public] [Mingw-users] Math library discrepancies
that surprised me.

On 5/1/2011 22:03, James K Beard wrote:
> Now, about that math library problem: it seems to me that asking for
> full mantissa accuracy and agreement for 32-bit and 64-bit floating
> point is reasonable. Differences between FPUs make full mantissa
> accuracy and agreement at 80 bits a bit of a question, and I wouldn't
> demand it, so that the FPU can be used, unless you are using your own
> algorithms for quad precision or another situation where the FPU
> doesn't provide the result.
>
> AFAIK, all currently produced FPUs are IEEE compliant. Are there some
> out there that are still bucking the inevitable?
-- Putting mingw lists back --

I heard older AMDs (late 90s to 200Xs era) don't have 80-bit FPUs, but
they do have 64-bit FPUs, so it wasn't strange that results differed
slightly between CPUs. Maybe that has changed for newer CPUs.

Does the long double epsilon 1.08420217248550443401e-19L accuracy
absolutely require an 80-bit FPU?

IMHO, we don't need to produce results more accurate than that, but it's
a bonus if it doesn't impact performance too much.
From: James K Beard <beardjamesk@ve...>  20110501 17:45:53

K. Frank:

You raise the very important point of embedded processors that may not
have an FPU. I venture that it's the responsibility of the designer to
use processors that have an FPU when the application requires one. The
middle ground is the problem: where some FP is used but software FP is
acceptable. And that is the main practical focus.

There are three software FP implementation scenarios that must be
addressed by gcc: (1) software FP for 32-bit and 64-bit results, (2)
quad precision, and (3) multiple precision.

What has historically been done as a solution to (1) above, from the
days when Intel and Motorola offered the FPU as an optional second chip,
was to have a software FP library that was loaded when the target did
not have an FPU; the FP instructions were trapped and emulated in
software. Whatever the details of the implementation, a separate FPU
emulator that provides IEEE-compliant 80-bit arithmetic, with loads and
stores for the IEEE integer and FP data types, strikes me as
appropriate.

Quad precision is supported by preservation of the carry bits in the FPU
for 80-bit operations, which allows stringing together more than one
80-bit FPU operation to build 128-bit and larger software FP. Without
the FPU and architecture-dependent exploitation of the preserved carry
bits (a lot of data for multiplication), you are stuck with the much
slower stringing-together-of-integers extended-precision floating point,
which is best left to those libraries.

Multiple precision is a separate library for user-specified mantissa
length and (usually) a 16-bit exponent. The von Neumann format of a
biased exponent with one sign bit for the whole FP word may or may not
be a good idea, and its principal advantage (monotonic increase in value
when the FP number is interpreted as an integer) is largely lost, so I
think that's best left to the implementer.
And, when an FPU is not present, this is where quad precision must live,
so why not put it in the gcc plan and documentation that way?

James K Beard

-----Original Message-----
From: K. Frank [mailto:kfrank29.c@...]
Sent: Sunday, May 01, 2011 11:29 AM
To: mingw64; MinGW Users List
Subject: Re: [Mingw-w64-public] [Mingw-users] Math library discrepancies
that surprised me.

[snip - K. Frank's reply of May 1, quoted in full above]
From: JonY <jon_y@us...>  20110503 01:24:31

From: Kai Tietz <ktietz70@go...>  20110503 09:12:00

2011/5/3 JonY <jon_y@...>:
> On 5/3/2011 04:05, Kai Tietz wrote:
>> 2011/5/2 Peter Rockett <p.rockett@...>:
>>> On 02/05/2011 17:00, Keith Marshall wrote:
>>>> On 02/05/11 16:30, Charles Wilson wrote:
>>>>>> So, we would like sqrt (x) and pow (x, 0.5) to agree.
>>>
>>> The point below about off-topic diversions into precision is valid,
>>> but can I reiterate: there is no good reason why x^(0.5) calculated
>>> two different ways should agree, since the two different routes will
>>> accumulate different rounding errors. Adding a special case to force
>>> agreement for x^n when n = 0.5 will introduce a discontinuity into
>>> the pow() function and is a very bad idea. The value returned by
>>> pow(x, 0.5) might be less accurate than sqrt(x), but that is part of
>>> the fundamental issue with using floating-point numbers: they are
>>> only ever approximations. You have to take that fact on board.
>>
>> Correct, the ways of calculating pow (x, 0.5) and sqrt (x) are
>> different. Nevertheless, it is an issue in gcc. If you write
>> pow (1.1, 0.5) or sqrt (1.1), which gcc optimizes via gmp to a
>> precalculated result, the values are identical. But using the math
>> library results in an (expectable) difference. So IMHO pow should
>> special-case here to provide the same result as sqrt for y = 0.5.
>
> I would agree with the others that it's a horrible hack to treat "0.5"
> specially; I don't remember reading anything requiring pow() results
> to agree with sqrt().
>
> Can we at least follow a standard spec like:
> <http://pubs.opengroup.org/onlinepubs/009695399/functions/pow.html>
>
> We can remove the special case, right?

Well, as some people here aren't able to use reply-all, yet have wise
suggestions about implementation details they don't satisfy themselves,
I am a bit annoyed. Anyway, I see that this special-casing is neither
explicitly demanded nor explicitly forbidden, so I see not much reason
for removing it.
It might be that I have missed a prohibition in the spec. The sqrt
special case has at least two advantages over the log2/exp2 variant:
first, it is faster, as just one FPU instruction is used to calculate
sqrt; secondly, it provides a better result, and one compatible with
gcc's internally used soft floating-point library. As I don't see
anything in the spec saying this special-casing shouldn't be done, I
think we will keep it on mingw-w64's side.

Well, it might be a good idea on mingw.org's side to recheck the exp2,
exp2f, exp2l, and expl implementations, as here you can easily prove
that the calculated results have a rounding issue, which leads, for
combined operations like exp2 (y * log2 (fabs (x))), to miscalculations
of over two eps. But well, this might be treated as typical
inaccuracy...

Regards,
Kai
From: Kai Tietz <ktietz70@go...>  20110503 10:58:19

2011/5/3 Peter Rockett <p.rockett@...>:
> On 03/05/2011 10:11, Kai Tietz wrote:
>> 2011/5/3 JonY <jon_y@...>:
>>> On 5/3/2011 04:05, Kai Tietz wrote:
>>>> [snip - exchange quoted in full in the previous messages]
>>>> So IMHO pow should special-case here to provide the same result as
>>>> sqrt for y = 0.5.
>>> I would agree with the others that it's a horrible hack to treat
>>> "0.5" specially ...
>>> We can remove the special case, right?
>> Anyway, I see that this special-casing is neither explicitly demanded
>> nor explicitly forbidden, so I see not much reason for removing it.
>
> The reason is not introducing a discontinuity into pow() around 0.5.
> Forcing the two functions to agree may seem like a good idea, but in
> reality you are creating another, much more insidious problem. You are
> embedding a potentially disastrous property into pow().
>
>> It might be that I have missed a prohibition in the spec. The sqrt
>> special case has at least two advantages over the log2/exp2 variant:
>> first, it is faster, as just one FPU instruction is used to calculate
>> sqrt; secondly, it provides a better result, and one compatible with
>> gcc's internally used soft floating-point library. As I don't see
>> anything in the spec saying this special-casing shouldn't be done, I
>> think we will keep it on mingw-w64's side.
>
> Sorry, but that's a really bad call IMO, for the reason stated above.
> You are ignoring the consequences to try to (mistakenly) enforce
> mathematical correctness. Even basic FP ops like addition and
> subtraction deviate from pure mathematical notions of equality. For
> example, 1 + a, where a is less than the machine epsilon, will return
> precisely unity! Floating-point numbers ain't reals, and you can't
> force them to behave exactly like reals...
>
> P.

I am aware of what you are telling me, but there is already
special-casing in the pow function for x^y with integer y in the range
INT_MIN <= y <= INT_MAX, as there the more precise calculation via powi
is used. This means the results of pow already have discontinuities, so
the pow function doesn't have a continuity requirement.
>> Well, it might be a good idea on mingw.org's side to recheck the
>> exp2, exp2f, exp2l, and expl implementations, as here you can easily
>> prove that the calculated results have a rounding issue, which leads,
>> for combined operations like exp2 (y * log2 (fabs (x))), to
>> miscalculations of over two eps. But well, this might be treated as
>> typical inaccuracy...
>>
>> Regards,
>> Kai

Regards,
Kai
From: K. Frank <kfrank29.c@gm...>  20110503 13:32:30

Hello Kai and Everybody!

A couple of comments about whether to use special cases for pow (x, y)
for things like y = 0.5 or y = integer...

On Tue, May 3, 2011 at 6:58 AM, Kai Tietz wrote:
> 2011/5/3 Peter Rockett <p.rockett@...>:
>> On 03/05/2011 10:11, Kai Tietz wrote:
>> ...
>> The reason is not introducing a discontinuity into pow() around 0.5.
>> Forcing the two functions to agree may seem like a good idea, but in
>> reality you are creating another, much more insidious problem. You
>> are embedding a potentially disastrous property into pow().
>>
>>> It might be that I have missed a prohibition in the spec. The sqrt
>>> special case has at least two advantages over the log2/exp2 variant:
>>> first, it is faster, as just one FPU instruction is used to
>>> calculate sqrt; secondly, it provides a better result, and one
>>> compatible with gcc's internally used soft floating-point library.
>>> As I don't see anything in the spec saying this special-casing
>>> shouldn't be done, I think we will keep it on mingw-w64's side.
>> ...
> I am aware of what you are telling me, but there is already
> special-casing in the pow function for x^y with integer y in the range
> INT_MIN <= y <= INT_MAX, as there the more precise calculation via
> powi is used. This means the results of pow already have
> discontinuities, so the pow function doesn't have a continuity
> requirement.
> ...

I'm not aware that either the C standard or the IEEE 754 standard
specifies one behavior or the other. (I think that they don't, but I
could be wrong.)

I believe it has been commonplace over the years to use these special
cases for pow. (This is just the sense I have from personal recollection
and oral tradition.) I think speed optimization has been the reason,
rather than improved accuracy or agreement with things like sqrt, but
perhaps it's been a mixture of those. Of course, just because there is
precedent doesn't mean it's the right way to go.
If I had to choose between pow (x, 0.5) agreeing with sqrt (x) and
pow (x, y) being monotone in y, I think I would choose the latter (i.e.,
choose not to use the special cases). But there are good arguments on
both sides.

One solution that would be attractive to me would be to implement pow
"exactly" (i.e., bankers' rounding to the nearest representable value),
and then use all the special cases you want as a speed optimization. The
special cases then won't change the value of pow (assuming sqrt is
properly implemented), so they won't introduce any non-monotonicity or
"discontinuities" into pow. What I don't know is how hard or expensive
this is. I'm pretty sure it's not too hard to calculate exp "exactly,"
but I think that calculating log is harder. Perhaps James Beard knows
whether it's practical to calculate pow "exactly," and maybe even how to
do it.

(Another approach, which I very much don't recommend, is to use pow to
calculate sqrt. You get reduced speed and reduced accuracy for sqrt, but
at least you do get sqrt and pow to agree without introducing
non-monotonicity into pow.)

Lastly, I have to believe that this is a well-understood and "solved"
problem in the numerical-analysis community (of which I am not a part).
Perhaps it would make sense to find out how the experts who have been
doing this for fifty-plus years deal with this issue.

> Regards,
> Kai

And thanks to the mingw and mingw-w64 teams who have invested the time
and effort to actually write the code we're so breezily discussing. My
comments are in no way meant to be a criticism of those efforts. I'm
just offering up one man's opinion from the user community in the hope
that it might be helpful.

Best regards.

K. Frank