From: <sisyphus1@op...>  20131208 13:15:56

Hi, For the double precision value '95e20', on mingw.org ports of gcc3.4.5 and 4.7.0, I find that double d = 95e20; and double d = strtod("95e20", NULL); assign different values to the variable "d". There are, of course, other values that present the same bug, though chances of hitting such values are not particularly high. Here's a demo script that shows up the problem wrt 95e20 and two other values (3263787126e10 and 1464e20) that also suffer the same bug: /*****************************/ #include <stdio.h> #include <stdlib.h> int main(void) { double d1 = 95e20, d2; d2 = strtod("95e20", NULL); if(d1 == d2) printf("strtod works as expected\n"); else printf("strtod seems to be buggy\n"); printf("%.15e %.15e\n", d1, d2); d1 = 3263787126e10; d2 = strtod("3263787126e10", NULL); if(d1 == d2) printf("strtod works as expected\n"); else printf("strtod seems to be buggy\n"); printf("%.15e %.15e\n", d1, d2); d1 = 1464e20; d2 = strtod("1464e20", NULL); if(d1 == d2) printf("strtod works as expected\n"); else printf("strtod seems to be buggy\n"); printf("%.15e %.15e\n", d1, d2); return 0; } /*****************************/ For me, that outputs: ############################## strtod seems to be buggy 9.500000000000001e+021 9.499999999999999e+021 strtod seems to be buggy 3.263787126000000e+019 3.263787126000000e+019 strtod seems to be buggy 1.464000000000000e+023 1.464000000000000e+023 ############################## (The mpfr library reveals that it's the strtod() function that's producing the incorrect value  and that assigning the values directly results in a *correct* value.) My questions: Has this been fixed in the current mingw.org port of gcc4.8.x ? If so, which package should I download (and from where) in order to get this current gcc4.8 ? Last time I installed MinGW, I grabbed mingwgetinst20120426.exe which, when executed, gave me everything I wanted, but I'm unsure as to the approved method of upgrading mingw/msys. Cheers, Rob 
From: Keith Marshall <keithmarshall@us...>  20131208 15:57:10

On 08/12/13 13:14, sisyphus1@... wrote: > For the double precision value '95e20', on mingw.org ports of gcc3.4.5 and > 4.7.0, I find that > > double d = 95e20; > and > double d = strtod("95e20", NULL); > > assign different values to the variable "d". Presumably as a consequence of different rounding criteria, between GCC itself, and *Microsoft's* strtod() implementation. > There are, of course, other values that present the same bug, though chances > of hitting such values are not particularly high. > > Here's a demo script that shows up the problem wrt 95e20 and two other > values (3263787126e10 and 1464e20) that also suffer the same bug: > > /*****************************/ > #include <stdio.h> > #include <stdlib.h> > > int main(void) { > double d1 = 95e20, d2; > > d2 = strtod("95e20", NULL); > > if(d1 == d2) printf("strtod works as expected\n"); > else printf("strtod seems to be buggy\n"); > > printf("%.15e %.15e\n", d1, d2); > > > d1 = 3263787126e10; > > d2 = strtod("3263787126e10", NULL); > > if(d1 == d2) printf("strtod works as expected\n"); > else printf("strtod seems to be buggy\n"); > > printf("%.15e %.15e\n", d1, d2); Sigh! Why is it, that so many programmers commit this elementary error? "%.15e" is asking for sixteen significant digits of decimal display precision, (one before, and fifteen after the radix point), from a 64bit binary representation which can provide, at most, fifteen guaranteed decimal digits, (it's close to, but not quite sixteen; you will get the extra one, if the most significant digit is less than nine). > d1 = 1464e20; > > d2 = strtod("1464e20", NULL); > > if(d1 == d2) printf("strtod works as expected\n"); This is an unsafe comparison; (testing for equality of any floating point entities is always unsafe, except in the specific case of both being explicitly zero). > else printf("strtod seems to be buggy\n"); > > printf("%.15e %.15e\n", d1, d2); > > return 0; > } > /*****************************/ > > For me, that outputs: > > ############################## > strtod seems to be buggy > 9.500000000000001e+021 9.499999999999999e+021 > strtod seems to be buggy > 3.263787126000000e+019 3.263787126000000e+019 > strtod seems to be buggy > 1.464000000000000e+023 1.464000000000000e+023 > ############################## > > (The mpfr library reveals that it's the strtod() function that's producing > the incorrect value  and that assigning the values directly results in a > *correct* value.) When you exceed the limit of available, representable precision, who is to say which is correct? In your first example, the most significant digit is nine, so you can count on only fifteen significant digits of precision; with proper rounding, both results are "correct". In the second and third cases, the most significant digits are less than nine, so we get an extra digit of representable precision, and we can't see a difference in the output to sixteen digits, (but there may be a one bit difference in the least significant bit of the internal binary representations, which would be sufficient to defeat the test for equality). > My questions: > Has this been fixed in the current mingw.org port of gcc4.8.x ? Doubtful, since you are pointing the finger at a Microsoft function, not at GCC, where, in reality, you should be questioning the validity of your own expectations. Floating point representations of numbers are inexact, and rounding errors will occur, (and will accumulate in calculations). If you really need to work to this level of precision, then you need to use a multiple precision floating point library; you simply cannot expect that to work reliably, in conjunction with normal floating point functions such as strtod(); neither can you rely on normal printf formatting, for display of extended precision results.  Regards, Keith. 
From: <sisyphus1@op...>  20131209 01:28:34

Original Message From: Keith Marshall Sent: Monday, December 09, 2013 2:57 AM To: mingwusers@... Subject: Re: [Mingwusers] bug in strtod() On 08/12/13 13:14, sisyphus1@... wrote: >> For the double precision value '95e20', on mingw.org ports of gcc3.4.5 >> and >> 4.7.0, I find that >> >> double d = 95e20; >> and >> double d = strtod("95e20", NULL); >> >> assign different values to the variable "d". > > Presumably as a consequence of different rounding criteria, between GCC > itself, and *Microsoft's* strtod() implementation. Yes, I did wonder about, but perhaps hadn't *properly* considered it. I discarded that possibility mainly because, on the same machine, the mingw64.sf compilers (I've tried 32bit 4.6.3 and 4.7.3) don't exhibit the bug. Is it just that they somehow avoid the Microsoft implementation ? I've tried both with and without "#define __USE_MINGW_ANSI_STDIO 1"  and it made no difference. And there's also (on the same machine) a Microsoft Platform SDK compiler, using msvcrt.dll runtime AFAIK. It, too, does not exhibit the bug. But then, when I build (same machine) using MSVC++ 7.0/msvcr70.dll, the same bug reappears. Not sure what to make of all that. If it *is* Microsoft's strtod() that's somehow causing the discrepancy, is there some way of working around that with mingw.org compilers ? Just to address some of the other issues raised by you guys: The discrepancy in the values is in the least significant bit  so it's not very likely to cause significant error in practice. And most people, I'm sure, would happily leave it at that. Personally, I would like to be able to use strtod() to assign the *correct* internal (binary) value to a double  not because I need that level of accuracy, but because I think computer programs should assign and use correct values. As I understand it, for any specific rounding rule (eg up, down, nearest), there can be *only* one correct internal representation of a base 10 double. Does any of that make any difference ? (Does any of that make any sense ?) Thanks for considering, guys. Cheers, Rob 
From: Manolo <manoloter@gm...>  20131209 11:24:40

> The discrepancy in the values is in the least significant bit  so it's not > very likely to cause significant error in practice. And most people, I'm > sure, would happily leave it at that. > Personally, I would like to be able to use strtod() to assign the *correct* > internal (binary) value to a double  not because I need that level of > accuracy, but because I think computer programs should assign and use > correct values. > As I understand it, for any specific rounding rule (eg up, down, nearest), > there can be *only* one correct internal representation of a base 10 double. Read http://www.validlab.com/goldberg/paper.pdf 
From: Peter Rockett <p.rockett@sh...>  20131209 12:35:16

On 09/12/13 11:24, Manolo wrote: >> The discrepancy in the values is in the least significant bit  so it's not >> very likely to cause significant error in practice. And most people, I'm >> sure, would happily leave it at that. >> Personally, I would like to be able to use strtod() to assign the *correct* >> internal (binary) value to a double  not because I need that level of >> accuracy, but because I think computer programs should assign and use >> correct values. >> As I understand it, for any specific rounding rule (eg up, down, nearest), >> there can be *only* one correct internal representation of a base 10 double. > Read http://www.validlab.com/goldberg/paper.pdf > I'm not sure that referring the OP to David Goldberg's seminal but notparticularlyaccessible paper (complete with mathematical proofs!) is going to be very helpful. A gentler introduction might be the first chapter of "Scientific Computing" by Michael T Heath. Keith has eloquently explained the issue with floatingpoint arithmetic but the OP seems still to be using the word "bug" for what is really a (predictable) "feature" of trying to represent a real number with a finitewidth base2 number. I suggest the OP makes the step from using "bug", as if this is all the compiler's fault. The more you know about floating point arithmetic, the more neurotic you should become about using it! P. 
From: Peter Rosin <peda@ly...>  20131209 13:50:40

On 20131209 02:27, sisyphus1@... wrote: > > > Original Message > From: Keith Marshall > Sent: Monday, December 09, 2013 2:57 AM > Subject: Re: [Mingwusers] bug in strtod() > > On 08/12/13 13:14, sisyphus1@... wrote: >>> For the double precision value '95e20', on mingw.org ports of gcc3.4.5 >>> and >>> 4.7.0, I find that >>> >>> double d = 95e20; >>> and >>> double d = strtod("95e20", NULL); >>> >>> assign different values to the variable "d". >> >> Presumably as a consequence of different rounding criteria, between GCC >> itself, and *Microsoft's* strtod() implementation. > > Yes, I did wonder about, but perhaps hadn't *properly* considered it. > I discarded that possibility mainly because, on the same machine, the > mingw64.sf compilers (I've tried 32bit 4.6.3 and 4.7.3) don't exhibit the > bug. Is it just that they somehow avoid the Microsoft implementation ? I've > tried both with and without "#define __USE_MINGW_ANSI_STDIO 1"  and it made > no difference. > > And there's also (on the same machine) a Microsoft Platform SDK compiler, > using msvcrt.dll runtime AFAIK. It, too, does not exhibit the bug. > > But then, when I build (same machine) using MSVC++ 7.0/msvcr70.dll, the same > bug reappears. > > Not sure what to make of all that. > If it *is* Microsoft's strtod() that's somehow causing the discrepancy, is > there some way of working around that with mingw.org compilers ? > > Just to address some of the other issues raised by you guys: > > The discrepancy in the values is in the least significant bit  so it's not > very likely to cause significant error in practice. And most people, I'm > sure, would happily leave it at that. > Personally, I would like to be able to use strtod() to assign the *correct* > internal (binary) value to a double  not because I need that level of > accuracy, but because I think computer programs should assign and use > correct values. > As I understand it, for any specific rounding rule (eg up, down, nearest), > there can be *only* one correct internal representation of a base 10 double. > > Does any of that make any difference ? (Does any of that make any sense ?) 95e20 = 95 * 10^20 = 95 * 5^20 * 2^20 = 9059906005859375 * 2^20 = 0b100000001011111110111110111111001011010111110000101111 * 2^20 (using 0bXXX to denote binary numbers) So, we want: mantissa: 100000001011111110111110111111001011010111110000101111 exponent: 20 or, normalized: mantissa: 1.00000001011111110111110111111001011010111110000101111 exponent: 73 But, the mantissa has one bit too many for a 64bit double, so we should get: mantissa: 1.0000000101111111011111011111100101101011111000010111 exponent: 73 or mantissa: 1.0000000101111111011111011111100101101011111000011000 exponent: 73 depending on what rounding rules are in place. There are a couple of different rounding rules in the IEEE standard, two result in the former answer (towards zero and towards inf) and three result in the latter (towards nearest both with ties to even and ties away from zero, and towards +inf). Relying on GCC and the MS runtime to agree on rounding rules in every possible combination seems fragile, no? I don't know if the rounding rule is standardized in the relevant contexts. I don't know if strtod is required to round correctly down to the last bit. Cheers, Peter 
From: Keith Marshall <keithmarshall@us...>  20131209 15:32:57

On 09/12/13 13:50, Peter Rosin wrote: > Relying on GCC and the MS runtime to agree on rounding > rules in every possible combination seems fragile, no? > > I don't know if the rounding rule is standardized in the relevant > contexts. I don't know if strtod is required to round correctly > down to the last bit. AIUI, it is not; I don't even think that IEEE 754 requires an input routine to choose the best approximation  exact representations are comparatively rare. The best we can hope for is +/ 1 variation in the least significant bit of the mantissa representation. Such variations are not a bug, and OP should desist in his effort to classify them as such.  Regards, Keith. 
From: Peter Rosin <peda@ly...>  20131210 09:16:26

On 20131209 16:32, Keith Marshall wrote: > On 09/12/13 13:50, Peter Rosin wrote: >> Relying on GCC and the MS runtime to agree on rounding >> rules in every possible combination seems fragile, no? >> >> I don't know if the rounding rule is standardized in the relevant >> contexts. I don't know if strtod is required to round correctly >> down to the last bit. > > AIUI, it is not; I don't even think that IEEE 754 requires an input > routine to choose the best approximation  exact representations are > comparatively rare. The best we can hope for is +/ 1 variation in the > least significant bit of the mantissa representation. Such variations > are not a bug, and OP should desist in his effort to classify them as such. I completely agree that you should expect to be disappointed if you expect anything better than that. However, are you sure it is not a bug? Consider this quote from [1]: If the subject sequence has the decimal form and at most DECIMAL_DIG (defined in <float.h>) significant digits, the result should be correctly rounded. (DECIMAL_DIG has to be 10 or higher, it is probably 15) The "subject sequence" in the given example is 95e20, and I see only two significant decimal digits, i.e. 95. That's nowhere near 15, or even 10 for that matter. So, this looks suspiciously like buggy behaviour to me, assuming everything is set to round to nearest (which seems plausible). Cheers, Peter [1] http://pubs.opengroup.org/onlinepubs/009695299/functions/strtod.html 
From: <sisyphus1@op...>  20131210 10:44:10

Original Message From: Peter Rosin > If the subject sequence has the decimal form and at most > DECIMAL_DIG (defined in <float.h>) significant digits, > the result should be correctly rounded. > > (DECIMAL_DIG has to be 10 or higher, it is probably 15) I expected 15, too  but I've just checked and it's 21. Nevertheless, in searching for the discrepancies, I've been limiting myself to a (max) 15decimal mantissa  and searching the (not quite entire) exponent range of 300 to 300. And it seems that the discrepancies occur only in a fairly limited range. For all of these discrepancies (which, incidentally, are all integer values), when I rewrite them in normalised format of d.ddddddddddddddEp, the precision p is in the range 16 to 23. I still need to run a program to check that aspect more exactly  it may be that there's an occasional value just outside that range  but I've not seen any precision anywhere near as high as 30, nor anywhere near as low as 10. Cheers, Rob 
From: <sisyphus1@op...>  20131211 03:17:42

Original Message From: Keith Marshall > Such variations > are not a bug, and OP should desist in his effort to classify them as > such. Have no fear  it is my solemn intention to *never* again use that word (or any derivative of it) on this mailing list. As of yesterday, I shall endeavour to refer only to "discrepancies", despite the extra typing involved. I was going to debate further the actual status of these discrepancies, but I'm tired of it ... as, too, are (presumably) others. I take it that, for whatever reason, there's no interest here in doing anything about these offbyone ulp discrepancies. That's ok ... it's not often that it matters to me (in practice), and on those rare occasions where it *does* matter, I can either use a different compiler or resort to the mpfr library. Both of those options involve a degree of inconvenience to me, but not to an unacceptable degree  and, as I said, most of the time it's not something with which I'll have to contend. FWIW (and this is something I'm posting for my own records as much as anything else), I ran a program for a few hours that checked hundreds of thousands of fp values (maximum 15 random decimal digit mantissa, exponents in the range 300 to 300 for each mantissa) and kept track of the minimum and maximum values for which a discrepancy existed. At the point that I closed the program, it had established a range (that had not altered for quite some time) of: 3.60687602659667e+016 to 3.11296e+026 Not every value in that range is subject to the discrepancy  most are not. There were, however, 2 (and only 2) values outside of that range  and by a long way at that; one of which is not an integer value: 9.928207e119 and 8.1661e+157 Cheers, Rob 
From: Peter Rosin <peda@ly...>  20131211 08:27:51

On 20131211 04:16, sisyphus1@... wrote: > > > Original Message > From: Keith Marshall > >> Such variations >> are not a bug, and OP should desist in his effort to classify them as >> such. > > Have no fear  it is my solemn intention to *never* again use that word (or > any derivative of it) on this mailing list. > As of yesterday, I shall endeavour to refer only to "discrepancies", despite > the extra typing involved. > > I was going to debate further the actual status of these discrepancies, but > I'm tired of it ... as, too, are (presumably) others. > > I take it that, for whatever reason, there's no interest here in doing > anything about these offbyone ulp discrepancies. > That's ok ... it's not often that it matters to me (in practice), and on > those rare occasions where it *does* matter, I can either use a different > compiler or resort to the mpfr library. > Both of those options involve a degree of inconvenience to me, but not to an > unacceptable degree  and, as I said, most of the time it's not something > with which I'll have to contend. Looking at [1] indicates that GCC has seen fixes in this area in that last couple of weeks. The other affected component is the Microsoft runtime. While I agree that changes in behaviour between GCC versions indicate that GCC is not perfect, it only matters if the latest GCC is affected, since getting someone to ship a bugfixed version of some old GCC over this "discrepancy" is presumably hard. That bugzilla entry also indicates that GCC uses mpfr, but also that mpfr might not be bugfree either. But hey, bugfree, what's that? Maybe mpfr problems is what's going on with your outliers? It is also not clear from the description if the bugs you have found are in the MS strtod, or if they are in GCC. I assume that the MS strtod is the likely culprit, but in that case you should also file a bug report with Microsoft. I of course don't know what they'll respond, but it is not certain that MS considers POSIX all that important and simply marks it wontfix... MinGW can do two things: A. Try to influence GCC so that it rounds correctly. B. Provide i better strtod in mingwrt (I think that should be possible). A is probably best dealt with by filing GCC bug reports for specific cases you encounter. For B, I'm sure patches are thoughtfully considered. Cheers, Peter [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21718 
From: <sisyphus1@op...>  20131212 01:16:23

Original Message From: Peter Rosin >> I take it that, for whatever reason, there's no interest here in doing >> anything about these offbyone ulp discrepancies. > > Looking at [1] indicates that GCC has seen fixes in this area in that last > couple of weeks. I think [1] relates to a problem in the way that literals are evaluated  strtod() is not involved in that one AIUI. There are, however, other reports where strtod() is offbyone  such as: http://sourceware.org/bugzilla/show_bug.cgi?id=3479 And then there's http://www.exploringbinary.com/incorrectlyroundedconversionsingccandglibc/ which contains this example: [quote] 0.500000000000000166533453693773481063544750213623046875, which equals 2^1 + 2^53 + 2^54, converts to this binary number: 0.100000000000000000000000000000000000000000000000000011 It has 54 significant bits, with bits 53 and 54 equal to 1. The correctly rounded value — in binary scientific notation — is 1.000000000000000000000000000000000000000000000000001 x 2^1 gcc and glibc strtod() both compute the converted value as 1.0000000000000000000000000000000000000000000000000001 x 2^1 which is one ULP below the correctly rounded value. [end quote] Interestingly, this one is similar to 95e20 in that it can correctly be represented in 54 bits, and the problem is in the rounding to 53 bits. I don't know if mingw is affected by any of these  they all seem to involve a mantissa of more that 15 significant decimal digits. (One wonders how they might be greeted if presented to *this* list.) > The other affected component is the Microsoft runtime. > While I agree that changes in behaviour between GCC versions indicate that > GCC is not perfect, it only matters if the latest GCC is affected, since > getting someone to ship a bugfixed version of some old GCC over this > "discrepancy" is presumably hard. That bugzilla entry also indicates that > GCC uses mpfr, but also that mpfr might not be bugfree either. But hey, > bugfree, what's that? Maybe mpfr problems is what's going on with your > outliers? I'm not checking *just* against mpfr. I first find a value that assigns differently depending upon whether it is assigned directly as a literal, or assigned via strtod(). Then I call upon mpfr to identify which is correct and which is wrong. Even if mpfr identifies incorrectly, there's still that initial discrepancy  both renditions cannot be correct. You've already verified for me that, wrt 95e20, mpfr is correct and the result I get when I use the mingw.org's port of strtod() is wrong. > It is also not clear from the description if the bugs you have found are > in the MS strtod, or if they are in GCC. How might I determine that ? > I assume that the MS strtod is > the likely culprit, but in that case you should also file a bug report > with Microsoft. Absolutely correct  if the problem is with MS, then file a report there. But the thing I don't understand is how it can be a problem with MS yet not affect the mingw64 compilers that I have running on this same machine. > I of course don't know what they'll respond, but it is > not certain that MS considers POSIX all that important and simply marks > it wontfix... > > MinGW can do two things: > A. Try to influence GCC so that it rounds correctly. > B. Provide i better strtod in mingwrt (I think that should be possible). > > A is probably best dealt with by filing GCC bug reports for specific > cases you encounter. Do you think that's worth doing in this particular instance ? If so, I don't mind raising this issue there. However, my feeling is that, unless I can show it to be a "gcc" issue, they're just going to tell me to take it up with the vendor .... back to square 1. And I can't find *any* version of gcc other than mingw's ports that exhibits these discrepancies. (I'm currently using their port of gcc4.7.0; the same dicrepancies appear on their port of gcc3.4.5. In addition to mingw64 ports of gcc, I also have a ubuntu12.04 and debian wheezy at hand  not one discrepancy with any of those other gcc compilers.) > For B, I'm sure patches are thoughtfully considered. Yes, there's no comeback to that :) At one of the links I provided earlier: http://www.exploringbinary.com/howstrtodworksandsometimesdoesnt/ I find: [quote] strtod() is written in C and lives in a file called dtoa.c. (This file also contains the dtoa() function — double to ASCII string.) In that file are flags that allow strtod() to be configured. ............................ (There are many more flags; see the code for details.) [end quote] If I'm not up against an MS runtime issue, then I think that's probably the place to start looking. Where do I find the version of this file that mingw.org uses ? Thanks again for your thoughts on this, Peter. In any case, I'll probably put this aside for a while  feeling quite deflated at the moment. Cheers, Rob 
From: Keith Marshall <keithmarshall@us...>  20131212 16:50:02

On 12/12/13 01:15, sisyphus1@... wrote: > Interestingly, this one is similar to 95e20 in that it can correctly be > represented in 54 bits, and the problem is in the rounding to 53 bits. > I don't know if mingw is affected by any of these  they all seem to involve > a mantissa of more that 15 significant decimal digits. (One wonders how they > might be greeted if presented to *this* list.) The 64bit IEEE 754 format provides: 1 sign bit; 11 exponent bits; 53 mantissa bits; (52 stored, 1 implied). The maximum number of significant decimal digits which that 53bit mantissa can *reliably* represent is given by: 53 log10( 2 ) = 15.95 While this is close to 16, you cannot really claim a fraction of a decimal digit, so only 15 are actually available; (what the fraction means is that, over part of the range of representable values, there may be 16 significant digits available, but over the entire range, you can claim only 15). The example which you cite: > And then there's > http://www.exploringbinary.com/incorrectlyroundedconversionsingccandglibc/ > > which contains this example: > [quote] > 0.500000000000000166533453693773481063544750213623046875, > which equals 2^1 + 2^53 + 2^54, converts to this binary number: > 0.100000000000000000000000000000000000000000000000000011 > > It has 54 significant bits, with bits 53 and 54 equal to 1. The correctly > rounded value — in binary scientific notation — is > 1.000000000000000000000000000000000000000000000000001 x 2^1 > > gcc and glibc strtod() both compute the converted value as > 1.0000000000000000000000000000000000000000000000000001 x 2^1 > which is one ULP below the correctly rounded value. > [end quote] offers a sample number with 54 significant decimal digits; this requires a minimum of: 54 / log10( 2 ) = 179.38 binary digits (which we *must* round away from zero, to 180 bits), to represent it exactly. To claim that it can be represented in fewer, (e.g. the 54 suggested), is mathematically bogus; such a claim relies on a grossly unsafe assumption  that you know precisely what the state of the 126 unrepresented bits must be. Obviously, this is sheer balderdash. Clearly, on input you may continue to generate mantissa bits beyond the 53 which you can fit in the IEEE 754 64bit representation, and then round to 53. If you generate only 54, (as your example suggests), then if the 54th bit is a zero, rounding entails simply discarding it. OTOH, if it is a one, (again as in your example), you may round either away from zero, or toward zero, and either could be regarded as round to nearest, as both are equally near, (if you actually believe the accuracy of you data to this extreme precision, in the first place). To break such a tie, mathematicians normally advocate rounding away from zero, (which is reasonable, since there is likely to be another one bit somewhere in the following unrepresented bits, which would tend to bias the rounding toward the next representable value away from zero. However, there are some, (such as Donald Knuth), who advocate rounding the tie to the nearest *even* representable value, on the grounds that, statistically, around 50% of values will round toward zero, while the remaining 50% round away from zero, and the resulting rounding errors will tend to cancel, in calculations. Depending on which of these rounding rules a particular implementation adopts, (round to nearest, with ties away from zero, or round to nearest even value), you may see differences in the least significant bit; to say that one method is definitively wrong, and the other correct, is something of a stretch. The reality is that you are arguing about a miniscule difference in an arena where approximations prevail, and for most uses, both approximations are equally good. I'll say it one last time: if you really need precision down to this level, then standard IEEE 754 64bit representations simply aren't good enough. You need to use a multiple precision math library, together with its own sanctioned I/O methods. I'm not prepared to further debate the niceties of MS vs. GCC interpretation, at this insignificant level of precision.  Regards, Keith. 
From: Peter Rockett <p.rockett@sh...>  20131213 10:39:39

On 12/12/13 16:49, Keith Marshall wrote: > On 12/12/13 01:15, sisyphus1@... wrote: >> Interestingly, this one is similar to 95e20 in that it can correctly be >> represented in 54 bits, and the problem is in the rounding to 53 bits. >> I don't know if mingw is affected by any of these  they all seem to involve >> a mantissa of more that 15 significant decimal digits. (One wonders how they >> might be greeted if presented to *this* list.) > The 64bit IEEE 754 format provides: > > 1 sign bit; > 11 exponent bits; > 53 mantissa bits; (52 stored, 1 implied). > > The maximum number of significant decimal digits which that 53bit > mantissa can *reliably* represent is given by: > > 53 log10( 2 ) = 15.95 > > While this is close to 16, you cannot really claim a fraction of a > decimal digit, so only 15 are actually available; (what the fraction > means is that, over part of the range of representable values, there may > be 16 significant digits available, but over the entire range, you can > claim only 15). > > The example which you cite: > >> And then there's >> http://www.exploringbinary.com/incorrectlyroundedconversionsingccandglibc/ >> >> which contains this example: >> [quote] >> 0.500000000000000166533453693773481063544750213623046875, >> which equals 2^1 + 2^53 + 2^54, converts to this binary number: >> 0.100000000000000000000000000000000000000000000000000011 >> >> It has 54 significant bits, with bits 53 and 54 equal to 1. The correctly >> rounded value — in binary scientific notation — is >> 1.000000000000000000000000000000000000000000000000001 x 2^1 >> >> gcc and glibc strtod() both compute the converted value as >> 1.0000000000000000000000000000000000000000000000000001 x 2^1 >> which is one ULP below the correctly rounded value. >> [end quote] > offers a sample number with 54 significant decimal digits; this requires > a minimum of: > > 54 / log10( 2 ) = 179.38 binary digits > > (which we *must* round away from zero, to 180 bits), to represent it > exactly. To claim that it can be represented in fewer, (e.g. the 54 > suggested), is mathematically bogus; such a claim relies on a grossly > unsafe assumption  that you know precisely what the state of the 126 > unrepresented bits must be. Obviously, this is sheer balderdash. > > Clearly, on input you may continue to generate mantissa bits beyond the > 53 which you can fit in the IEEE 754 64bit representation, and then > round to 53. If you generate only 54, (as your example suggests), then > if the 54th bit is a zero, rounding entails simply discarding it. OTOH, > if it is a one, (again as in your example), you may round either away > from zero, or toward zero, and either could be regarded as round to > nearest, as both are equally near, (if you actually believe the accuracy > of you data to this extreme precision, in the first place). To break > such a tie, mathematicians normally advocate rounding away from zero, > (which is reasonable, since there is likely to be another one bit > somewhere in the following unrepresented bits, which would tend to bias > the rounding toward the next representable value away from zero. > However, there are some, (such as Donald Knuth), who advocate rounding > the tie to the nearest *even* representable value, on the grounds that, > statistically, around 50% of values will round toward zero, while the > remaining 50% round away from zero, and the resulting rounding errors > will tend to cancel, in calculations. FWIW: Section 4.3.3 of IEEE754 mandates the use of rounding ties to even for binary representations as the default. (Sadly, I have copy of this august document!) > Depending on which of these rounding rules a particular implementation > adopts, (round to nearest, with ties away from zero, or round to nearest > even value), you may see differences in the least significant bit; to > say that one method is definitively wrong, and the other correct, is > something of a stretch. Alternatively, it may be that the OP is somehow accessing two different implementations of strtod in his tests. Depending on exactly how they are written may determine how/where the rounding errors build up. I recall a number of years ago, a colleague trying to calculate the singular value decomposition (SVD) of some truculent matrices. (SVD is a complicated algorithm, for those who don't know it!) Only one SVD implementation, written by G. W. (Pete) Stewart [http://www.cs.umd.edu/~stewart/] gave consistently sensible results. My explanation is that, because he is a master of computational linear algebra, Stewart's implementation kept the rounding errors as good as they could be, whereas other implementers with a lesser grasp failed to do this . So we might be taking about seemingly minor differences in what/how intermediate results are calculated in the strtod implementations. > The reality is that you are arguing about a > miniscule difference in an arena where approximations prevail, and for > most uses, both approximations are equally good. So true! If you do any serious floatingpoint calculations and endup with errors in the 1 ulp range you should be punching the air with hysterical delight! P. ...snip 
From: <sisyphus1@op...>  20131215 14:14:45

Original Message From: Peter Rockett > So true! If you do any serious floatingpoint calculations and endup > with errors in the 1 ulp range you should be punching the air with > hysterical delight! You'd love perl then  it has 1ulp discrepancies all over the place ;) As an added bonus, you can get quite a few 2ulp discrepancies  and I think I've seen the odd 3ulp discrepancy there. This was what sent me down the path of seeing what strtod (which perl does *not* use in assigning its floatingpoint values) was reporting in the first place. Cheers, Rob 
From: <sisyphus1@op...>  20131210 03:35:54

Original Message From: Peter Rosin Sent: Tuesday, December 10, 2013 12:50 AM To: MinGW Users List Subject: Re: [Mingwusers] bug in strtod() > On 20131209 02:27, sisyphus1@... wrote: > > > > > > Original Message > > From: Keith Marshall > >> Presumably as a consequence of different rounding criteria, between GCC > >> itself, and *Microsoft's* strtod() implementation. > > > > Yes, I did wonder about, but perhaps hadn't *properly* considered it. > > I discarded that possibility mainly because, on the same machine, the > > mingw64.sf compilers (I've tried 32bit 4.6.3 and 4.7.3) don't exhibit > > the > > bug. Is it just that they somehow avoid the Microsoft implementation ? > > I've > > tried both with and without "#define __USE_MINGW_ANSI_STDIO 1"  and it > > made > > no difference. > > > > And there's also (on the same machine) a Microsoft Platform SDK > > compiler, > > using msvcrt.dll runtime AFAIK. It, too, does not exhibit the bug. > > > > But then, when I build (same machine) using MSVC++ 7.0/msvcr70.dll, the > > same > > bug reappears. > > > > Not sure what to make of all that. > > If it *is* Microsoft's strtod() that's somehow causing the discrepancy, > > is > > there some way of working around that with mingw.org compilers ? > > > > Just to address some of the other issues raised by you guys: > > > > The discrepancy in the values is in the least significant bit  so it's > > not > > very likely to cause significant error in practice. And most people, I'm > > sure, would happily leave it at that. > > Personally, I would like to be able to use strtod() to assign the > > *correct* > > internal (binary) value to a double  not because I need that level of > > accuracy, but because I think computer programs should assign and use > > correct values. > > As I understand it, for any specific rounding rule (eg up, down, > > nearest), > > there can be *only* one correct internal representation of a base 10 > > double. > > > > Does any of that make any difference ? (Does any of that make any sense > > ?) > > 95e20 = 95 * 10^20 = 95 * 5^20 * 2^20 = 9059906005859375 * 2^20 > = 0b100000001011111110111110111111001011010111110000101111 * 2^20 > > (using 0bXXX to denote binary numbers) > > So, we want: > > mantissa: 100000001011111110111110111111001011010111110000101111 > exponent: 20 > > or, normalized: > > mantissa: 1.00000001011111110111110111111001011010111110000101111 > exponent: 73 > > > But, the mantissa has one bit too many for a 64bit double, so we should > get: > > mantissa: 1.0000000101111111011111011111100101101011111000010111 > exponent: 73 > > or > > mantissa: 1.0000000101111111011111011111100101101011111000011000 > exponent: 73 > > depending on what rounding rules are in place. There are a couple of > different rounding rules in the IEEE standard, two result in the former > answer (towards zero and towards inf) Yes, that's the value assigned by strtod()  but other results show that "rounding to nearest" is the mode in use, and it shouldn't switch from one mode to another. > and three result in the latter Yes  this is the value that gets assigned when I assign the literal 95e20 directly to the variable. > (towards nearest both with ties to even and ties away from zero, and > towards +inf). Relying on GCC and the MS runtime to agree on rounding > rules in every possible combination seems fragile, no? I don't really see that fragility is inevitable. AFAICT, they're both using round to nearest  and any discrepancy (whether it arises from fragility or not) is a discrepancy that should not be taken for granted as "expected and correct". Admittedly, my understanding of the relationship between the compiler and the runtime is not up to scratch. I certainly don't understand how the runtime can be blamed when different compilers (using that very same runtime on the very same machine) produce different results. As I mentioned previously, these discrepancies that arise when using the mingw.org compiler or MSVC++ 7.0, don't arise when I use a mingw64 compiler or Platform SDK (Microsoft (R) C/C++ Optimizing Compiler Version 14.00.40310.41). > I don't know if the rounding rule is standardized in the relevant > contexts. I don't know if strtod is required to round correctly > down to the last bit. Jesus ... I never even gave that a thought !!! How awful would *that* be ... thankfully, I can't google up anything that supports that notion, but I have found posts that imply that strtod *is* required to round correctly, right down to the last bit. Here's some reading for anyone interested : http://www.exploringbinary.com/howstrtodworksandsometimesdoesnt/ http://pubs.opengroup.org/onlinepubs/009695299/functions/strtod.html http://www.exploringbinary.com/incorrectlyroundedconversionsingccandglibc/ I'll expand on these links in a reply (later) to Keith's last response. Maybe there's something in there (or elsewhere on the web), that I need to read as regards the way that strtod() is supposed to work. Thank you for the constructive and considered reply, Peter. I appreciate it. Cheers, Rob 
From: Eli Zaretskii <eliz@gn...>  20131208 16:46:35

> From: <sisyphus1@...> > Date: Mon, 9 Dec 2013 00:14:59 +1100 > > For the double precision value '95e20', on mingw.org ports of gcc3.4.5 and > 4.7.0, I find that > > double d = 95e20; > and > double d = strtod("95e20", NULL); > > assign different values to the variable "d". GCC might use something other than strtod to convert double literals. E.g., latest versions of GCC are linked against MPFR, right? > int main(void) { > double d1 = 95e20, d2; > > d2 = strtod("95e20", NULL); > > if(d1 == d2) printf("strtod works as expected\n"); > else printf("strtod seems to be buggy\n"); As Keith points out, you shouldn't compare floatingpoint numbers for equality. It is much better to display their difference, you might be surprised how insignificant it can be. In general, any difference that is less that machine epsilon relative to the value of either d1 or d2 is the usual roundup you should expect in FP calculations. > ############################## > strtod seems to be buggy > 9.500000000000001e+021 9.499999999999999e+021 > strtod seems to be buggy > 3.263787126000000e+019 3.263787126000000e+019 > strtod seems to be buggy > 1.464000000000000e+023 1.464000000000000e+023 > ############################## I see nothing unexpected in this output. > (The mpfr library reveals that it's the strtod() function that's producing > the incorrect value  and that assigning the values directly results in a > *correct* value.) A floatingpoint value is almost always "incorrect", because machine representation has finite precision and uses base2 mantissa. > My questions: > Has this been fixed in the current mingw.org port of gcc4.8.x ? Are you asking about GCC or about strtod? If the latter, that function comes from the MS library, not from GCC folks. Again, look at the difference, and then decide whether you really care. 
From: Peter J. Farley III <pjfarley3@ea...>  20131208 21:06:54

> Original Message > From: sisyphus1@... > [mailto:sisyphus1@...] > Sent: Sunday, December 08, 2013 8:15 AM > To: mingwusers@... > Subject: [Mingwusers] bug in strtod() > > Hi, > > For the double precision value '95e20', on mingw.org ports of gcc3.4.5 > and 4.7.0, I find that > > double d = 95e20; > and > double d = strtod("95e20", NULL); > > assign different values to the variable "d". > There are, of course, other values that present the same bug, though > chances of hitting such values are not particularly high. <Snipped> > My questions: > Has this been fixed in the current mingw.org port of gcc4.8.x ? If you want exact decimal precision, use the decimal floating point data types (decimal32, decimal64, decimal128) and the corresponding decimal functions. Classic C doubles and long doubles are BINARY representations of decimal numbers, and will always be subject to inexactness and rounding errors. That said, I do not know if MinGW GCC 4.8.x supports decimal floating point hardware on current Intel CPU's or just the GCC software library implementation. Perhaps the MinGW GCC maintainer can answer that question. If you are interested, information on the GCC IEEE decimal library routines can be seen here: http://gcc.gnu.org/onlinedocs/gccint/Decimalfloatlibraryroutines.ht ml Peter  