From: Charles Wilson <cwilso11@us...>  2011-05-02 15:31:00

On 5/2/2011 2:34 AM, Keith Marshall wrote:
> On 01/05/11 14:36, K. Frank wrote:
>> [snip]
> My complaint is that, long after the last bit of precision has been
> interpreted, gdtoa (which is at the heart of printf()'s floating point
> output formatting) will continue to spew out extra decimal digits, based
> solely on the residual remainder from the preceding digit conversion,
> with an arbitrary number of extra zero bits appended.  (Thus, gdtoa
> makes the unjustified and technically invalid assumption that the known
> bit precision may be arbitrarily extended to ANY LENGTH AT ALL, simply
> by appending zero bits in place of the less significant unknowns).
>
>> I don't agree with this.
>
> Well, you are entitled to your own opinion; we may agree to disagree.
>
>> In most cases, it is not helpful to print out a long double to more
>> than twenty decimal places, but sometimes it is.  The point is that it
>> is not the case that floating-point numbers represent all real
>> numbers inexactly; rather, they represent only a subset of real
>> numbers exactly.

But the problem is, if I send you a floating point number that
represents the specific real number which I have in mind, exactly, YOU
don't know that.  All you have is a particular floating point number
that represents the range [value-ulps/2, value+ulps/2).  You have no
idea that I actually INTENDED to communicate EXACTLY "value" to you.

Ditto for the results of a long computation.  I get back as the output
some f.p. rep -- I don't *know* that the actual result of the
computation is exactly the value of that rep.  All I know is the result
is not representable with more accuracy by any OTHER f.p. rep with the
same precision.

>> If I happen to be representing a real number exactly
>> with a long double, I might wish to print it out with lots (more than
>> twenty) decimal digits.  Such a use case is admittedly rare, but not
>> illegitimate.

No, that's always illegitimate (i.e. misleading).
Imagine I wrote a scientific paper concerning an experiment with 17
trials, and my individual measurements had a precision of 3 sig. digits
(all of the same order of magnitude).  I can't say that the mean result
had 20 sig. digits simply because I can't represent the result of
dividing by 17 exactly using only 3 sig. digits.  It's not accurate to
extend the precision of the sum by appending zeros, simply so that I get
more digits of "apparent precision" after dividing by 17.  My paper
would be rejected -- and rightly so.

> This may be acceptable, provided you understand that those additional
> digits are of questionable accuracy.  When you attempt to claim an
> accuracy which simply isn't available, then I would consider that it is
> most definitely illegitimate.
>
>> Let's say that I have a floating-point number with ten binary digits, so
>> it gives about three decimal digits of precision (2^10 == 1024 ~= 10^3).
>> I can use such a number to represent 1 + 2^-10 exactly.
>
> Well, yes, you can if we allow you an implied eleventh bit, as most
> significant, normalised to 1; thus your mantissa bit pattern becomes:
>
>    10000000001B
>
>> I can print this number out exactly in decimal using ten digits after
>> the decimal point: 1.0009765625.  That's legitimate, and potentially a
>> good thing.

But 10000000001B does NOT mean "1 + 2^-10".  It means "with the limited
precision I have, I can't represent the actual value of real number R
more accurately with any other bit pattern than this one".

> Sorry, but I couldn't disagree more.  See, here you are falling into the
> gdtoa trap.  You have an effective bit precision of eleven, which gives
> you:
>
>    11 / log2(10) = 3.311 decimal digits (i.e. 3 full digits)
>
>> If I limit myself to three digits after the decimal point I get 1.001
>> (rounding up).
>
> You can't even claim that.  Once again, you are confusing decimal places
> and significant digits.
> You may claim AT MOST 3 significant digits, and
> those are 1.00; (significance begins at the leftmost nonzero digit
> overall, not just after the radix point).

See, here's the problem: 1.0009765625 means: I can distinguish between
the following three numbers:

   a) 1.0009765624
   b) 1.0009765625
   c) 1.0009765626

and the real number R is closer to (b) than to (a) or (c).

But with 10 bits, you CAN'T distinguish between those three numbers: the
same 10-bit pattern must be used to represent all three.  In fact, the
best you can do with 10 bits is distinguish between the following three
reps:

   (a) 0.999
   (b) 1.00
   (c) 1.01

(Note, because of the normalization shift between (a) and (b/c), the
accuracy *appears* to change in magnitude by a factor of 10, but that's
simply an artifact of the base-10 representation -- in base two the
normalization shift effect is only a factor of 2, not 10).  Once again,
the real number R is closer to (b) than to (a) or (c), and that's the
best you can do with 10 bits (3 significant decimal digits).

>> Sure, this is not a common use case, but I would prefer that the software
>> let me do this, and leave it up to me to know what I'm doing.
>
> I would prefer that software didn't try to claim the indefensible.  Your
> 1.0009765625 example represents 11 significant decimal digits of
> precision.  To represent that in binary, you need a minimum of:
>
>    11 * log2(10) = 36.54 bits
>
> (which we must round up to 37 bits).  While a mantissa of 10000000001B
> MAY equate exactly to your example value of 1 + 2^-10, it is guaranteed
> to be exact to 11 decimal digits of significance only if its normalised
> 37-bit representation is:
>
>    1000000000100000000000000000000000000B
>
> Since you have only 11 bits of guaranteed binary precision available,
> you are making a sweeping assumption about those extra 26 bits; (they
> must ALL be zero).
> If you know for certain that this is so, then okay,
> but since you don't have those bits available, you have no technically
> defensible basis, in the general case, for making such an assumption;
> your argument is flawed, and IMO technically invalid.

Agree.  But this whole discussion is rather beside the point, I think,
which started with a real discrepancy in the actual bit pattern produced
by sqrt(x) and pow(x, 0.5).  e.g.

> So, we would like sqrt (x) and pow (x, 0.5) to agree.  We would like
> compile-time and run-time evaluations to agree.  We would like
> cross-compilers and native compilers to agree.

This is a binary bit pattern issue, not a gdtoa base-10 conversion
issue.

--
Chuck