From: SourceForge.net <no...@so...> - 2010-11-29 13:47:53
Bugs item #3105247, was opened at 2010-11-08 07:56
Message generated for change (Comment added) made by kennykb
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3105247&group_id=10894

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: 48. Number Handling
Group: development: 8.6b1.1
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Christoph Bauer (fridolin)
Assigned to: Kevin B KENNY (kennykb)
Summary: -NaN?

Initial Comment:
An external function returns a Tcl_NewDouble-object with NaN. In Tcl this value is printed as -NaN. I think the minus '-' should be omitted.

----------------------------------------------------------------------

>Comment By: Kevin B KENNY (kennykb)
Date: 2010-11-29 08:47

Message:
Why do we have to change it? The existing behaviour is *not* new - it's been in the code (and test suite - in binary.test) since before 8.5.0. The tests in util.test, it turns out, are a duplication of the ones in binary.test. (They should be in both places - the one is checking that the result is formatted correctly, the other that [binary] doesn't error out returning it.)

The format used by the code for non-canonical (I won't say "non-conforming": IEEE754 provides for them) NaN's exactly follows that used by David Gay's 'dtoa.c' cited in the TIP as the source of the algorithms. (The code was rewritten because it was not up to Tcl's usual coding standard.) It isn't something that the Tcl team invented.

I don't understand the motivation for changing it now, years after release, to something that *is* a local invention, simply because the TIP was inadvertently silent on the treatment of the sign bit of a non-canonical NaN. (Which was an editorial error, I assure you: I wrote the original draft of the TIP and simply forgot to mention the sign bit.)

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-29 05:02

Message:
OK, thanks!

>it's probably unfixable until 9.0

Maybe, maybe not. Anything can be fixed when the TCT agrees about it.

The way I understand it, it is important for you (== kevin) that different internal representations lead to different string representations. How about changing the string representation of "-NaN" to "NaN(-ffffffffffffffff)"? Then it is immediately clear that we are dealing with a non-conforming NaN here. Advantage: the confusion that led to this issue is gone, and matching against it is easier (just "NaN*" suffices). It would be a potential incompatibility for people using negative NaN's, and the parser needs to be extended to handle an additional sign character. But it's a relatively small change. This could be an intermediate solution towards 9.0: if no one complains about it, we will know whether it is safe to remove the '-' sign altogether.

I'll see if I can come up with a patch, as a base for further discussion.

Again, Thanks!

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2010-11-28 18:11

Message:
My opinion is that the omission of the minus sign from the discussion in TIP 247 was an editorial error in the TIP. And these tests are actually changing nothing: in fact, the tests in binary.test (binary-63.* and binary-64.*) were already also testing handling of NaN, and at least one -NaN appears in those tests as well.

Nevertheless, since there's considerable controversy about this, I think it's probably unfixable until 9.0, because of course any editorial mistake in a TIP has to have a corresponding bug in the software if the mistake is not caught until general release. And this now can't be changed without introducing a script-level incompatibility.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-10 06:09

Message:
B.T.W. I am NOT suggesting to change the parser mentioned in TIP #249 (realizing that someone reading my previous comment might deduce that). TIP #132 mentions:

10. The number parser detailed in TIP #249 will be adopted into the Tcl internals. See TIP #249 for details on the implications.

So we all voted for that and it's accepted. My previous patch modified the tostring output, adding a "-" there eventually, but that doesn't match the TIP #249 description, so it is clearly wrong.

Here is a new patch, open for discussion! There are only 3 tests which make an assumption about the NaN string representation.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-10 04:46

Message:
OK, I'll explain why I think that Christoph is right.

The relevant TIP here is TIP #249: Unification of Tcl's Parsing of Numbers. The implementation came from the kennykb-numerics-branch, which came with Tcl 8.5:

2005-05-10 Kevin Kenny <ke...@ac...>
Merged all changes on kennykb-numerics-branch back into the HEAD. TIP's 132 and 232 are now Final.

This TIP #249 states:

8. The constants, "Inf", and "Infinity" (perhaps with a leading signum) are interpreted as infinities. Infinity is represented as tclDoubleType.

9. The constant "NaN" is the IEEE "Not a Number" value. It is specifically permitted in the parser so that binary format q NaN and similar calls can produce NaN on an external medium. The presence of NaN in expressions, or in Tcl_GetDoubleFromObj, signals an error. NaN is represented as tclDoubleType.

10. IEEE floating point does not have a single unique NaN value, so a NaN may be augmented by a parenthesized string of hexadecimal digits, which will be stored in its least significant bits.
It shall not be possible to construct signalling NaN by this route; only quiet NaN will be supported. NaN is represented as tclDoubleType.

Nothing wrong here, except that nowhere is it mentioned that NaN's, like Inf, may be negative. The only place from which it can be derived that NaN's may be negative is the state diagram, but I would argue that in this respect the state diagram does not match the text.

How does libc do it? Some C code:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        double d;
        /* Byte strings are little-endian images of 8-byte IEEE doubles. */
        memcpy(&d, "\000\000\000\000\000\000\370\177", 8); /* 7ff8000000000000 */
        printf("%g\n", d);
        memcpy(&d, "\000\000\000\000\000\000\370\377", 8); /* fff8000000000000 */
        printf("%g\n", d);
        memcpy(&d, "\001\000\000\000\000\000\370\177", 8); /* 7ff8000000000001 */
        printf("%g\n", d);
        memcpy(&d, "\001\000\000\000\000\000\370\377", 8); /* fff8000000000001 */
        printf("%g\n", d);
        return 0;
    }

On Windows XP (mingw), this prints:

    1.#QNAN
    -1.#QNAN
    1.#QNAN
    -1.#QNAN

On Linux (Ubuntu 10.4, AMD64):

    nan
    -nan
    nan
    -nan

In Java:

    public class NaNDemo {
        public static void main(String[] args) {
            Double d = Double.longBitsToDouble(0x7ff8000000000000L);
            System.out.println(d);
            d = Double.longBitsToDouble(0xfff8000000000000L);
            System.out.println(d);
            d = Double.longBitsToDouble(0x7ff8000000000001L);
            System.out.println(d);
            d = Double.longBitsToDouble(0xfff8000000000001L);
            System.out.println(d);
        }
    }

This prints:

    NaN
    NaN
    NaN
    NaN

So, what should we follow? The C way, which is non-portable (as in Tcl 8.4)? Or the Java way, which doesn't have a -NaN? Currently we use a mixture, which is not well described in TIP #249.

More opinions?

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-09 09:14

Message:
>JAN PLEASE DO NOT GO FIXING BUGS IN OTHER PEOPLE'S AREAS WITHOUT
>COMMUNICATING!!!

That's why I only added an implicit test-case, and am using this issue for communication. Looks like you are the right candidate for making a decision about this.

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2010-11-09 09:11

Message:
NOT SO! NaN and -NaN are intentionally printed separately.
The former is 7fffffffffffffff, the latter is ffffffffffffffff. We also print NaN(hexstring) for 7ff0000000000001 - 7ffffffffffffffe and -NaN(hexstring) for fff0000000000001 - fffffffffffffffe.

I don't believe this is a bug.

JAN PLEASE DO NOT GO FIXING BUGS IN OTHER PEOPLE'S AREAS WITHOUT COMMUNICATING!!!

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-09 08:50

Message:
After some experimenting, I added a testcase (binary-40.5) which reproduces this. I think that Christoph is right. Here is my proposed fix (moving the minus sign between brackets after the "NaN", and only displaying it for 'invalid' NaN's).

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2010-11-08 10:01

Message:
What bit pattern is at issue? (There are many 32-bit patterns within the "NaN" collection of values). What are the platform details?

----------------------------------------------------------------------
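
To make the sign-plus-hexstring idea discussed in this thread concrete, here is a minimal sketch in C of one way a NaN bit pattern could be rendered as [-]NaN[(hexstring)]. It is not Tcl's source and may not match Tcl's output for every pattern. It assumes the hexstring is the 51 mantissa bits below the quiet bit, so the four patterns from the libc experiment earlier in the thread would come out as NaN, -NaN, NaN(1) and -NaN(1). The names format_nan_bits and double_bits are hypothetical, chosen only for this illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy out the raw bits of a double without violating aliasing rules. */
static uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/*
 * Hypothetical helper (not Tcl's code): write "[-]NaN[(hex)]" for the
 * given IEEE 754 binary64 bit pattern into buf.
 * Returns 0 on success, -1 if the pattern is not a NaN or buf is too small.
 */
static int format_nan_bits(uint64_t bits, char *buf, size_t len)
{
    const uint64_t exponent = (bits >> 52) & 0x7ff;
    const uint64_t mantissa = bits & ((UINT64_C(1) << 52) - 1);
    /* Assumption: the payload is the 51 bits below the quiet bit. */
    const uint64_t payload = bits & ((UINT64_C(1) << 51) - 1);
    const char *sign = (bits >> 63) ? "-" : "";
    int n;

    if (exponent != 0x7ff || mantissa == 0) {
        return -1;              /* finite value or infinity, not a NaN */
    }
    if (payload == 0) {
        n = snprintf(buf, len, "%sNaN", sign);
    } else {
        n = snprintf(buf, len, "%sNaN(%" PRIx64 ")", sign, payload);
    }
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Under these assumptions, calling format_nan_bits(UINT64_C(0xfff8000000000001), buf, sizeof buf) leaves "-NaN(1)" in buf, and format_nan_bits(double_bits(1.0), buf, sizeof buf) returns -1 because 1.0 is not a NaN. The sketch ignores the quiet bit itself, so it does not distinguish quiet from signalling NaNs.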