[Tcl-bugs] [ tcl-Bugs-3105247 ] -NaN?

Bugs item #3105247, was opened at 2010-11-08 07:56
Message generated for change (Comment added) made by kennykb
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3105247&group_id=10894

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: 48. Number Handling
Group: development: 8.6b1.1
>Status: Open
Resolution: Invalid
Priority: 5
Private: No
Submitted By: Christoph Bauer (fridolin)
Assigned to: Kevin B KENNY (kennykb)
Summary: -NaN?

Initial Comment:
An external function returns a Tcl_NewDoubleObj object holding NaN. In Tcl, this value is printed as -NaN.

I think the minus '-' should be omitted.

----------------------------------------------------------------------

>Comment By: Kevin B KENNY (kennykb)
Date: 2010-12-15 12:35

Message:
That gets really awkward really fast. Even loading a signalling NaN into a
floating point register to pass it as a parameter or result value of a
function causes a process crash on some platforms, so adopting a
'signalling NaN-safe' pipeline would be a lot of work (make sure all
doubles that could be signalling NaNs are exchanged by pointers to the
memory that stores them and operated on only by integer operations, at
least until the possibility of a signal has been eliminated).

Masking the exception on x86 causes the NaN silently to be converted to a
quiet one when it's loaded, so passing a signalling NaN to Tcl_PrintDouble
will get one of two results: a crash if the exception is unmasked, or a
quiet NaN if it is masked.

It seems to me that sNaN handling would yield comparatively little benefit
for considerable development and maintenance cost.
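
As a rough sketch of the "integer operations only" approach described above: the pattern is inspected as a 64-bit integer and never loaded into a floating-point register. The names classify_bits and classify_stored_double are hypothetical, not Tcl APIs, and the code assumes IEEE 754 binary64 with bit 51 as the quiet bit (as on x86).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum nan_kind { NOT_NAN, QUIET_NAN, SIGNALLING_NAN };

/* Classify a 64-bit IEEE 754 pattern using integer operations only. */
static enum nan_kind classify_bits(uint64_t bits)
{
    const uint64_t exp_mask  = UINT64_C(0x7ff0000000000000);
    const uint64_t frac_mask = UINT64_C(0x000fffffffffffff);
    const uint64_t quiet_bit = UINT64_C(0x0008000000000000); /* bit 51 (assumed) */

    if ((bits & exp_mask) != exp_mask || (bits & frac_mask) == 0) {
        return NOT_NAN;               /* finite value or infinity */
    }
    return (bits & quiet_bit) ? QUIET_NAN : SIGNALLING_NAN;
}

/* Read the pattern straight out of the memory that stores a double,
 * so the value is passed by pointer rather than by value. */
static enum nan_kind classify_stored_double(const void *p)
{
    uint64_t bits;
    assert(sizeof(double) == sizeof bits);
    memcpy(&bits, p, sizeof bits);
    return classify_bits(bits);
}
```

A caller would hand classify_stored_double the address of the stored double and only load the value as a double once the result is not SIGNALLING_NAN.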

----------------------------------------------------------------------

Comment By: SourceForge Robot (sf-robot)
Date: 2010-12-15 12:20

Message:
This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-12-09 04:02

Message:
So, Tcl is broken already?  I repeat my last question:

Should Tcl instead start to represent signalling NaN's as "sNaN"
(as allowed, but not required, by IEEE 754)?

I don't think this threatens the EIAS principle, as
there is only one NaN (or ...  two: signalling and
quiet), as clearly indicated by IEEE 754. -NaN and
NaN can be considered two different string
representations of the same thing, just as 0
and 0x0 are, just as +NaN is the same as NaN.

----------------------------------------------------------------------

Comment By: Alexandre Ferrieux (ferrieux)
Date: 2010-12-09 03:54

Message:
I admit I've not followed this closely, but  IMHO the answer to "Is the
round-trip with shimmering important?" is:
 Yes !!! Any breakage of the EIAS principle, however tiny, threatens the
whole castle.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-12-09 03:08

Message:
dkf wrote:
>we wouldn't care except we want to keep the general interconvertibility of
>doubles with strings so that we can pass them back out again even in the
>face of shimmering (e.g., if some nitwit uses [llength] on them, it
>shouldn't break the round trip rules).

Just found out that that currently is not the case: Tcl
does not have signaling NaN's, they are silently
converted to quiet NaN's. This is even mentioned in
the TIP. Here is a test case showing that:

test binary-63.5 {NaN roundtrip} ieeeFloatingPoint {
    binary scan [binary format w 0x7ff0000000000001] q d
    binary scan [binary format q $d] w w
    llength $d; #force shimmering
    binary scan [binary format q $d] w r
    list $d [format 0x%016lx $w] [format 0x%016lx $r]
} {NaN(1) 0x7ff0000000000001 0x7ff8000000000001}

So, in a normal roundtrip, the given bit pattern is converted
to NaN(1), and back with the same result. When shimmering,
bit 51 is magically set, changing the signaling NaN to a
quiet NaN.
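
The bit arithmetic behind that test result can be sketched in C. This is only an illustration: the helper names are made up, and it assumes that the printed payload is the fraction below bit 51, which is consistent with the NaN(1) in the result above but is not spelled out in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define NAN_EXP_MASK   UINT64_C(0x7ff0000000000000)
#define NAN_FRAC_MASK  UINT64_C(0x000fffffffffffff)
#define NAN_QUIET_BIT  UINT64_C(0x0008000000000000)   /* bit 51 */

/* Assumed payload: the fraction bits below the quiet bit (bits 0..50). */
static uint64_t nan_payload(uint64_t bits)
{
    return bits & (NAN_QUIET_BIT - 1);
}

/* Pattern after quieting; non-NaN patterns pass through unchanged. */
static uint64_t quieted_bits(uint64_t bits)
{
    int is_nan = (bits & NAN_EXP_MASK) == NAN_EXP_MASK
              && (bits & NAN_FRAC_MASK) != 0;
    return is_nan ? (bits | NAN_QUIET_BIT) : bits;
}

static void check_binary_63_5_patterns(void)
{
    const uint64_t before = UINT64_C(0x7ff0000000000001);
    const uint64_t after  = UINT64_C(0x7ff8000000000001);

    assert(quieted_bits(before) == after);              /* bit 51 gets set */
    assert(nan_payload(before) == nan_payload(after));  /* same payload: 1 */
}
```

Under that assumption both patterns carry the same payload, so a string that records only the payload cannot tell them apart once the value has shimmered.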

This opens up (in my opinion) the possibility of handling the sign bit
the same way as the signaling bit, provided we decide that the sign bit
has no meaning, just like the signaling bit.

I don't think this is in any way in conflict with
TIP #249, nor with IEEE 754, but I would like
to hear other opinions about this. Is the round-trip
with shimmering important? Should Tcl instead
start to represent signalling NaN's as "sNaN"
(as allowed, but not required, by IEEE 754)?

Here is my last patch (I hope) still open for discussion.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-12-02 06:02

Message:
>Setting status to "pending - invalid"

Well, I have already spent too much time on this, and made some other
people do the same.
Sorry about that. Here is my conclusion, for the interested.

For more information, see:
<http://www.validlab.com/754R/drafts/archive/2006-10-04.pdf>

After reading this (and the wiki about it), I come to the same
conclusion:

	>A NaN may also carry a payload, intended for diagnostic information
	>indicating the source of the NaN. The sign of a NaN has no meaning, but
	>it may be predictable in some circumstances.
	...
	>Conversion of a quiet NaN in internal format to an external character
	>sequence shall produce a language defined one of “nan” or a sequence that
	>is equivalent except for case (e.g., “NaN”), with an optional preceding
	>sign.

	>Languages should provide an optional conversion of NaNs in internal
	>format to external character sequences that appends to the basic NaN
	>character sequences a suffix that can represent the NaN payload (see
	>8.2). The form and interpretation of the payload suffix is language
	>defined. The language should require that any such optional output
	>sequences be recognized as input in conversion of external character
	>sequences to internal formats.

So, there is only one NaN; the extra bits (payload) and the sign are
optional. External software should not generate a negative sign, since it
has no meaning; it should have used the payload for that. But the parser is
required to accept an optional payload as valid NaN! This means the Tcl
numeric parser conforms to IEEE 754. Changing that would interpret the sign
as part of the payload, which is not a good idea.
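
To make the accepted input form concrete, here is a minimal recognizer sketch for strings such as "NaN", "-nan" or "NaN(1f)". It is not the Tcl parser; the struct and function names are invented for illustration, and payload overflow beyond 64 bits is simply ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

struct nan_text {
    int negative;       /* a leading '-' was present */
    int has_payload;    /* a parenthesized hex suffix was present */
    uint64_t payload;   /* value of the hex digits, if any */
};

/* Return 1 if s is [+-]nan[(hexdigits)] in any letter case, else 0. */
static int parse_nan_text(const char *s, struct nan_text *out)
{
    struct nan_text r = {0, 0, 0};

    if (*s == '+' || *s == '-') {
        r.negative = (*s == '-');
        s++;
    }
    if (tolower((unsigned char)s[0]) != 'n' ||
        tolower((unsigned char)s[1]) != 'a' ||
        tolower((unsigned char)s[2]) != 'n') {
        return 0;
    }
    s += 3;
    if (*s == '(') {
        s++;
        if (!isxdigit((unsigned char)*s)) {
            return 0;                       /* "NaN()" is rejected */
        }
        while (isxdigit((unsigned char)*s)) {
            int c = tolower((unsigned char)*s++);
            r.payload = (r.payload << 4)
                      | (uint64_t)(c <= '9' ? c - '0' : c - 'a' + 10);
        }
        if (*s++ != ')') {
            return 0;
        }
        r.has_payload = 1;
    }
    if (*s != '\0') {
        return 0;                           /* trailing garbage */
    }
    *out = r;
    return 1;
}

static void check_parse_nan_text(void)
{
    struct nan_text t;
    assert(parse_nan_text("-NaN(ff)", &t) && t.negative && t.payload == 0xff);
    assert(!parse_nan_text("NaN(", &t));
}
```

The sign and the payload are reported separately here, so the sign is never folded into the payload.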

Regards,
       Jan Nijtmans

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2010-12-01 11:48

Message:
Setting status to "pending - invalid" - if nobody comments before the bug
is closed automatically, I'll remove the 'controversial' condition from the
proposed test cases.

----------------------------------------------------------------------

Comment By: Jeffrey Hobbs (hobbs)
Date: 2010-11-29 12:26

Message:
The value of changing the existing coded behavior isn't clear, as it
clearly works to separate 2 binary reps.  What value would there be to the
OP in changing what that rep would be?  Go from -NaN to NaN(-ff..)?

----------------------------------------------------------------------

Comment By: Donal K. Fellows (dkf)
Date: 2010-11-29 11:37

Message:
IMO, there's two cases to consider:
  * externally-sourced NaNs
  * internally-sourced NaNs

In the former case, they should survive a trip in (via [binary scan]) and
out again (via [binary format]) without any change to their bit pattern,
and while in Tcl should be understood to work like a NaN by being not
numerically-equal to anything else; like that, if we get one from outside
because of a blunder elsewhere, we can just pass it on correctly.

In the latter case, we should endeavor to ensure that they are trapped as
errors at as early an opportunity as possible; this is indeed the case with
NaN values that pass through [expr]. Outside expressions, they're just
strings.

Overall, that means neither Tcl_NewDoubleObj nor Tcl_GetDoubleFromObj has
to not choke on them but Tcl_ExprObj must not like them, which is current
behavior. The actual string format of NaN-values isn't nearly so important;
we wouldn't care except we want to keep the general interconvertibility of
doubles with strings so that we can pass them back out again even in the
face of shimmering (e.g., if some nitwit uses [llength] on them, it
shouldn't break the round trip rules).

Given all the above, I don't care too much about changing things in this
area and I trust kbk to get things right. I particularly trust him to deal
with corrections to the TIP, since I believe that considerations w.r.t. the
sign of NaN were not foremost in our minds when the decision was taken!

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2010-11-29 08:50

Message:
Oh, also:  "matching against it is easier"...  I usually use

if {$x != $x} { ... }

as the test for NaN.  Simpler than any string match and it doesn't
shimmer.
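
The same self-comparison works at the C level. A small sketch follows (the function name is made up); note that aggressive floating-point optimization flags such as -ffast-math may optimize the comparison away.

```c
#include <assert.h>
#include <math.h>

/* NaN is the only double that compares unequal to itself. */
static int is_nan_value(double x)
{
    return x != x;
}

static void check_is_nan_value(void)
{
    assert(is_nan_value(NAN));
    assert(is_nan_value(-NAN));      /* the sign makes no difference */
    assert(!is_nan_value(INFINITY));
    assert(!is_nan_value(0.0));
}
```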

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2010-11-29 08:47

Message:
Why do we have to change it?

The existing behaviour is *not* new - it's been in the code (and test
suite - in binary.test) since before 8.5.0. The tests in util.test, it
turns out, are a duplication of the ones in binary.test (They should be in
both places - the one is checking that the result is formatted correctly,
the other that [binary] doesn't error out returning it.) The format used by
the code for non-canonical (I won't say "non-conforming": IEEE754 provides
for them) NaN's exactly follows that used by David Gay's 'dtoa.c' cited in
the TIP as the source of the algorithms. (The code was rewritten because it
was not up to Tcl's usual coding standard.) It isn't something that the Tcl
team invented.

I don't understand the motivation for changing it now, years after
release, to something that *is* a local invention, simply because the TIP
was inadvertently silent on the treatment of the sign bit of a
non-canonical NaN. (Which was an editorial error, I assure you: I wrote the
original draft of the TIP and simply forgot to mention the sign bit.)

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-29 05:02

Message:
OK, thanks!

>it's probably unfixable until 9.0
Maybe, maybe not. Anything can be fixed when the TCT agrees about
it. The way I understand it is that it is important for you (== kevin),
that different internal representations lead to different string
representations.

How about changing the string representation of "-NaN" to
"NaN(-ffffffffffffffff)"? Then it is immediately clear that we are
dealing with a non-conforming NaN here. Advantage: the
confusion that led to this issue is gone, and matching against
it is easier (just "NaN*" suffices). It would be a potential
incompatibility for people using negative NaN's, and the
parser needs to be extended to handle an additional sign
character. But it's a relatively small change.

This could be an intermediate solution towards 9.0:
if no one complains about it, we will know that it is
safe to remove the '-' sign altogether.

I'll see if I can come up with a patch, as a base
for further discussion.

Again, Thanks!

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2010-11-28 18:11

Message:
My opinion is that the omission of the minus sign from the discussion in
TIP #249 was an editorial error in the TIP.

And these tests are actually changing nothing: in fact, the tests in
binary.test (binary-63.* and binary-64.*) were already also testing
handling of NaN, and at least one -NaN appears in those tests as well.

Nevertheless, since there's considerable controversy about this, I think
it's probably unfixable until 9.0, because of course any editorial mistake
in a TIP has to have a corresponding bug in the software if the mistake is
not caught until general release. And this now can't be changed without
introducing a script-level incompatibility.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-10 06:09

Message:
B.T.W. I am NOT suggesting to change the parser mentioned in
TIP #249 (realizing that someone reading my previous comment
might deduce that). TIP #132 mentions:

  10. The number parser detailed in TIP #249 will be adopted into the Tcl
      internals. See TIP #249 for details on the implications.

So we all voted for that and it's accepted.

My previous patch modified the tostring output, adding a "-" there
eventually, but that doesn't match the TIP #249 description, so it is
clearly wrong.

Here is a new patch, open for discussion!

There are only 3 tests which make an assumption about the NaN
string representation.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-10 04:46

Message:
OK, I'll explain why I think that Christoph is right.

The relevant TIP here is
 TIP #249: Unification of Tcl's Parsing of Numbers

The implementation came from the kennykb-numerics-branch, which came
with Tcl 8.5:
	2005-05-10  Kevin Kenny  <ke...@ac...>
	Merged all changes on kennykb-numerics-branch back into the HEAD.
	TIP's 132 and 232 are now Final.

This TIP #249 states:
 8. The constants, "Inf", and "Infinity" (perhaps with a leading signum) are
	interpreted as infinities. Infinity is represented as tclDoubleType.
 9. The constant "NaN" is the IEEE "Not a Number" value. It is specifically
	permitted in the parser so that binary format q NaN and similar calls
	can produce NaN on an external medium. The presence of NaN in
	expressions, or in Tcl_GetDoubleFromObj, signals an error. NaN is
	represented as tclDoubleType.
10. IEEE floating point does not have a single unique NaN value, so a NaN may
	be augmented by a parenthesized string of hexadecimal digits, which will
	be stored in its least significant bits. It shall not be possible to
	construct signalling NaN by this route; only quiet NaN will be
	supported. NaN is represented as tclDoubleType.

Nothing wrong here, except that nowhere is it mentioned that NaN's may be
negative, as Inf may be. The only place from which it can be derived that
NaN's may be negative is the state diagram, but I would argue that in this
respect, the state diagram does not match the text.

How does libc do it? Some C code:
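	#include <stdio.h>
	#include <string.h>

	int main(void) {
		double d;
		/* Each string holds the 8 bytes of a little-endian IEEE 754 double. */
		memcpy(&d, "\000\000\000\000\000\000\370\177", 8); /* 0x7ff8000000000000 */
		printf("%g\n", d);
		memcpy(&d, "\000\000\000\000\000\000\370\377", 8); /* 0xfff8000000000000 */
		printf("%g\n", d);
		memcpy(&d, "\001\000\000\000\000\000\370\177", 8); /* 0x7ff8000000000001 */
		printf("%g\n", d);
		memcpy(&d, "\001\000\000\000\000\000\370\377", 8); /* 0xfff8000000000001 */
		printf("%g\n", d);
		return 0;
	}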
On Windows XP (mingw), this prints:
	1.#QNAN
	-1.#QNAN
	1.#QNAN
	-1.#QNAN
On Linux (Ubuntu 10.4, AMD64)
	nan
	-nan
	nan
	-nan
In Java
	Double d = Double.longBitsToDouble(0x7ff8000000000000L);
	System.out.println(d);
	d = Double.longBitsToDouble(0xfff8000000000000L);
	System.out.println(d);
	d = Double.longBitsToDouble(0x7ff8000000000001L);
	System.out.println(d);
	d = Double.longBitsToDouble(0xfff8000000000001L);
	System.out.println(d);
This prints
NaN
NaN
NaN
NaN

So, what should we follow? The C way, which is non-portable (as in Tcl
8.4)? Or the Java way, which doesn't have a -NaN? Currently we use a
mixture, which is not well described in TIP #249.
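
For comparison, the "Java way" is easy to sketch in C: ignore the sign and payload and print every NaN the same way on every platform. This is an illustration of that option only, not a proposed patch; both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build a double from a 64-bit pattern without any arithmetic. */
static double double_from_bits(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Format like "%g", but print every NaN as plain "NaN",
 * whatever its sign bit or payload. */
static void format_sign_agnostic(double d, char *buf, size_t len)
{
    if (d != d) {
        snprintf(buf, len, "NaN");
    } else {
        snprintf(buf, len, "%g", d);
    }
}
```

Passing any of the four bit patterns used above through double_from_bits and then format_sign_agnostic gives "NaN", at the cost of losing the distinction between the patterns.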

More opinions?

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-09 09:14

Message:
>JAN PLEASE DO NOT GO FIXING BUGS IN OTHER PEOPLE'S AREAS WITHOUT
>COMMUNICATING!!!

That's why I only added an implicit test case and am using this issue for
communication.

Looks like you are the right candidate for making a decision about this.

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2010-11-09 09:11

Message:
NOT SO!

NaN and -NaN are intentionally printed separately.

The former is 7fffffffffffffff, the latter is ffffffffffffffff.

We also print NaN(hexstring) for 7ff0000000000001 - 7ffffffffffffffe and
-NaN(hexstring) for fff0000000000001 - fffffffffffffffe

I don't believe this is a bug. 

JAN PLEASE DO NOT GO FIXING BUGS IN OTHER PEOPLE'S AREAS WITHOUT
COMMUNICATING!!!

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2010-11-09 08:50

Message:
After some experimenting, I added a testcase (binary-40.5) which reproduces
this.

I think that Christoph is right. Here is my proposed fix (moving the minus
sign between brackets after the "NaN", and only displaying it for
'invalid' NaN's).

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2010-11-08 10:01

Message:
What bit pattern is at issue?  (There are many
32-bit patterns within the "NaN" collection of
values).  What are the platform details?

----------------------------------------------------------------------
