Tcl / Read-Only Bugs / #2225 format %08x 32/64bit confusion

Jeffrey Hobbs - 2003-03-06

assigned_to: hobbs --> dkf
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-03-07

Logged In: YES
user_id=79902

On the general issues at the end...

This is not quite as straight-forward as it appears, because
Tcl has to support building on both 64-bit and 32-bit
architectures, and we're mostly defined to use the native
machine-word size (as opposed to it being specifically one
size or another, which only really matters in a few spots.)
To support that mode of behaviour, on 64-bit architectures
the wide-int stuff is actually defined to be the same as the
int things; the #ifdefs cut out what would otherwise be
wasted code (and some compilers are almost unbelievably
picky about such stuff.) Couple that with the fact that
much 32-bit and 64-bit code is the same depending on the
platform, and you get back to where you are right now.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-03-07

Logged In: YES
user_id=79902

Hmm, that's odd. What platform are you using? It works for
me with 8.4.2 on IRIX64...

% tcl_ok
Test ok_1: a EQUALS b
Test ok_2: a EQUALS b
% tcl_error
Test error_1: a EQUALS b
Test error_2: a EQUALS b
% parray tcl_platform
tcl_platform(byteOrder) = bigEndian
tcl_platform(machine) = IP35
tcl_platform(os) = IRIX64
tcl_platform(osVersion) = 6.5
tcl_platform(platform) = unix
tcl_platform(user) = zzcgudf
tcl_platform(wordSize) = 8
% info patch
8.4.2

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-03-07

status: open --> open-works-for-me
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Stephen Warren - 2003-03-08

Logged In: YES
user_id=728162

Here's the info on the machine I'm using:

% parray tcl_platform
tcl_platform(byteOrder) = littleEndian
tcl_platform(machine) = i686
tcl_platform(os) = Linux
tcl_platform(osVersion) = 2.4.7-10custom
tcl_platform(platform) = unix
tcl_platform(user) = swarren
tcl_platform(wordSize) = 4

% info patch
8.4.1

I also downloaded 8.4.2 and had the same issue.

(Perhaps 64-bit (native) platform isn't the issue - it's just
whether tclWideIntType is not the same as tclIntType, or
whether the wide int representation isn't the same as the
compiler's long type)

I replace the code with this and it solved my issue, although it
probably needs cleanup for the product since I wasn't very
concerned about corner cases, architecture differences etc.
etc.

---------------------------------------
switch (*format) {
case 'i':
newPtr[-1] = 'd';
case 'd':
case 'o':
case 'u':
case 'x':
case 'X':
#ifndef TCL_WIDE_INT_IS_LONG
if (Tcl_GetWideIntFromObj(interp, /* INTL: Tcl
source. */
objv[objIndex], &wideValue) != TCL_OK) {
goto fmtError;
}
if (useWide) {
whichValue = WIDE_VALUE;
}
else {
whichValue = INT_VALUE;
intValue = (wideValue & 0xffffffffU);
}
#else /* TCL_WIDE_INT_IS_LONG */
if (Tcl_GetLongFromObj(interp, /* INTL: Tcl
source. */
objv[objIndex], &intValue) != TCL_OK) {
goto fmtError;
}
whichValue = INT_VALUE;
#endif /* TCL_WIDE_INT_IS_LONG */
... as before
---------------------------------------

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-03-09

Logged In: YES
user_id=79902

Hmm. Note to self: need to make sure that the format
doesn't enforce one (or the other) kind of internal
representation of the numeric value, perhaps using similar
techniques to those used in TclExecuteByteCode?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-03-14

status: open-works-for-me --> open-fixed
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-03-14

Logged In: YES
user_id=79902

Your suggestion causes other non-obvious faults as it forces
values to become wides all the time. What we really want to
do is to peek for an existing type and work with that if it
is there. Alas, there's no pretty way to do this; you have
to peek the current type of the values directly to know what
the right course of action is...

Fixed in the 8.4 branch. Will close once fix is in the HEAD too.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Stephen Warren - 2003-04-18

Logged In: YES
user_id=728162

Donal,

Some more feedback on this issue now I've investigated
things a little further.

I assume that your most recent response indicates that
you've updated the format command to use
GET_WIDE_OR_INT
internally to convert any strings to integers, so that
it uses the "appropriate" internal representation.
I say this after looking at how TclExecuteByteCode does
the same thing for e.g. INST_ADD.

Then, the format code will look at what type that macro
created, and format based on that, without changing the
internal representation based on the format specification?
If so, will the useWide flag be used for %d/%x etc. any
more? I'm not sure what it would do except change default
precision, which wouldn't make sense, since the default
would just be to print the entire number?

More general comments:

I'm assuming GET_WIDE_OR_INT is intended to be used
throughout the Tcl code where conversions from string to
integer are required, and that the intent of that function
is to store the converted result into the "appropriate"
internal rep. On architectures with just long, it'll use
that. On architectures where long and wide are different,
it'll use the smallest type that can represent the
converted value correctly.

I'm still unclear why having both TclIntType and
TclWideIntType is a good idea. Your previous responses
state that updating format to always convert to wides
would cause problems. Can you expand on this? I would
have thought the best implementation would have been
something like:

#if long_is_native_64_bit_type
typedef long TclIntType;
#else if something_else_provides_64_bits
typedef something_else TclIntType
#else
typedef int TclIntType
#endif

That way, there's only ever one internal representation
on integers, that always provides the maximal "capacity"
that the underlying CPU/toolchain/OS supports.

There appears to be a problem with the current design
where multiple types are used. If I have two variables
setup as shown below:

set a 0x7fffffff
set b 0x80000000

then if I interpret these as integers somewhere, the
internal rep for a becomes TclIntType, whereas the
internal rep for b becomes TclWideIntType.

Now, there's no problem with this from the point of view
of a Tcl script writing provided this is completely
transparent to them. However, there are subtle
side-effects that are visible from Tcl scripts.

For example, see the following script:

--------------------
set a 0x7fffffff
set b [expr $a + 1]

set c 0x80000000

puts "b: [format %d $b] [format %ld $b] [format %x $b]
[format 0x%lx $b]"
puts "c: [format %d $c] [format %ld $c] [format %x $c]
[format 0x%lx $c]"
--------------------

On my system (same as platform info in a previous
response below), this produces the output below (using
a copy of Tcl without the hack I mentioned in a previous
response to this bug report)

--------------------
b: -2147483648 -2147483648 80000000 0xffffffff80000000
c: -2147483648 2147483648 80000000 0x80000000
--------------------

Given that I didn't tell Tcl which storage size to use
(I don't think there's any way to do that right?) and
given I know Tcl is capable of storing 64bit integers
on my platform, I would expect both b and c to contain
the exact same value, or at least appear that way in
any Tcl operation, such as formatting or equality tests.

I see three possible fixes for this integer overflow
problem.:

a) Update INST_ADD (and much other code) to cover all
the corner cases requiring promotion of the internal
type. This would probably be a lot of work -
complicated and error-prone.

b) Update all Tcl internal math to use wides, then
back-convert to int/long if it'll fit on architectures
where they're different. I'm not sure what the point
of having two types would be then.

c) Get rid of the two types and just have one.

On top of this, the Tcl code has a huge amount of
conditional code to support the two different internal
reps of integer values. The code could be *drastically*
simpler with a single maximal-size representation -
fewer bugs and easier maintenance!

From a general philosphical perspective on how/why Tcl
deals with platforms where long isn't 64bit but something
else is, can you comment on all this so I understand why
everything is the way it is, and if there's anything I
can do to write my Tcl code so it behaves more like I
expect it to in the code sample above.

Having just found TIP72, I see there is some supposed
need for a 32-bit type to remain for backwards
compatibility. However, I don't know that anything ever
stated that Tcl integers were categorically 32-bit before
this TIP was implemented. Indeed, the code looks like
integers were 32-bit or 64-bit depending on what the
underlying CPU/toolchain/OS supported as "long".

As such, I contend that any extensions etc. that rely on
int being 32-bit are broken and I would rather see *that*
code be fixed, instead of massively complicate the Tcl
code to continue supporting this incorrect extension
coding.

I don't honestly see how Tcl can be easily 100% fixed
with the current dual-type structure - and even if it is
fixed now, it's just too easy to accidentally introduce
more problems of this exact same type in the future.

Thanks for any comments!

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Don Porter - 2003-04-18

Logged In: YES
user_id=80530

Is this report still valid with
the current HEAD (after
commit of the Patch for
Tcl Bug 713562) ?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Stephen Warren - 2003-04-18

Logged In: YES
user_id=728162

Donal,

Some more feedback on this issue now I've investigated
things a little further.

I assume that your most recent response indicates that
you've updated the format command to use
GET_WIDE_OR_INT
internally to convert any strings to integers, so that
it uses the "appropriate" internal representation.
I say this after looking at how TclExecuteByteCode does
the same thing for e.g. INST_ADD.

Then, the format code will look at what type that macro
created, and format based on that, without changing the
internal representation based on the format specification?
If so, will the useWide flag be used for %d/%x etc. any
more? I'm not sure what it would do except change default
precision, which wouldn't make sense, since the default
would just be to print the entire number?

More general comments:

I'm assuming GET_WIDE_OR_INT is intended to be used
throughout the Tcl code where conversions from string to
integer are required, and that the intent of that function
is to store the converted result into the "appropriate"
internal rep. On architectures with just long, it'll use
that. On architectures where long and wide are different,
it'll use the smallest type that can represent the
converted value correctly.

I'm still unclear why having both TclIntType and
TclWideIntType is a good idea. Your previous responses
state that updating format to always convert to wides
would cause problems. Can you expand on this? I would
have thought the best implementation would have been
something like:

#if long_is_native_64_bit_type
typedef long TclIntType;
#else if something_else_provides_64_bits
typedef something_else TclIntType
#else
typedef int TclIntType
#endif

That way, there's only ever one internal representation
on integers, that always provides the maximal "capacity"
that the underlying CPU/toolchain/OS supports.

There appears to be a problem with the current design
where multiple types are used. If I have two variables
setup as shown below:

set a 0x7fffffff
set b 0x80000000

then if I interpret these as integers somewhere, the
internal rep for a becomes TclIntType, whereas the
internal rep for b becomes TclWideIntType.

Now, there's no problem with this from the point of view
of a Tcl script writing provided this is completely
transparent to them. However, there are subtle
side-effects that are visible from Tcl scripts.

For example, see the following script:

--------------------
set a 0x7fffffff
set b [expr $a + 1]

set c 0x80000000

puts "b: [format %d $b] [format %ld $b] [format %x $b]
[format 0x%lx $b]"
puts "c: [format %d $c] [format %ld $c] [format %x $c]
[format 0x%lx $c]"
--------------------

On my system (same as platform info in a previous
response below), this produces the output below (using
a copy of Tcl without the hack I mentioned in a previous
response to this bug report)

--------------------
b: -2147483648 -2147483648 80000000 0xffffffff80000000
c: -2147483648 2147483648 80000000 0x80000000
--------------------

Given that I didn't tell Tcl which storage size to use
(I don't think there's any way to do that right?) and
given I know Tcl is capable of storing 64bit integers
on my platform, I would expect both b and c to contain
the exact same value, or at least appear that way in
any Tcl operation, such as formatting or equality tests.

I see three possible fixes for this integer overflow
problem.:

a) Update INST_ADD (and much other code) to cover all
the corner cases requiring promotion of the internal
type. This would probably be a lot of work -
complicated and error-prone.

b) Update all Tcl internal math to use wides, then
back-convert to int/long if it'll fit on architectures
where they're different. I'm not sure what the point
of having two types would be then.

c) Get rid of the two types and just have one.

On top of this, the Tcl code has a huge amount of
conditional code to support the two different internal
reps of integer values. The code could be *drastically*
simpler with a single maximal-size representation -
fewer bugs and easier maintenance!

From a general philosphical perspective on how/why Tcl
deals with platforms where long isn't 64bit but something
else is, can you comment on all this so I understand why
everything is the way it is, and if there's anything I
can do to write my Tcl code so it behaves more like I
expect it to in the code sample above.

Having just found TIP72, I see there is some supposed
need for a 32-bit type to remain for backwards
compatibility. However, I don't know that anything ever
stated that Tcl integers were categorically 32-bit before
this TIP was implemented. Indeed, the code looks like
integers were 32-bit or 64-bit depending on what the
underlying CPU/toolchain/OS supported as "long".

As such, I contend that any extensions etc. that rely on
int being 32-bit are broken and I would rather see *that*
code be fixed, instead of massively complicate the Tcl
code to continue supporting this incorrect extension
coding.

I don't honestly see how Tcl can be easily 100% fixed
with the current dual-type structure - and even if it is
fixed now, it's just too easy to accidentally introduce
more problems of this exact same type in the future.

Thanks for any comments!

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Stephen Warren - 2003-04-18

Logged In: YES
user_id=728162

Damn. Stupid back button in the browser?!

Anyway, yes, this still happens on the latest CVS HEAD
assuming that's what I get if I type this:

cvs -z3 -
d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/tcl co tcl

The script given in my initial comment, followed by the one in
my latest comment, now print this (i.e. unchanged)

tcl_patchLevel: 8.5a0
tcl_platform(byteOrder) = littleEndian
tcl_platform(machine) = i686
tcl_platform(os) = Linux
tcl_platform(osVersion) = 2.4.7-10smp
tcl_platform(platform) = unix
tcl_platform(user) = swarren
tcl_platform(wordSize) = 4
Test ok_1: a EQUALS b
Test ok_2: a EQUALS b
Test error_1: a EQUALS b
Test error_2: a NOT EQUAL b
b: -2147483648 -2147483648 80000000 0xffffffff80000000
c: -2147483648 2147483648 80000000 0x80000000

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-05-14

status: open-fixed --> closed-fixed
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-05-14

Logged In: YES
user_id=79902

What an unpleasant type demotion bug this has been.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Stephen Warren - 2003-09-19

Logged In: YES
user_id=728162

Hmmm. This bug is closed, but I don't see any comments
indicating why. Is it fixed in a new release, or was it
accidentally closed?

Thanks.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2003-09-19

Logged In: YES
user_id=79902

It's closed because I fixed it. :^)
The fix went in in time for 8.4.3, so any version of Tcl
from then on will have it.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

format %08x 32/64bit confusion

The Tool Command Language implementation

Group

Searches

Help

#2225 format %08x 32/64bit confusion

=========================================

=========================================

=========================================

=========================================

Discussion