From: SourceForge.net <no...@so...> - 2003-04-18 18:49:18
|
Bugs item #699060, was opened at 2003-03-06 23:09 Message generated for change (Comment added) made by srwarren You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=699060&group_id=10894 Category: 11. Conversions from String Group: = 8.4.1 Status: Open Resolution: Fixed Priority: 5 Submitted By: Stephen Warren (srwarren) Assigned to: Donal K. Fellows (dkf) Summary: format %08x 32/64bit confusion Initial Comment: I have found a bug in Tcl 8.4.1 for platforms with 64-bit integer support enabled, i.e. where "#ifndef TCL_WIDE_INT_IS_LONG" is true. The problem is that the format internal function treats integers (%08x format) as 32-bit, and updates object's internal "cache" value to be tclIntType instead of tclWideIntType. The docs for format say: # The l modifier is ignored for real values and on 64-bit platforms, # which are always converted as if the l modifier were present (i.e. # the types double and long are used for the internal representation # of real and integer values, respectively). If the h modifier is # specified then integer values are truncated to short before # conversion. Both h and l modifiers are ignored on all other # conversions. but I can get format to use a 32-bit internal representation on a 64-bit system, which contradicts the docs (the docs match my idea of what should be happening.) The code to reproduce the issue is as follows: ========================================= =================== proc tcl_ok {} { set a 0xaaaaaaaa set b_1 0xaaaa set b_2 aaaa set b "${b_1}${b_2}" if {$a != $b} { puts "Test ok_1: a NOT EQUAL b" } else { puts "Test ok_1: a EQUALS b" } set x [format %08lx $a] if {$a != $b} { puts "Test ok_2: a NOT EQUAL b" } else { puts "Test ok_2: a EQUALS b" } } proc tcl_error {} { set a 0xaaaaaaaa set b_1 0xaaaa set b_2 aaaa set b "${b_1}${b_2}" if {$a != $b} { puts "Test error_1: a NOT EQUAL b" } else { puts "Test error_1: a EQUALS b" } set x [format %08x $a] # format has changed $a to int internal representation # b has no type, so is interpreted as wide_int # a is sign-extended 32->64 to match b's size # hence this comparison fails if {$a != $b} { puts "Test error_2: a NOT EQUAL b" } else { puts "Test error_2: a EQUALS b" } } puts $tcl_patchLevel tcl_ok tcl_error ========================================= =================== which prints: ========================================= =================== 8.4.1 Test ok_1: a EQUALS b Test ok_2: a EQUALS b Test error_1: a EQUALS b Test error_2: a NOT EQUAL b ========================================= =================== The issue is the code in tclCmdAH.c in function Tcl_FormatObjCmd (at line 2187 in Tcl 8.4.1) This says: ========================================= =================== #ifndef TCL_WIDE_INT_IS_LONG if (useWide) { if (Tcl_GetWideIntFromObj(interp, /* INTL: Tcl source. */ objv[objIndex], &wideValue) != TCL_OK) { goto fmtError; } whichValue = WIDE_VALUE; size = 40 + precision; break; } #endif /* TCL_WIDE_INT_IS_LONG */ if (Tcl_GetLongFromObj(interp, /* INTL: Tcl source. */ objv[objIndex], &intValue) != TCL_OK) { goto fmtError; } ========================================= =================== But it should say something more like: ========================================= =================== #ifndef TCL_WIDE_INT_IS_LONG if (Tcl_GetWideIntFromObj(interp, /* INTL: Tcl source. */ objv[objIndex], &wideValue) != TCL_OK) { goto fmtError; } if (useWide) { whichValue = WIDE_VALUE; size = 40 + precision; break; } else { value_used_for_formatting = wide_value & 0xffffffff // setup other formatting params here? } #else /* TCL_WIDE_INT_IS_LONG */ if (Tcl_GetLongFromObj(interp, /* INTL: Tcl source. */ objv[objIndex], &intValue) != TCL_OK) { goto fmtError; } #endif /* TCL_WIDE_INT_IS_LONG */ ========================================= =================== i.e. on 64-bit architectures, Tcl should *always* interpret strings as 64-bit numbers, but follow the wording of the docs on the format h/l modifiers more precisely by truncating 64->32/16 bits depending on whether h/l are specified, instead of interpreting the number as 64/32/(16?) depending on whether h/l are specified. On a more general note, a lot of the Tcl code says this: #if HAVE_A_64_BIT_TYPE do_something_64 bail out, or whatever #endif do_something_32 This leads to this type of bug. It'd be better if everything said this: #if HAVE_A_64_BIT_TYPE do_something_64 #else do_something_32 #endif any_common_code Plus, shouldn't the tclIntType be completely removed if 64-bit numbers are enabled? Or, make tclIntType a #define/whatever to tclInt32Type or tclInt64Type, so that there's never ever a possibility of using the wrong internal representation for an integer. ---------------------------------------------------------------------- >Comment By: Stephen Warren (srwarren) Date: 2003-04-18 18:49 Message: Logged In: YES user_id=728162 Donal, Some more feedback on this issue now I've investigated things a little further. I assume that your most recent response indicates that you've updated the format command to use GET_WIDE_OR_INT internally to convert any strings to integers, so that it uses the "appropriate" internal representation. I say this after looking at how TclExecuteByteCode does the same thing for e.g. INST_ADD. Then, the format code will look at what type that macro created, and format based on that, without changing the internal representation based on the format specification? If so, will the useWide flag be used for %d/%x etc. any more? I'm not sure what it would do except change default precision, which wouldn't make sense, since the default would just be to print the entire number? More general comments: I'm assuming GET_WIDE_OR_INT is intended to be used throughout the Tcl code where conversions from string to integer are required, and that the intent of that function is to store the converted result into the "appropriate" internal rep. On architectures with just long, it'll use that. On architectures where long and wide are different, it'll use the smallest type that can represent the converted value correctly. I'm still unclear why having both TclIntType and TclWideIntType is a good idea. Your previous responses state that updating format to always convert to wides would cause problems. Can you expand on this? I would have thought the best implementation would have been something like: #if long_is_native_64_bit_type typedef long TclIntType; #else if something_else_provides_64_bits typedef something_else TclIntType #else typedef int TclIntType #endif That way, there's only ever one internal representation on integers, that always provides the maximal "capacity" that the underlying CPU/toolchain/OS supports. There appears to be a problem with the current design where multiple types are used. If I have two variables setup as shown below: set a 0x7fffffff set b 0x80000000 then if I interpret these as integers somewhere, the internal rep for a becomes TclIntType, whereas the internal rep for b becomes TclWideIntType. Now, there's no problem with this from the point of view of a Tcl script writing provided this is completely transparent to them. However, there are subtle side-effects that are visible from Tcl scripts. For example, see the following script: -------------------- set a 0x7fffffff set b [expr $a + 1] set c 0x80000000 puts "b: [format %d $b] [format %ld $b] [format %x $b] [format 0x%lx $b]" puts "c: [format %d $c] [format %ld $c] [format %x $c] [format 0x%lx $c]" -------------------- On my system (same as platform info in a previous response below), this produces the output below (using a copy of Tcl without the hack I mentioned in a previous response to this bug report) -------------------- b: -2147483648 -2147483648 80000000 0xffffffff80000000 c: -2147483648 2147483648 80000000 0x80000000 -------------------- Given that I didn't tell Tcl which storage size to use (I don't think there's any way to do that right?) and given I know Tcl is capable of storing 64bit integers on my platform, I would expect both b and c to contain the exact same value, or at least appear that way in any Tcl operation, such as formatting or equality tests. I see three possible fixes for this integer overflow problem.: a) Update INST_ADD (and much other code) to cover all the corner cases requiring promotion of the internal type. This would probably be a lot of work - complicated and error-prone. b) Update all Tcl internal math to use wides, then back-convert to int/long if it'll fit on architectures where they're different. I'm not sure what the point of having two types would be then. c) Get rid of the two types and just have one. On top of this, the Tcl code has a huge amount of conditional code to support the two different internal reps of integer values. The code could be *drastically* simpler with a single maximal-size representation - fewer bugs and easier maintenance! From a general philosphical perspective on how/why Tcl deals with platforms where long isn't 64bit but something else is, can you comment on all this so I understand why everything is the way it is, and if there's anything I can do to write my Tcl code so it behaves more like I expect it to in the code sample above. Having just found TIP72, I see there is some supposed need for a 32-bit type to remain for backwards compatibility. However, I don't know that anything ever stated that Tcl integers were categorically 32-bit before this TIP was implemented. Indeed, the code looks like integers were 32-bit or 64-bit depending on what the underlying CPU/toolchain/OS supported as "long". As such, I contend that any extensions etc. that rely on int being 32-bit are broken and I would rather see *that* code be fixed, instead of massively complicate the Tcl code to continue supporting this incorrect extension coding. I don't honestly see how Tcl can be easily 100% fixed with the current dual-type structure - and even if it is fixed now, it's just too easy to accidentally introduce more problems of this exact same type in the future. Thanks for any comments! ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2003-03-14 23:32 Message: Logged In: YES user_id=79902 Your suggestion causes other non-obvious faults as it forces values to become wides all the time. What we really want to do is to peek for an existing type and work with that if it is there. Alas, there's no pretty way to do this; you have to peek the current type of the values directly to know what the right course of action is... Fixed in the 8.4 branch. Will close once fix is in the HEAD too. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2003-03-09 22:10 Message: Logged In: YES user_id=79902 Hmm. Note to self: need to make sure that the format doesn't enforce one (or the other) kind of internal representation of the numeric value, perhaps using similar techniques to those used in TclExecuteByteCode? ---------------------------------------------------------------------- Comment By: Stephen Warren (srwarren) Date: 2003-03-08 03:59 Message: Logged In: YES user_id=728162 Here's the info on the machine I'm using: % parray tcl_platform tcl_platform(byteOrder) = littleEndian tcl_platform(machine) = i686 tcl_platform(os) = Linux tcl_platform(osVersion) = 2.4.7-10custom tcl_platform(platform) = unix tcl_platform(user) = swarren tcl_platform(wordSize) = 4 % info patch 8.4.1 I also downloaded 8.4.2 and had the same issue. (Perhaps 64-bit (native) platform isn't the issue - it's just whether tclWideIntType is not the same as tclIntType, or whether the wide int representation isn't the same as the compiler's long type) I replace the code with this and it solved my issue, although it probably needs cleanup for the product since I wasn't very concerned about corner cases, architecture differences etc. etc. --------------------------------------- switch (*format) { case 'i': newPtr[-1] = 'd'; case 'd': case 'o': case 'u': case 'x': case 'X': #ifndef TCL_WIDE_INT_IS_LONG if (Tcl_GetWideIntFromObj(interp, /* INTL: Tcl source. */ objv[objIndex], &wideValue) != TCL_OK) { goto fmtError; } if (useWide) { whichValue = WIDE_VALUE; } else { whichValue = INT_VALUE; intValue = (wideValue & 0xffffffffU); } #else /* TCL_WIDE_INT_IS_LONG */ if (Tcl_GetLongFromObj(interp, /* INTL: Tcl source. */ objv[objIndex], &intValue) != TCL_OK) { goto fmtError; } whichValue = INT_VALUE; #endif /* TCL_WIDE_INT_IS_LONG */ ... as before --------------------------------------- ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2003-03-07 11:25 Message: Logged In: YES user_id=79902 Hmm, that's odd. What platform are you using? It works for me with 8.4.2 on IRIX64... % tcl_ok Test ok_1: a EQUALS b Test ok_2: a EQUALS b % tcl_error Test error_1: a EQUALS b Test error_2: a EQUALS b % parray tcl_platform tcl_platform(byteOrder) = bigEndian tcl_platform(machine) = IP35 tcl_platform(os) = IRIX64 tcl_platform(osVersion) = 6.5 tcl_platform(platform) = unix tcl_platform(user) = zzcgudf tcl_platform(wordSize) = 8 % info patch 8.4.2 ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2003-03-07 11:16 Message: Logged In: YES user_id=79902 On the general issues at the end... This is not quite as straight-forward as it appears, because Tcl has to support building on both 64-bit and 32-bit architectures, and we're mostly defined to use the native machine-word size (as opposed to it being specifically one size or another, which only really matters in a few spots.) To support that mode of behaviour, on 64-bit architectures the wide-int stuff is actually defined to be the same as the int things; the #ifdefs cut out what would otherwise be wasted code (and some compilers are almost unbelievably picky about such stuff.) Couple that with the fact that much 32-bit and 64-bit code is the same depending on the platform, and you get back to where you are right now. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=699060&group_id=10894 |