Re: [TCLCORE] "binary format" accepts out-of-range-values?


MartinLemburg@Siemens-PLM wrote 2012-09-27 11.10:
> Hi,
>
> I needed to "format" a binary string with a series of "shorts" or WORDs.
>
> The values I provided to "binary format s" were out of range, but "binary format" accepted them.
>
> I am surprised, that providing integers, that are out of range are accepted to "format" specified integer of a lower range by an implicit cast:
>
>      % set v [clock seconds]
>      1348734108
>      % format {%#x %#x} {*}[scan [set bin [binary format s $v]] {%c%c}]
>      0x9c 0xc
>      % set s [expr {$v&  0xFFFF}]
>      3228
>      % expr {$s == [binary scan $bin s s2; set s2]}
>      1
>
> This behavior is not documented, is it wanted or just a bug?

I'd bet it's being relied upon in places, at the very least to ignore the 
(un)signedness of numbers.

> And why the following:
>
>      % binary format s [clock milliseconds]
>      integer value too large to represent
>
> Is the acceptable integer range still limited to non-wide integers?

From RTFS (function FormatNumber in tclBinary.c), the numeric value is 
indeed being parsed by TclGetLongFromObj for 32-bit integers and less.

> Why not here casting implicitly?

Because this part of the Tcl runtime library (get number from Tcl_Obj) has 
only been partially adapted to the reality of (nearly) unlimited precision 
arithmetic, and therefore tends to get edge cases wrong. Realising this has 
however been a slow process, since many big names here still seem to believe 
that integer=long (and therefore had to come up with "entier" for the normal 
meaning of integer).

Concretely, it certainly makes sense that Tcl_GetIntFromObj, 
Tcl_GetLongFromObj, and Tcl_GetWideIntFromObj either return something that 
fits in the named C types (no matter how compiler-dependent these technically 
are allowed to be) or throw a Tcl error if they can't, because chances are 
that a C function calling one of these functions will shortly do something 
with the value that requires it to have this specific type, and catching an 
error at that stage would be much harder. The problem is that the logic 
currently applied for deciding whether a value can be converted (and if so, 
how) doesn't match the needs of all that many use-cases (if any at all). 
Changing it for the existing functions is probably not much of an option, so 
what one would need is a new family of Tcl_Get* functions that lets the 
caller specify how the edge cases should be handled (e.g. using a flag 
argument).
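
To make the idea of a flag argument a bit more tangible, here is a rough, 
standalone sketch in plain C. It does not touch Tcl_Obj at all, and every 
name in it (narrow_policy, NARROW_ERROR, NARROW_TRUNCATE, NARROW_CLAMP, 
narrow_to_int32) is made up for illustration; nothing like it exists in the 
Tcl C API as far as this post is concerned. It only shows how a caller could 
choose between rejecting an out-of-range value, keeping its low bits, or 
saturating it at the ends of the representable range.

```c
#include <stdint.h>

/* Hypothetical policies; none of these names exist in the Tcl C API. */
enum narrow_policy {
    NARROW_ERROR,    /* reject values outside the target range */
    NARROW_TRUNCATE, /* keep only the low 32 bits (two's complement) */
    NARROW_CLAMP     /* saturate at INT32_MIN / INT32_MAX */
};

/*
 * Narrow a 64-bit value to 32 bits according to the chosen policy.
 * Returns 0 on success, or -1 if the value does not fit and the
 * policy is NARROW_ERROR (in which case *out is left untouched).
 */
int narrow_to_int32(int64_t value, enum narrow_policy policy, int32_t *out)
{
    /* Values that already fit are passed through under every policy. */
    if (value >= INT32_MIN && value <= INT32_MAX) {
        *out = (int32_t)value;
        return 0;
    }

    switch (policy) {
    case NARROW_TRUNCATE: {
        /* Unsigned arithmetic keeps the bit masking well defined. */
        uint32_t low = (uint32_t)((uint64_t)value & 0xFFFFFFFFu);
        /* Map the low 32 bits back to a signed value by hand, avoiding
         * an implementation-defined out-of-range signed conversion. */
        *out = (low > (uint32_t)INT32_MAX)
            ? (int32_t)((int64_t)low - 4294967296LL)
            : (int32_t)low;
        return 0;
    }
    case NARROW_CLAMP:
        *out = (value < 0) ? INT32_MIN : INT32_MAX;
        return 0;
    case NARROW_ERROR:
    default:
        return -1;
    }
}
```

With this sketch, narrow_to_int32(12345678901, NARROW_ERROR, &n) fails, 
whereas NARROW_TRUNCATE and NARROW_CLAMP both succeed but store different 
results. A real Tcl_Get* variant would additionally have to deal with 
bignums and with leaving an error message in the interpreter.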

"Ignore bits that are too high" (a.k.a. casting) is certainly appropriate in 
some cases. In others (particularly list and string indexing), one might 
rather want to clamp the result to the representable range, to avoid silly 
irregularities such as

   lindex x 1234567890

returning an empty string but

   lindex x 12345678901

throwing a syntax-looking error. There is also the issue of signed versus 
unsigned that ought to be addressed; currently Tcl only has facilities for 
getting signed numbers out of a Tcl_Obj.
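
As a small illustration of why signedness matters once high bits are being 
ignored, the following plain C sketch reads the low 16 bits of a value both 
ways. The helper names low16_unsigned and low16_signed are hypothetical and 
not part of Tcl; the point is merely that -1 and 65535 end up as the same 
two bytes, which is presumably what code relying on "binary format s" to 
ignore (un)signedness depends on.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the Tcl C API. */

/* Low 16 bits of value, read as an unsigned quantity (0..65535). */
uint16_t low16_unsigned(int64_t value)
{
    return (uint16_t)((uint64_t)value & 0xFFFFu);
}

/* The same 16 bits, read as a two's-complement signed quantity. */
int16_t low16_signed(int64_t value)
{
    int32_t bits = (int32_t)low16_unsigned(value);
    /* Patterns with the top bit set stand for negative numbers. */
    return (int16_t)(bits >= 0x8000 ? bits - 0x10000 : bits);
}
```

For the value 1348734108 from the original question both helpers give 3228, 
since bit 15 happens to be clear; for 40000 the unsigned reading is 40000 
but the signed one is -25536.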

Lars Hellström
