#31 Integer constants: octal format

trunk
closed
Eric Bezault
gelint (5)
5
2009-03-11
2009-03-11
Eric Bezault
No

Extend gelint to support integer constants in octal format as specified in ECMA.

Discussion

  • Eric Bezault
    2009-03-11

    Gelint only supported the pre-ECMA integer constant formats, so I decided to update it to be fully compliant with ECMA and ISE for all integer constant formats (not just the octal format requested in this bug report). But I quickly realized that ECMA is missing some validity rules in that respect, and that ISE is not fully compliant with ECMA and behaves inconsistently as a result.

    First, I wanted to determine which validity rule would be violated if I write for example:

    i := {INTEGER_8} 3456

    and the value 3456 is not representable as an INTEGER_8. ISE reports a syntax error in that case. In ECMA 367-2, I could find neither a syntax rule nor a validity rule for that. In the informative text of 8.32.30, we have:

    This definition always yields a well-defined mathematical value,
    regardless of the number of digits. It is only at the level of
    Integer_constant that the value may be flagged as invalid, for
    example {NATURAL_8} 256, or 999...999 with too many digits to
    be representable as either an INTEGER_32 or an INTEGER_64.

    So I went to chapter 8.29, but I could not find such a rule on Integer_constant. The best that I could find is in the informative text of 8.29.7:

    The rule states no restriction on the value, even though an
    example such as {INTEGER_8} 256 is clearly invalid, since 256
    is not representable as INTEGER_8. The Manifest Constant rule
    addresses this.

    So I jumped to this Manifest Constant rule, in 8.18.2, but it only applies to constant attributes of the form:

    my_constant: INTEGER_8 = 3456

    and not to manifest constants appearing as expressions in the class text, as in my example:

    i := {INTEGER_8} 3456

    So I think that we are missing some rule in chapter 8.29. I added such a rule in gelint.
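    The missing rule amounts to a range check on the value against the type named in the manifest type prefix. A minimal sketch of that check in Python, assuming the usual two's-complement ranges for the sized types; the names `RANGES` and `is_representable` are illustrative and are not gelint's actual code:

```python
# Hypothetical model of the representability check; not gelint's API.
# Ranges assume the usual two's-complement sized integer types.
RANGES = {
    "INTEGER_8": (-2**7, 2**7 - 1),
    "INTEGER_16": (-2**15, 2**15 - 1),
    "INTEGER_32": (-2**31, 2**31 - 1),
    "INTEGER_64": (-2**63, 2**63 - 1),
    "NATURAL_8": (0, 2**8 - 1),
    "NATURAL_16": (0, 2**16 - 1),
    "NATURAL_32": (0, 2**32 - 1),
    "NATURAL_64": (0, 2**64 - 1),
}


def is_representable(value: int, type_name: str) -> bool:
    """True if `value` lies within the range of the sized type `type_name`."""
    low, high = RANGES[type_name]
    return low <= value <= high


# `{INTEGER_8} 3456` would be flagged, `{INTEGER_16} 3456` would not.
print(is_representable(3456, "INTEGER_8"))
print(is_representable(3456, "INTEGER_16"))
```

    Under this model the examples from the standard's informative text, `{NATURAL_8} 256` and `{INTEGER_8} 256`, are both rejected.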

    Now, a second point that I want to mention is the type of the manifest constant when there is no manifest type prefix. We have seen above that there is a rule for that in 8.18.2 when it appears in a constant attribute. But what if it appears elsewhere, such as in:

    i := 3456

    The only information that I could find about that is in 8.29.6, in the informative text again:

    As a consequence of cases 3 to 6, the type of a character,
    string or numeric constant is never one of the sized variants,
    but always the fundamental underlying type (CHARACTER, INTEGER,
    REAL, STRING). Language mechanisms are designed so that you can
    use such constants without hassle -- for example, without explicit
    conversions -- even in connection with specific variants. For
    example:
    * You can assign an integer constant such as 10 to a target
    of a type such as INTEGER_8 as long as it fits (as enforced
    by valid rules).
    * You can use such a constant for discrimination in a Multi_branch
    even if the expression being discriminated is of a specific
    sized variant; here too the compatibility is enforced statically
    by validity rules.

    That's a nice informative text, but where are the validity rules that it mentions, and what are those language mechanisms? In the absence of more precise information, gelint follows what is implemented in ISE. Namely, if an integer constant without a manifest type prefix appears in a position in the code where an expression of a sized variant of INTEGER is expected, then gelint checks whether the value of that constant is representable as an instance of that sized variant. If it is, the constant is accepted with that type. If it is not representable, or if the type of the expected expression is not an integer type, then the type of the integer constant is INTEGER_32 (or is it the type mapped to INTEGER?) if its value is representable as an INTEGER_32, otherwise INTEGER_64 if it is representable as an INTEGER_64, otherwise NATURAL_64 if it is representable as a NATURAL_64. Otherwise a validity error should be reported. Gelint uses the same validity rule that was added above for the case with an explicit manifest type prefix. ISE reports a syntax error instead.
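    The fallback order described above can be sketched as follows. This is a Python model of the behavior as described in this comment, not gelint's implementation; `resolve_constant_type` and `RANGES` are hypothetical names, and the sketch assumes that a constant which fits the expected sized type simply takes that type:

```python
from typing import Optional

# Hypothetical model of the type resolution described above; not gelint's code.
RANGES = {
    "INTEGER_8": (-2**7, 2**7 - 1),
    "INTEGER_16": (-2**15, 2**15 - 1),
    "INTEGER_32": (-2**31, 2**31 - 1),
    "INTEGER_64": (-2**63, 2**63 - 1),
    "NATURAL_8": (0, 2**8 - 1),
    "NATURAL_16": (0, 2**16 - 1),
    "NATURAL_32": (0, 2**32 - 1),
    "NATURAL_64": (0, 2**64 - 1),
}


def fits(value: int, type_name: str) -> bool:
    low, high = RANGES[type_name]
    return low <= value <= high


def resolve_constant_type(value: int, expected_type: Optional[str] = None) -> str:
    """Type of an integer constant written without a manifest type prefix."""
    # Step 1: use the expected type if it is a sized integer type and the value fits.
    if expected_type in RANGES and fits(value, expected_type):
        return expected_type
    # Step 2: otherwise fall back to INTEGER_32, then INTEGER_64, then NATURAL_64.
    for candidate in ("INTEGER_32", "INTEGER_64", "NATURAL_64"):
        if fits(value, candidate):
            return candidate
    # Step 3: nothing can hold the value, which is a validity error.
    raise ValueError(f"integer constant {value} is not representable")


print(resolve_constant_type(10, "INTEGER_8"))    # fits the expected type
print(resolve_constant_type(3456, "INTEGER_8"))  # does not fit, falls back
print(resolve_constant_type(2**63))              # only NATURAL_64 can hold it
```

    Note that in this model `i := 3456` with `i: INTEGER_8` does not fail at the constant itself: the constant falls back to INTEGER_32, and any error then comes from the assignment.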

    Then there is the case of hexadecimal and binary formats. In the ECMA document I see nothing that makes them different from decimal or octal constants. Only the base used for the representation differs. However, in the ISE implementation they are different. For example, 0xFF without any further type information has value 255. However, when its type is known to be INTEGER_8, such as in:

    print ({INTEGER_8} 0xFF)

    or:

    i: INTEGER_8
    i := 0xFF

    then, instead of being rejected (because 255 is not representable as an INTEGER_8), its value becomes -1. The rationale is that in that case we want to mimic C and treat the hexadecimal constant not as an integer in base 16, but as a bit representation of the number. Likewise for the binary format 0b11111111. The rationale makes sense, but the problem is that in some contexts 0xFF is treated as an integer in base 16 and in others as a bit representation of that integer, which gives different values. I think that we need to clarify that in the standard.
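    The bit-representation reading can be modeled as a two's-complement reinterpretation at the width of the target type. A minimal Python sketch under that assumption; `BIT_WIDTHS` and `as_bit_pattern` are illustrative names, not taken from ISE or gelint:

```python
# Hypothetical model of the "bit representation" reading of a hexadecimal
# or binary constant; assumes two's-complement signed types.
BIT_WIDTHS = {
    "INTEGER_8": 8,
    "INTEGER_16": 16,
    "INTEGER_32": 32,
    "INTEGER_64": 64,
}


def as_bit_pattern(value: int, type_name: str) -> int:
    """Reinterpret a non-negative bit pattern as a signed integer of the given type."""
    width = BIT_WIDTHS[type_name]
    if not 0 <= value < 2**width:
        raise ValueError(f"{value:#x} does not fit in {width} bits")
    # A set top bit means the pattern denotes a negative number.
    if value >= 2**(width - 1):
        return value - 2**width
    return value


print(as_bit_pattern(0xFF, "INTEGER_8"))        # bit-pattern reading
print(as_bit_pattern(0b11111111, "INTEGER_8"))  # same pattern in binary
print(as_bit_pattern(0xFF, "INTEGER_16"))       # top bit clear at 16 bits
```

    The same literal therefore yields -1 as an INTEGER_8 but 255 as an INTEGER_16 or when read as a plain base-16 number, which is the context dependence described above.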

    Gelint has been updated to behave like ISE for the time being. But this leads to weird cases. For example, I said earlier that if an integer constant is not representable as an INTEGER_32, then its type is INTEGER_64. But this becomes tricky with 0xFFFFFFFF. This number in base 16 is not representable as an INTEGER_32, so its type is INTEGER_64. But if we had chosen to read it as a bit representation, then it would have been representable as an INTEGER_32 with value -1. I also wanted to see what would happen if I added a + sign, as in '{INTEGER_8} +0xFF'. In that case ISE considers that it is not representable as an INTEGER_8. So 0xFF is representable as an INTEGER_8, but +0xFF is not. Another weird behavior is shown below.

    Let's consider that we have:

    my_constant: INTEGER_16 = {INTEGER_8} 0xFF

    We have seen above that ISE considers that '{INTEGER_8} 0xFF' is valid and that its value is -1. So we would expect the value of `my_constant' to be -1. If you try that in EiffelStudio you will see that in fact the value is 255. From what I understand, the reason is that even though ISE checks whether 0xFF is representable as an INTEGER_8, to determine the value of the constant attribute it just ignores the manifest type prefix of '{INTEGER_8} 0xFF' and behaves as if it had been declared as:

    my_constant: INTEGER_16 = 0xFF

    So I looked again at the validity rule in 8.18.2, and indeed it does not mention the Manifest_type part of the Manifest_constant. In fact, I think that the validity rule is written as if it were not syntactically possible to have a Manifest_type prefix in the Manifest_constant. The fact that it does not require the type of the Manifest_type prefix to somehow convert to the declared type of the constant attribute, and hence does not reject code like this:

    my_constant: INTEGER_8 = {INTEGER_64} 3

    is further evidence. So I wonder whether the syntax for constant attributes, in 8.5.4, should be changed to:

    Explicit_value :== "=" Manifest_value

    and then the rule in 8.18.2 should mention Manifest_value instead of Manifest_constant.

    One last thing: when we have:

    my_constant: INTEGER_8 = 128

    ISE correctly reports a VQMC validity error, but it says that the type of the constant (which it considers to be INTEGER_32) does not match the declared type (i.e. INTEGER_8). Instead, it should say that the value 128 is not representable as an INTEGER_8 (as per VQMC-3 in 8.18.2).

     
  • Eric Bezault
    2009-03-11

    • status: open --> closed
     
  • Eric Bezault
    2009-03-11

    The implementation of integer constants in gelint/gec in svn#6607 should now be compatible with ISE.