
#379 CPD C grammar problems

open
nobody
None
5
2014-08-28
2005-10-25
No

Thx to Jarkko Hietaniemi for the report!

=========================
(1) In VMS C it is legal to have '$' in identifiers, e.g.

            if (!((__vmssts = sys$delprc(&proc,0)) & 1)) {

or

              $DESCRIPTOR(msgdsc,msg);

(also seen in code for another rarer OS, VOS)
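
A minimal sketch of what accepting '$' means for a tokenizer, written in C rather than in the JavaCC grammar CPD actually uses. The function names and the allow_dollar switch are hypothetical and exist only for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* True if c may start an identifier. allow_dollar is a hypothetical
   switch for dialects such as VMS C that permit '$'. */
static bool ident_start(unsigned char c, bool allow_dollar)
{
    return isalpha(c) || c == '_' || (allow_dollar && c == '$');
}

/* True if c may appear after the first character of an identifier. */
static bool ident_part(unsigned char c, bool allow_dollar)
{
    return isalnum(c) || c == '_' || (allow_dollar && c == '$');
}

/* Length of the identifier at the start of s, or 0 if none starts there. */
size_t scan_identifier(const char *s, bool allow_dollar)
{
    size_t n = 0;

    if (!ident_start((unsigned char)s[0], allow_dollar))
        return 0;
    while (ident_part((unsigned char)s[n], allow_dollar))
        n++;
    return n;
}
```

With allow_dollar set, `sys$delprc` is scanned as one identifier; without it, the scan stops after `sys`, which is the kind of split that makes the following `$` unparseable.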

(2) String constant lines ending with backslashes

            DEBUG_o( Perl_deb(aTHX_ "Resolving method `%"SVf256\
                "' for overloaded `%s' in package `%.256s'\n",
                GvSV(gv), cp, HvNAME(stash)) );

where SVf256 is

    #define SVf "_"
    #define SVf256 ".256"SVf

or

            Perl_croak(aTHX_ "suidperl is no longer needed since the kernel can now execute\n\
setuid perl scripts securely.\n");
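
For reference, a small C sketch of the two splicing forms involved. A backslash immediately followed by a newline is removed before tokenization, so it may appear inside a string literal or between two adjacent literals. The variable and function names are illustrative only.

```c
#include <string.h>

/* Backslash-newline inside a literal: the literal simply continues on the
   next physical line, so that line must start in column 0 to avoid
   embedding indentation in the string. */
static const char spliced[] = "now execute\n\
setuid perl scripts securely.\n";

/* Backslash-newline between two literals: after splicing, the adjacent
   literals are concatenated as usual. */
static const char joined[] = "foo"\
"bar";

/* Returns 1 if the spliced literal equals its single-line spelling. */
int spliced_matches(void)
{
    return strcmp(spliced, "now execute\nsetuid perl scripts securely.\n") == 0;
}

/* Returns 1 if the two spliced literals were concatenated into "foobar". */
int joined_matches(void)
{
    return strcmp(joined, "foobar") == 0;
}
```

A lexer that tokenizes string literals line by line needs to treat the backslash-newline pair as if it were not there in both positions.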

(3) Not recognizing the "\xHH" notation?

    if (SvCUR(TARG) == 0 ||
        !is_utf8_string((U8*)tmps, SvCUR(TARG)) ||
        memEQ(tmps, "\xef\xbf\xbd\0", 4)) {

dies on the memEQ constant string argument, or

             char *t0 = "\xcc\x88\xcc\x81";

========================

Discussion

  • Tom Judge

    Tom Judge - 2006-01-06

    Logged In: YES
    user_id=416200

    I have submitted a patch that fixes the multi line string
    literal with lines ending \, it is patch request: 1398501.

     
  • Tom Copeland

    Tom Copeland - 2006-04-11

    Logged In: YES
    user_id=5159

    Updating with current list of problems:

    ================
    (1) the syntax for hexadecimals in string literals is NOT
    "\0xabcd", but instead "\xabcd"! Ditto for character literals.
    (2) adding "LL" for long longs (didn't feel like adding
    another token type since I don't know what that would entail
    so instead hitch a ride on the longs)
    (3) "L" is not a valid suffix on float literals!
    ================
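
    A rough C sketch of the integer-suffix check that item (2) implies. It assumes the suffix is matched case-insensitively, as the TOKEN [IGNORE_CASE] block in the grammar does, and it assumes plain "ll" is accepted alongside the combinations listed in the patch. The function name is hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if suffix (the text after the digits of an integer
   literal) is empty or one of u, l, ll, ul, lu, ull, llu in any case.
   Assumption: case is ignored entirely, mirroring the grammar. */
bool is_integer_suffix(const char *suffix)
{
    static const char *const accepted[] = {
        "", "u", "l", "ll", "ul", "lu", "ull", "llu"
    };
    char lower[4];
    size_t len = strlen(suffix);

    if (len > 3)
        return false;
    for (size_t i = 0; i < len; i++)
        lower[i] = (char)tolower((unsigned char)suffix[i]);
    lower[len] = '\0';

    for (size_t i = 0; i < sizeof accepted / sizeof accepted[0]; i++)
        if (strcmp(lower, accepted[i]) == 0)
            return true;
    return false;
}
```

    For example, `is_integer_suffix("ULL")` is true, while `is_integer_suffix("lul")` is false.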

    and the patch from Jarkko:

    ================
    --- etc/grammar/cpp.jj.dist 2006-04-09 16:57:44.000000000 +0300
    +++ etc/grammar/cpp.jj 2006-04-09 17:37:44.000000000 +0300
    @@ -284,26 +284,25 @@
    TOKEN [IGNORE_CASE] :
    {
    < OCTALINT : "0" (["0"-"7"])* >
    -| < OCTALLONG : <OCTALINT> "l" >
    -| < UNSIGNED_OCTALINT : <OCTALINT> "u" >
    -| < UNSIGNED_OCTALLONG : <OCTALINT> ("ul" | "lu") >
    +| < OCTALLONG : <OCTALINT> ("l")? >
    +| < UNSIGNED_OCTALINT : <OCTALINT> ("u")? >
    +| < UNSIGNED_OCTALLONG : <OCTALINT> ("ul" | "lu" | "ull" | "llu" )? >

    | < DECIMALINT : ["1"-"9"] (["0"-"9"])* >
    -| < DECIMALLONG : <DECIMALINT> ["u","l"] >
    -| < UNSIGNED_DECIMALINT : <DECIMALINT> "u" >
    -| < UNSIGNED_DECIMALLONG : <DECIMALINT> ("ul" | "lu") >
    -
    -
    -| < HEXADECIMALINT : "0x" (["0"-"9","a"-"f"])+ >
    -| < HEXADECIMALLONG : <HEXADECIMALINT> (["u","l"])? >
    -| < UNSIGNED_HEXADECIMALINT : <HEXADECIMALINT> "u" >
    -| < UNSIGNED_HEXADECIMALLONG : <HEXADECIMALINT> ("ul" | "lu") >
    +| < DECIMALLONG : <DECIMALINT> ("l")? >
    +| < UNSIGNED_DECIMALINT : <DECIMALINT> ("u")? >
    +| < UNSIGNED_DECIMALLONG : <DECIMALINT> ("ul" | "lu" | "ull" | "llu")? >
    +
    +| < HEXADECIMALINT : "0x" (["0"-"9","a"-"f","A"-"F"])+ >
    +| < HEXADECIMALLONG : <HEXADECIMALINT> ("l")? >
    +| < UNSIGNED_HEXADECIMALINT : <HEXADECIMALINT> ("u")? >
    +| < UNSIGNED_HEXADECIMALLONG : <HEXADECIMALINT> ("ul" | "lu" | "ull" | "llu")? >

    | < FLOATONE : ((["0"-"9"])+ "." (["0"-"9"])* | (["0"-"9"])* "." (["0"-"9"])+)
    - ("e" (["-","+"])? (["0"-"9"])+)? (["f","l"])? >
    + ("e" (["-","+"])? (["0"-"9"])+)? (["f"])? >

    -| < FLOATTWO : (["0"-"9"])+ "e" (["-","+"])? (["0"-"9"])+ (["f","l"])? >
    +| < FLOATTWO : (["0"-"9"])+ "e" (["-","+"])? (["0"-"9"])+ (["f"])? >
    }

    TOKEN :
    @@ -318,7 +317,7 @@
    |
    ["1"-"9"] (["0"-"9"])*
    |
    - ("0x" | "0X") (["0"-"9","a"-"f","A"-"F"])+
    + ("x" | "X") (["0"-"9","a"-"f","A"-"F"])+
    )
    )
    )
    @@ -333,7 +332,7 @@
    |
    ["1"-"9"] (["0"-"9"])*

    |
    - ("0x" | "0X") (["0"-"9","a"-"f","A"-"F"])+
    + ("x" | "X") (["0"-"9","a"-"f","A"-"F"])+
    )
    )
    )*
    ================

     
  • Tom Copeland

    Tom Copeland - 2006-04-19

    Logged In: YES
    user_id=5159

    Clarification for #1:

    ===================
    To clarify: inside string and character constants, the rule is
    "\x" followed by ONE OR TWO hexadecimal digits; for octal escapes,
    it is ONE TO THREE octal digits:

        "\xa"   string one char long (plus the terminating \0)
        "\xAB"  string one char long (...)
        "\xaby" string two chars long
        '\xa'   one char
        '\xAB'  one char
        '\xABy' ERROR
        "\7"    string one char long (plus ...)
        "\123"  string one char long
        "\123y" string two chars long
        '\7'    one char
        '\123'  one char
        '\123y' ERROR
    

    My example of \xabcd is a bit misleading: depending on the
    C compiler it either parses the 'ab' as the character '\xab'
    and then the characters 'c' and 'd', or it warns or throws
    an error that it cannot fit 0xabcd into a char.
    ===================
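
    The string lengths in the table above can be checked with sizeof, which counts the terminating \0 as well. This is only a sketch: the array and accessor names are made up for illustration, and the last entry is the memEQ argument from the original report.

```c
#include <stddef.h>

/* sizeof on a string literal is the number of chars plus one for the
   terminating '\0'. Indexes follow the order of the table in the text. */
static const size_t literal_sizes[] = {
    sizeof("\xa"),              /* one char  */
    sizeof("\xAB"),             /* one char  */
    sizeof("\xaby"),            /* two chars: '\xab' then 'y' */
    sizeof("\7"),               /* one char  */
    sizeof("\123"),             /* one char  */
    sizeof("\123y"),            /* two chars: '\123' then 'y' */
    sizeof("\xef\xbf\xbd\0"),   /* four chars, the last an explicit '\0' */
};

/* Hypothetical accessor so the values can be inspected from a test. */
size_t literal_size(size_t i)
{
    return literal_sizes[i];
}
```

    Note that a hex escape ends at the first character that is not a hexadecimal digit, which is why "\xaby" splits after 'b' and why a form like "\xabcd" may be rejected as out of range.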

     
  • Tom Copeland

    Tom Copeland - 2006-06-27

    Logged In: YES
    user_id=5159

    Last test file (maybe):

    #include <stdio.h>

    int main() {
        printf("s = [%s]\n", "foo"\
        "bar");
        return 0;
    }

     
