#784 Strange problem with preprocessor

Status: closed-fixed
Priority: 5
Updated: 2007-01-23
Created: 2004-06-30
Private: No

Hi,

Trying to compile the following:
---
#define LO_B(x) ((x) & 0xff)
char main()
{
unsigned char a=0xfe-LO_B(3);
return a;
}
---

I get:
---
wrdef.c:4: warning: function 'LO_B' implicit declaration
wrdef.c:4: error: too many parameters
wrdef.c:4: warning: function 'LO_B' implicit declaration
wrdef.c:4: error: indirections to different types
assignment
from type 'void'
to type 'unsigned-char'
---

I don't understand what's going wrong. Looking at the
sdcpp output, I can see that the macro LO_B was not expanded,
but I can't figure out why. Any ideas?
A test case is attached.

Discussion

  • Stas Sergeev

    Stas Sergeev - 2004-06-30

    test-case

     
  • Raphael Neider

    Raphael Neider - 2006-11-23

    Logged In: YES
    user_id=1115835
    Originator: NO

    This is in fact a bug, still present in SDCC r4478: The preprocessor recognizes the definition of LO_B(x), but fails to recognize the use in
    0xfe-LO_B(3)
    as '0xfe', '-', and 'LO_B(3)'. Inserting whitespace before the '-' as in
    0xfe -LO_B(3)
    solves this.

    Turned into bug report and leaving this open for the time being.

    Regards,
    Raphael
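
For illustration, here is a minimal sketch of the whitespace workaround described above, using the macro from the original report. The function name `lo_b_diff` is hypothetical and only stands in for the reporter's `main`.

```c
#define LO_B(x) ((x) & 0xff)

/* The spaces around '-' end the numeric token after "0xfe",
 * so LO_B(3) is seen as a separate macro invocation and expanded. */
unsigned char lo_b_diff(void)
{
    unsigned char a = 0xfe - LO_B(3);
    return a;
}
```

With the macro expanded, the expression is 0xfe - 3, that is 0xfb.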

     
  • Borut Ražem

    Borut Ražem - 2006-12-19
    • assigned_to: nobody --> borutr
     
  • Borut Ražem

    Borut Ražem - 2006-12-19

    Logged In: YES
    user_id=568035
    Originator: NO

    I tried it with:
    cpp (GCC) 3.4.4 (cygming special, gdc 0.12, using dmd 0.125),
    cpp (GCC) 4.1.1 20061011 (Red Hat 4.1.1-30),
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8804 for 80x86 (VC 6.0),
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86 (VC .NET 2003),
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86 (VC 8.0),
    and the result is the same - the macro LO_B(3) is not expanded.

    Borland C++ Win32 Preprocessor 5.5.1 Copyright (c) 1993, 2000 Borland,
    Open Watcom C32 Optimizing Compiler Version 1.6
    do it in the expected way - the macro LO_B(3) is expanded.

    I made a (very) quick search in the C standard and didn't find anything that justifies not expanding the macro, so it is probably a bug. I'll try to fix it and post the bug/patch to the GCC team.

    Borut

     
  • Borut Ražem

    Borut Ražem - 2006-12-19

    Logged In: YES
    user_id=568035
    Originator: NO

    I found out that the problem is only when the last character in hex constant is 'e'. It seems that the preprocessor treats the letter 'e' as the exponent delimiter and then it gets confused...

    Borut

     
  • Stas Sergeev

    Stas Sergeev - 2006-12-20

    Logged In: YES
    user_id=501371
    Originator: YES

    > I found out that the problem is only when the last character in hex
    > constant is 'e'. It seems that the preprocessor treats the letter 'e' as
    > the exponent delimiter
    That was actually the reason why I filed it under support
    requests rather than bugs - I also found that getting rid
    of 'e' avoids the problem.
    Do you still think this is a bug? I only know that it gave
    me a _lot_ of headache, so I am inclined to classify
    it as a bug. :)

    > I'll try to fix it and post the bug/patch to GCC team.
    Does the sdcpp code share some parts with gcc-cpp? Just
    wondering.

     
  • Borut Ražem

    Borut Ražem - 2006-12-20

    Logged In: YES
    user_id=568035
    Originator: NO

    > That was actually the reason why I filed it under support
    > requests rather than bugs - I also found that getting rid
    > of 'e' avoids the problem.
    I found out that if 'e' is followed by an operator different from '-' or '+', it works. This is an additional indication that 'e' is treated as an exponent delimiter.

    > Do you still think this is a bug? I only know that it gave
    > me a _lot_ of headache, so I am inclined to classify
    > it as a bug. :)
    In my opinion it is a bug, since 0x... indicates that the number is of integral type and shouldn't accept an exponent part.

    > Does the sdcpp code share some parts with gcc-cpp? Just
    > wondering.
    sdcpp is derived from GCC cpp. See http://sdcc.sourceforge.net/release_wiki/index.php?page=SDCPP+history

    Borut
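
A small sketch of the cases described in this comment, assuming a preprocessor that follows the standard tokenization rules. The function names are hypothetical. None of these spellings puts a '+' or '-' directly after a trailing 'e', so LO_B is expanded in each of them even without whitespace.

```c
#define LO_B(x) ((x) & 0xff)

/* The constant ends in 'd', not 'e': the '-' is not absorbed. */
unsigned char ends_in_d(void)       { return 0xfd-LO_B(3); }

/* 'e' is followed by '&', which is not a sign character. */
unsigned char e_then_and(void)      { return 0xfe&LO_B(3); }

/* The closing parenthesis separates 'e' from the '-'. */
unsigned char parenthesized_e(void) { return (0xfe)-LO_B(3); }
```

Only the spelling 0xfe-LO_B(3) or 0xfe+LO_B(3), with the sign immediately after the 'e', triggers the problem from the original report.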

     
  • Borut Ražem

    Borut Ražem - 2006-12-20
    • labels: --> C Preprocessor
    • status: open --> pending-fixed
     
  • Borut Ražem

    Borut Ražem - 2006-12-20

    Logged In: YES
    user_id=568035
    Originator: NO

    After looking at the current implementation in sdcpp and the "6.4.8 Preprocessing numbers" chapter in the C99 standard, it seems that this is not an implementation bug: the implementation follows the standard.

    My opinion is that this is a bug in the C99 standard, since the "identifier-nondigit" (including all letters) is treated as a part of the number (see definition of "identifier-nondigit" at "6.4.2.1 General"): the preprocessor treats "0xfe-LO_B" as a number!?

    I think that the preprocessor and compiler should have the same (or at least very similar) "idea" what a number is.

    I "fixed" the problem in revision #4520 by replacing "identifier-nondigit" with "hexadecimal-digit" and "hexadecimal-prefix" (see 6.4.4.1 Integer constants):

    pp-number:
    hexadecimal-prefix
    digit
    . digit
    pp-number hexadecimal-prefix
    pp-number digit
    pp-number hexadecimal-digit
    pp-number e sign
    pp-number E sign
    pp-number p sign
    pp-number P sign
    pp-number .

    I would like to hear other opinions. We can still decide to revert the fix (or introduce an sdcpp command line switch and/or pragma)...

    Borut
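
To see why the unmodified rule from 6.4.8 swallows the macro name, here is a minimal sketch of a scanner for the standard C99 pp-number grammar. It is not sdcpp's or GCC's actual code: `pp_number_len` is a hypothetical helper, and universal character names are left out for brevity.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the pp-number at the start of s, or 0 if
 * s does not start with one. Follows the C99 6.4.8 grammar, except that
 * universal character names are not handled. */
size_t pp_number_len(const char *s)
{
    size_t i;

    /* A pp-number begins with a digit, or with '.' followed by a digit. */
    if (isdigit((unsigned char)s[0]))
        i = 1;
    else if (s[0] == '.' && isdigit((unsigned char)s[1]))
        i = 2;
    else
        return 0;

    for (;;) {
        unsigned char c = (unsigned char)s[i];

        if ((c == 'e' || c == 'E' || c == 'p' || c == 'P') &&
            (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2;   /* "pp-number e sign" and friends: the sign is absorbed */
        else if (isalnum(c) || c == '_' || c == '.')
            i += 1;   /* digit, identifier-nondigit, or '.' */
        else
            return i; /* anything else ends the token */
    }
}
```

Under these assumptions, `pp_number_len("0xfe-LO_B(3)")` covers the nine characters `0xfe-LO_B`, so LO_B never shows up as a separate identifier and cannot be expanded. With a space before the '-', the token ends after `0xfe`.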

     
  • Stas Sergeev

    Stas Sergeev - 2006-12-21
    • status: pending-fixed --> open-fixed
     
  • Stas Sergeev

    Stas Sergeev - 2006-12-21

    Logged In: YES
    user_id=501371
    Originator: YES

    Thanks for your research. After all the headache
    with it, I really want an answer on whether it is
    a bug or just me. :)

    > preprocessor treats "0xfe-LO_B" as a number!?
    > I think that the preprocessor and compiler should have the same (or at
    > least very similar) "idea" what a number is.
    Just a quick test shows that gcc (a compiler)
    accepts 0x1e5 as a number (seems to contradict
    with what you say, if I understand correctly).
    I'll try to follow your other pointers soon.

     
  • Borut Ražem

    Borut Ražem - 2006-12-21

    Logged In: YES
    user_id=568035
    Originator: NO

    > Just a quick test shows that gcc (a compiler)
    > accepts 0x1e5 as a number (seems to contradict
    > with what you say, if I understand correctly).

    I don't see why it contradicts what I wrote. It does not contradict what I meant ;-)

    Anyway, after sleeping on it, I found out that my solution doesn't handle the situation where the macro name starts with a hex letter [A-Fa-f] :-(
    The lexical analyzer should not accept the exponent part if the number is hexadecimal or octal (that is, if it starts with 0x, 0X or 0).

    Borut

     
  • Borut Ražem

    Borut Ražem - 2006-12-21
    • status: open-fixed --> open
     
  • Erik Petrich

    Erik Petrich - 2006-12-21

    Logged In: YES
    user_id=635249
    Originator: NO

    Revision #4520 still has problems if the macro name uses only hexadecimal digits (rename LO_B to BAD). This looks hard to completely fix without making the pp-number grammar much more complicated.

    The problems associated with the definition of pp-number are mentioned in section 6.4.8 of the Rationale: "In the interests of keeping the description simple, occasional spurious forms are scanned as preprocessing numbers. For example, 0x123E+1 is a single token under the rules. The C89 Committee felt that it was better to tolerate such anomalies than burden the preprocessor with a more exact, and exacting, lexical specification. It felt that this anomaly was no worse than the principle under which the characters a+++++b are tokenized as a ++ ++ + b (an invalid expression), even though the tokenization a ++ + ++ b would yield a syntactically correct expression. In both cases, exercise of reasonable precaution in coding style avoids surprises."

    I guess "reasonable precaution in coding style" means that operators should be delimited by whitespace to avoid unexpected results.
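
As a sketch of that "reasonable precaution", the two examples from the Rationale can be written with whitespace around the operators. The function names are hypothetical.

```c
/* "0x123E+1" would be scanned as a single preprocessing number;
 * with spaces it is the constant 0x123E, the operator '+', and 1. */
int next_after_hex(void)
{
    return 0x123E + 1;
}

/* "a+++++b" would be tokenized as a ++ ++ + b and rejected;
 * with spaces it is a post-increment, '+', and a pre-increment. */
int sum_with_increments(int a, int b)
{
    return a++ + ++b;
}
```

The second function returns the old value of a plus the incremented value of b.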

     
  • Erik Petrich

    Erik Petrich - 2006-12-21
    • status: open --> pending-fixed
     
  • Stas Sergeev

    Stas Sergeev - 2006-12-21
    • status: pending-fixed --> open-fixed
     
  • Stas Sergeev

    Stas Sergeev - 2006-12-21

    Logged In: YES
    user_id=501371
    Originator: YES

    >> accepts 0x1e5 as a number (seems to contradict
    >> with what you say, if I understand correctly).
    > I don't see why it contradicts what I wrote.
    I mean this part:
    ---
    In my opinion it is a bug, since 0x... indicates that the number is of
    integral type and shouldn't accept an exponent part.
    ---
    I am not sure what exactly you meant, but I simply
    tested whether an exponent part is accepted for 0x,
    and it seems to work. :)

    Doesn't seem to matter though, as the detailed explanation
    is now available. So indeed it looks like the preprocessor
    and the compiler do not agree with each other on how to
    interpret this...

     
  • David Barnett

    David Barnett - 2006-12-21

    Logged In: YES
    user_id=896846
    Originator: NO

    If I'm understanding you here, you're saying that 0x1e5 should not be accepted by the preprocessor because an "0x..." literal shouldn't have an exponent. 0x1e5 would actually not be interpreted as having an exponent because 'e' is a hex digit. 0x1e5 has a decimal value of 485.
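
The arithmetic can be checked with a short sketch (hypothetical function names): in 0x1e5 the 'e' is an ordinary hex digit, while a sign only becomes a separate operator when it is set off from the 'e' by whitespace.

```c
/* 'e' is a hex digit here: 1*256 + 14*16 + 5. */
int hex_1e5(void)
{
    return 0x1e5;
}

/* With whitespace, "0x1e" and "5" are separate tokens: 30 + 5. */
int hex_1e_plus_5(void)
{
    return 0x1e + 5;
}
```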

     
  • Stas Sergeev

    Stas Sergeev - 2006-12-21

    Logged In: YES
    user_id=501371
    Originator: YES

    Oh my, I should have slept better! Sorry. Indeed,
    0x1e+5 is not accepted.

     
  • Borut Ražem

    Borut Ražem - 2006-12-23
    • status: open-fixed --> pending-fixed
     
  • Borut Ražem

    Borut Ražem - 2006-12-23

    Logged In: YES
    user_id=568035
    Originator: NO

    Another attempt to fix it: introduced the -pedantic-parse-number command line option and the pedantic_parse_number pragma.

    Fixed in revision #4523.

    Still a candidate for reverting...

    Borut

     
  • SourceForge Robot

    Logged In: YES
    user_id=1312539
    Originator: NO

    This Tracker item was closed automatically by the system. It was
    previously set to a Pending status, and the original submitter
    did not respond within 30 days (the time period specified by
    the administrator of this Tracker).

     
  • SourceForge Robot

    • status: pending-fixed --> closed-fixed
     
