#1300 LexCPP chokes on comments inside macros

Bug
closed-fixed
Neil Hodgson
Scintilla (791)
5
2014-08-14
2012-02-13
Marko Njezic
No

Comments written inside preprocessor definitions are not ignored, breaking further styling and in some cases folding.

Example dummy text:

#define MYMACRO() \ myfunc() { \ /* comment */ \ }

Starting with comment, everything will be styled as regular code. As a result dangling '}' bracket will break further folding.

Discussion

  • Neil Hodgson
    Neil Hodgson
    2012-02-14

    • assigned_to: nobody --> nyamatongwe
    • status: open --> open-accepted
     
  • Marko Njezic
    Marko Njezic
    2012-02-14

    This problem was introduced a long time ago with change set 1142. Log message doesn't specify exact reason for the change, so I guess that it can be safely reverted. Anyone who wants to have styling inside preprocessor directives would be using "styling.within.preprocessor" option anyway.

     
  • Neil Hodgson
    Neil Hodgson
    2012-02-14

    Colouring comments in preprocessor text was a fairly early and popular change, probably from around September 2001.

    The bug report that led to 1142 was from Steve Lhomme on the old mailing list:

    I've experienced a (long time) bug in the C lexer.
    The comments after a # are not properly handled.

    Here is an example of what I get.

    #include "toto.h" /* a one line comment -> no probs */

    #include "titit.h" /* double line comment
    ->anbadone!*/

     
  • Marko Njezic
    Marko Njezic
    2012-02-15

    In that case the standard solution is to store previous state before starting preprocessor state and restoring the stored state on termination. In addition to this if lexing starts in the middle of comment, previous state should be determined by backtracking as usual. I won't be working on this, though.

     
  • Marko Njezic
    Marko Njezic
    2012-06-07

    I have been bugged by this issue once again recently and I wrote a patch (attached) that fixes handling of comments inside macros.

    Only stream comments needed special treatment. Line comments work as is, because the same character (backslash) is used for line continuation of line comments and macros.

     
  • Neil Hodgson
    Neil Hodgson
    2012-06-11

    I think it may be a mistake to keep trying to handle starting inside a line continuation as there will be other unhandled cases, particularly now with preprocessor definition tracking. It may be a reasonable simplification to always rewind to the previous non-continuation line.

     
  • Marko Njezic
    Marko Njezic
    2012-06-11

    Actually, initialization part before the main loop is not meant for handling starting inside a line continuation. It is meant for properly setting "commentInsidePreprocessor" bool flag when starting inside stream comment. Initially, I wrote it so it just backtracks to the start of the comment and simply check style of the first character in front of the comment. However, further testing has uncovered another bug in LexCPP, which I worked around by adding a check for escaped line ends in the second part of initialization code. The newly uncovered bug is that LexCPP will erroneously style non escaped line end characters as SCE_C_PREPROCESSOR, which will result in the following code not properly setting "commentInsidePreprocessor" flag:

    #define MACRO() \[CRLF]
    myfunc() {} \[CRLF]
    [CRLF]
    /* very[CRLF]
    long[CRLF]
    multiline[CRLF]
    comment */[CRLF]
    ...

    Line end characters of the third empty line will be styled as SCE_C_PREPROCESSOR, even though there is no backslash in front of them. When starting in the middle of the comment, this will result in "commentInsidePreprocessor" being set to true, which is wrong. I first though about fixing styling of the line end characters at the end of macros (so that SCE_C_PREPROCESSOR stops before the last line end), but after looking more at the code I realized that it would open another can of worms and opted to work around the issue by checking if first character in front of the comment is line end and then see if it is escaped.

    With all this said, I can't think of a code sample that would fail as the initialization code is just checking if SCE_C_PREPROCESSOR is in front of the comment and there should be no difference if it is enabled or not due to definition tracking (it's in a preprocessor state either way).

    But, I would like to have a simpler initialization code that would not to have check for escaped line ends. And to achieve that I would suggest to actually fix the styling so that the last line end inside preprocessor definition is not styled as SCE_C_PREPROCESSOR. This will also make styling more conforming to the C standard.

    Regarding your suggestion to always rewind to the previous non-continuation line, I don't see how it would help with starting in the middle the comment. Or did you mean to leave the part of the code that backtracks to the start of the comment and if the first character in front is in preprocessor state, rewind to the start of the preprocessor. If so, the resulting code would actually not be any simpler and would have to be moved to the front before other initializations that also look back.

     
  • Neil Hodgson
    Neil Hodgson
    2012-06-12

    The justification for styling preprocessor line ends in SCE_C_PREPROCESSOR was to allow the style to have the eolfilled attribute and so display preprocessor lines more distinctly.

     
  • Marko Njezic
    Marko Njezic
    2012-06-13

    Ah yes, the eol filled thingy. In that case the added workaround for checking if escaped line end is in front of the comment is the simplest solution.

     
  • Neil Hodgson
    Neil Hodgson
    2012-06-16

    There could be an additional style (or 2) for comments within preprocessor statements. A secondary benefit of this would be to allow such comments to be styled in a way that fitted in with preprocessor statements better. Users that don't want comments to disrupt the preprocessor style could choose the same attributes as for SCE_C_PREPROCESSOR. Others may just want to match the background colour of SCE_C_PREPROCESSOR.

     
  • Marko Njezic
    Marko Njezic
    2012-06-16

    Sure, that would also work. But depending on what types of comments you want to support, you may need even four additional styles for comments inside macros - SCE_C_COMMENT, SCE_C_COMMENTDOC, SCE_C_COMMENTDOCKEYWORD, SCE_C_COMMENTDOCKEYWORDERROR. Alternate versions for the last two styles are only needed if you want to allow users to match background color completely (otherwise documentation keywords will use the standard color). Similarly, alternate version of SCE_C_COMMENTDOC is needed only if you want to allow documentation comments inside macros.

    Depending on the amount of additional styles, the resulting code changes might end up being more significant than my patch. While the benefits on the other hand will be very small - IMO matching colors does not justify the amount of additional code that would be required.

     
  • Neil Hodgson
    Neil Hodgson
    2012-06-22

    Committed change 4195 that adds a single new style SCE_C_PREPROCESSORCOMMENT. Documentation comments do not appear to be commonly used within preprocessor directives so new styles were not added for them.

     
  • Neil Hodgson
    Neil Hodgson
    2012-06-22

    • status: open-accepted --> open-fixed
     
  • Marko Njezic
    Marko Njezic
    2012-06-24

    You should also update cpp.properties file from SciTE to include the definition of newly added style.

     
  • Neil Hodgson
    Neil Hodgson
    2012-06-25

    OK, committed.

     
  • Neil Hodgson
    Neil Hodgson
    2012-07-14

    • status: open-fixed --> closed-fixed