#13 Nested multiline-comments


RSTA currently has no easy way to support programming languages that allow nested multiline comments.

Example of a nested multiline comment:

/* outer multiline comment
/* inner multiline comment */
outer multiline comment */

Direct support in RSTA would be great.


  • Robert Futrell

    Robert Futrell - 2012-02-08

    I'm leaving this open only to remember to add support to TokenMakerMaker for nested comments.

    This can currently be done "manually" in an AbstractJFlexCTokenMaker subclass with some clever handling of "internal states" used for line end tokens. See PHPTokenMaker.flex for an example of this - it uses such a trick to remember what "parent" token type it's in when it encounters a "<?php" PHP start token. Here's a summary of how to go about doing it as well:

    Define an internal state for being in a multi-line comment. If you define other internal states, leave a large space between the multi-line comment one and any others. This will allow you to "encode" the current MLC depth in your end token:

    public static final int INTERNAL_IN_MLC = -1;
    public static final int INTERNAL_ANOTHER_STATE = -512;

    With the values above, you could have up to 510 levels of nested comments: depths 1 through 510 map to end token values -2 through -511, which all lie strictly between the two constants.
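
    The arithmetic behind this encoding can be checked in isolation. The following is a minimal sketch, not RSTA code: the MlcDepthCodec class and its method names are hypothetical, and only the two constant values come from the example above.

```java
// Hypothetical helper (not part of RSTA): packs the comment nesting depth
// into a per-line end-token state and recovers it on the next line.
final class MlcDepthCodec {

    // Values taken from the example in this comment.
    static final int INTERNAL_IN_MLC = -1;
    static final int INTERNAL_ANOTHER_STATE = -512;

    // Largest depth whose encoded value still lies strictly between
    // the two constants: -1 - (-512) - 1 = 510.
    static final int MAX_DEPTH = INTERNAL_IN_MLC - INTERNAL_ANOTHER_STATE - 1;

    private MlcDepthCodec() {}

    /** Encodes a depth in 1..MAX_DEPTH as an end-token state. */
    static int encode(int depth) {
        if (depth < 1 || depth > MAX_DEPTH) {
            throw new IllegalArgumentException("depth out of range: " + depth);
        }
        return INTERNAL_IN_MLC - depth;
    }

    /** Returns true if the state lies in the range reserved for comment depths. */
    static boolean isMlcState(int state) {
        return state < INTERNAL_IN_MLC && state > INTERNAL_ANOTHER_STATE;
    }

    /** Recovers the depth from a state produced by encode(). */
    static int decode(int state) {
        if (!isMlcState(state)) {
            throw new IllegalArgumentException("not an MLC state: " + state);
        }
        return INTERNAL_IN_MLC - state;
    }
}
```

    Under these assumptions, MlcDepthCodec.encode(1) gives -2 and MlcDepthCodec.encode(510) gives -511, while -1 and -512 themselves are rejected by isMlcState(), so the depth range never collides with either constant.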

    Next, in your JFlex file, in your MLC state, be sure to include the MLC depth in your end token value when encountering "\n" or <<EOF>>, and check for MLC start/end tokens to increase/decrease the MLC depth. For example (untested):

    // Keep track of nested MLC depth.
    private int mlcDepth = 0;

    // ...

    // Macros defining the start and end of an MLC.
    MlcStart = ("/*")
    MlcEnd = ("*/")

    // ...

    // ...
    // Start an MLC as usual, but note our depth of 1.
    {MlcStart} { mlcDepth = 1; start = zzMarkedPos-2; yybegin(MLC); }
    // ...

    // ...

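    <MLC> {
        // A nested comment start increases the depth.
        {MlcStart} { mlcDepth++; }
        // Only the outermost comment end closes the token and leaves the MLC state.
        {MlcEnd} { if (--mlcDepth==0) { addToken(start,zzStartRead+1, Token.COMMENT_MULTILINE); yybegin(YYINITIAL); } }
        // At end of input nothing was matched, so the token ends at zzStartRead-1. Encode the depth in the end token.
        <<EOF>> { addToken(start,zzStartRead-1, Token.COMMENT_MULTILINE); addEndToken(INTERNAL_IN_MLC - mlcDepth); return firstToken; }
    }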

    While likely incomplete, this example gives you the idea. Subtracting mlcDepth from the end token state allows you to then retrieve it in getTokenList() like so:

    public Token getTokenList(Segment text, int initialTokenType, int startOffset) {

        this.offsetShift = -text.offset + startOffset;
        mlcDepth = 0; // Probably not necessary, just to be safe

        // Start off in the proper state.
        int state = Token.NULL;
        switch (initialTokenType) {
            // ...
            default:
                if (initialTokenType<INTERNAL_IN_MLC && initialTokenType>INTERNAL_ANOTHER_STATE) {
                    // Recover the depth that was encoded as INTERNAL_IN_MLC - mlcDepth.
                    mlcDepth = -(initialTokenType - INTERNAL_IN_MLC);
                    state = MLC;
                    start = text.offset;
                }

    // ...
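
    To see how the depth carries from one line to the next, here is a small standalone simulation. It is only a sketch under stated assumptions: NestedCommentScanner, scanLine() and NO_STATE are hypothetical names, the scanner ignores string literals and single-line comments, and it does not guard against depths beyond 510. It does not use any RSTA or JFlex classes.

```java
// Hypothetical line-by-line scanner (not RSTA code) that mimics how an
// end-token state could carry the nested comment depth between lines.
final class NestedCommentScanner {

    static final int NO_STATE = 0;                  // stand-in for "not inside a comment"
    static final int INTERNAL_IN_MLC = -1;          // from the example above
    static final int INTERNAL_ANOTHER_STATE = -512; // from the example above

    private NestedCommentScanner() {}

    /**
     * Scans one line, starting in the state left by the previous line,
     * and returns the state to hand to the next line.
     */
    static int scanLine(String line, int initialState) {
        // Decode the depth if the previous line ended inside a comment.
        int depth = 0;
        if (initialState < INTERNAL_IN_MLC && initialState > INTERNAL_ANOTHER_STATE) {
            depth = INTERNAL_IN_MLC - initialState;
        }

        int i = 0;
        while (i < line.length()) {
            if (line.startsWith("/*", i)) {
                depth++;          // comment start, possibly nested
                i += 2;
            } else if (depth > 0 && line.startsWith("*/", i)) {
                depth--;          // comment end closes one nesting level
                i += 2;
            } else {
                i++;              // any other character
            }
        }

        // Encode the remaining depth for the next line.
        return depth == 0 ? NO_STATE : INTERNAL_IN_MLC - depth;
    }
}
```

    Feeding the three lines "/* outer multiline comment", "/* inner multiline comment */" and "outer multiline comment */" through scanLine() in order, starting from NO_STATE, yields the states -2, -2 and 0: the inner comment opens and closes on the second line, so the depth is back to 1 at its end, and the third line closes the outer comment.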

  • Robert Futrell

    Robert Futrell - 2012-02-08
    • priority: 5 --> 3
    • assigned_to: nobody --> robert_futrell
  • Robert Futrell

    Robert Futrell - 2014-03-26
    • status: open --> closed
    • Group: --> Next Release (example)