#41 CC can't parse defines with Doxygen single-line comment

Milestone: Next_Release
Status: fixed
Owner: ollydbg
Type: Undefined
Updated: 2016-01-30
Created: 2014-09-01
Creator: Alatar
Private: No

I'm using the following syntax in my code:

///Data types
#define    TYPE_NOT_USED        0   //!< not used
#define    TYPE_1           1   //!< comment1
#define    TYPE_2           2   //!< comment2
#define    TYPE_3           3   //!< comment3

If "Parse documentation" is enabled, only the first one appears in the list when I type "TYPE_". If I use the C-like comment syntax (/*!< comment */), CC parses it correctly.
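For reference, the workaround mentioned above could look like the sketch below: the same defines with C-style Doxygen trailing comments. The `static_assert` line is only a self-check added for illustration and is not part of the original report.

```cpp
///Data types
#define    TYPE_NOT_USED    0   /*!< not used */
#define    TYPE_1           1   /*!< comment1 */
#define    TYPE_2           2   /*!< comment2 */
#define    TYPE_3           3   /*!< comment3 */

// Self-check only: the comment style does not change the macro values.
static_assert(TYPE_NOT_USED == 0 && TYPE_3 == 3, "macro values unchanged");
```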

Discussion

  • ollydbg

    ollydbg - 2014-09-02

    I can confirm this bug.

     
  • ollydbg

    ollydbg - 2014-09-02
    • labels: --> CodeCompletion
     
  • ollydbg

    ollydbg - 2014-09-02
    wxString Tokenizer::ReadToEOL(bool nestBraces, bool stripUnneeded)
    {
        if (stripUnneeded)
        {
            TRACE(_T("%s : line=%u, CurrentChar='%c', PreviousChar='%c', NextChar='%c', nestBrace(%d)"),
                  wxString(__PRETTY_FUNCTION__, wxConvUTF8).wc_str(), m_LineNumber, CurrentChar(),
                  PreviousChar(), NextChar(), nestBraces ? 1 : 0);
    
            static const size_t maxBufferLen = 4094;
            wxChar buffer[maxBufferLen + 2];
            wxChar* p = buffer;
            wxString str;
    
            // loop all the physical lines in reading macro definition
            for (;;)
            {
                // this while statement end up in a physical EOL '\n'
                while (NotEOF() && CurrentChar() != _T('\n'))
                {
                    while (SkipComment())
                        ;
    
                    const wxChar ch = CurrentChar();
                    if (ch == _T('\n'))
                        break;
    

    With doc reading enabled, I see that after SkipComment() returns, CurrentChar() is not \n. With doc reading disabled, CurrentChar() is \n, which works correctly.

     
  • ollydbg

    ollydbg - 2014-09-02

    The following patch should read the doc correctly, but note that the Token and the documentation are not synchronized correctly, because the macro definition Token is added only after the whole comment has been read.

     
  • ollydbg

    ollydbg - 2014-09-03

    I found that there are some issues when handling comments, especially that C++ and C comments are handled differently.

    When a C comment is handled, the stop char is here:

    /* xxx yyy zzz */
                     ^  m_TokenIndex points to the char after the "/"
    

    When a C++ comment is handled, the stop char is "\n".

    As a reference, C comments are preferred if you use them in a big macro definition that spans several lines, see: http://complete-concrete-concise.com/programming/c/how-to-add-comments-to-macros
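A small sketch of why C comments are the safer choice inside a multi-line macro. The macro name `SUM3` is made up for illustration; the point is only that each C comment ends before the line-continuation backslash.

```cpp
// Multi-line macro documented with C comments: each comment ends before the
// line-continuation backslash, so the continuation still works.
#define SUM3(a, b, c)   \
    ((a) /* first  */   \
   + (b) /* second */   \
   + (c) /* third  */)

// With "//" the same layout would break: backslash-newline splicing happens
// before comments are removed, so a "//" comment ending in a backslash would
// absorb the following line of the macro.
static_assert(SUM3(1, 2, 3) == 6, "all three operands survive");
```

For single-line macros like the ones in this ticket, a trailing `//` comment is not affected by this, since there is no continuation line to swallow.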

     
  • Alatar

    Alatar - 2014-09-03

    Thanks! Looks like this patch is working for me.

    I know about the issue with using single-line comments in multi-line macros, but here we have single-line macros, and all tested compilers handle this syntax OK.

     
  • ollydbg

    ollydbg - 2014-09-04

    FYI: I committed a partial fix for this issue in rev9905.

     
  • Morten MacFly

    Morten MacFly - 2015-02-07
    • Type: --> Undefined
     
  • White-Tiger

    White-Tiger - 2015-06-02

    Since the Doxygen comment parsing is also part of CC, another issue still exists: if Doxygen comments are to be displayed (CC->Documentation->Parse documentation), each define shows the line comment of the define that follows it.
    That is, TYPE_NOT_USED shows "comment1",
    TYPE_1 shows "comment2",
    TYPE_3 doesn't show anything as no other define/comment follows...

     
  • ollydbg

    ollydbg - 2015-06-02

    Hi, White-Tiger, as I said in previous posts (and in the log message of commit 9905), the token and comment synchronization issue is not solved yet. I can't find a way to solve it, because comment parsing and token parsing are basically independent, so they don't know much about each other.

     
  • White-Tiger

    White-Tiger - 2015-06-02

    well... I've only seen this issue with defines, other types such as enums, variables, function declarations etc., all work fine..
    So what's so special about defines? Why are they handled differently?

     
  • ollydbg

    ollydbg - 2015-06-02

    For example, here is the code:

    int a; //!< description of a

    Here, the parser first finds a Token "int a;" and stores it in the TokenTree.
    Then it finds a comment "description of a" and detects "//!<", which means this is a Doxygen comment that should be appended to the previous variable. In this case the following snippet runs (inside SkipComment()):

            if (lineToAppend >= 0) // we have document after the token place
            {
                if (m_LastTokenIdx != -1)
                    m_TokenTree->AppendDocumentation(m_LastTokenIdx, m_NextTokenDoc + doc);
    
                m_NextTokenDoc.clear();
            }
    

    Note that m_LastTokenIdx points to Token "int a;".

    Now, in OP's example:

        #define    TYPE_NOT_USED        0   //!< not used
        #define    TYPE_1           1   //!< comment1
        #define    TYPE_2           2   //!< comment2
        #define    TYPE_3           3   //!< comment3
    

    The macro definition Token "TYPE_NOT_USED" is recorded AFTER reading the "not used" comment, so the "not used" text is not attached to the Token "TYPE_NOT_USED". Next, when "comment1" is read, m_LastTokenIdx points to Token "TYPE_NOT_USED" (Token TYPE_1 is not recorded yet), so "comment1" is attached to Token "TYPE_NOT_USED".

    The solution could be: stop the parser and record the macro Token when the parser sees a C++ style comment.
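The ordering problem described above can be reproduced with a small stand-alone model. This is only a sketch: `Token`, `Line`, `Parse` and `lastTokenIdx` are hypothetical stand-ins for the parser state, not the real CC classes, and the only thing modelled is whether the macro Token is recorded before or after its trailing comment is consumed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the parser state; not the real CC classes.
struct Token { std::string name; std::string doc; };
struct Line  { std::string macroName; std::string trailingDoc; };

// recordTokenFirst == false models the reported behaviour: the "//!<" comment
// is consumed while reading the macro body, before the macro Token exists.
std::vector<Token> Parse(const std::vector<Line>& lines, bool recordTokenFirst)
{
    std::vector<Token> tokens;
    int lastTokenIdx = -1; // plays the role of m_LastTokenIdx

    auto record = [&](const Line& l)
    {
        tokens.push_back(Token{l.macroName, ""});
        lastTokenIdx = static_cast<int>(tokens.size()) - 1;
    };

    for (const Line& line : lines)
    {
        if (recordTokenFirst)
            record(line);

        // an "append" style doc goes to whatever Token was recorded last
        if (lastTokenIdx != -1)
            tokens[lastTokenIdx].doc += line.trailingDoc;

        if (!recordTokenFirst)
            record(line);
    }
    return tokens;
}

// The four defines from the original report.
std::vector<Line> SampleLines()
{
    return { {"TYPE_NOT_USED", "not used"}, {"TYPE_1", "comment1"},
             {"TYPE_2", "comment2"},        {"TYPE_3", "comment3"} };
}

void DemoParse()
{
    const std::vector<Token> late  = Parse(SampleLines(), false);
    const std::vector<Token> early = Parse(SampleLines(), true);
    // Token recorded late: every doc is shifted to the previous define.
    assert(late[0].doc == "comment1" && late[3].doc.empty());
    // Token recorded first: every doc stays with its own define.
    assert(early[0].doc == "not used" && early[3].doc == "comment3");
}
```

In this model the "late" order gives exactly the symptom reported in the ticket, and recording the Token before the comment is processed removes the shift.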

     
  • ollydbg

    ollydbg - 2015-06-03

    Guys, here is the patch to solve this issue: I just stop reading the macro definition when the parser sees a C++ style comment.

     src/plugins/codecompletion/parser/tokenizer.cpp | 7 +++++++
     1 file changed, 7 insertions(+)
    
    diff --git a/src/plugins/codecompletion/parser/tokenizer.cpp b/src/plugins/codecompletion/parser/tokenizer.cpp
    index 73125f1..ff748a1 100644
    --- a/src/plugins/codecompletion/parser/tokenizer.cpp
    +++ b/src/plugins/codecompletion/parser/tokenizer.cpp
    @@ -463,6 +463,13 @@ wxString Tokenizer::ReadToEOL(bool nestBraces, bool stripUnneeded)
                 // this while statement end up in one physical EOL '\n'
                 while (NotEOF() && CurrentChar() != _T('\n'))
                 {
    +
    +                // a macro definition has ending C++ comments, we should stop the parsing before
    +                // the "//" chars, so that the doxygen document can be added correctly, see
    +                // discussion https://sourceforge.net/p/codeblocks/tickets/41/
    +                if(CurrentChar() == _T('/') && NextChar() == _T('/'))
    +                    break;
    +
                     // Note that SkipComment() function won't skip the '\n' after comments
                     while (SkipComment())
                         ;
    

    I'll leave the patch here for several days so you can test it; if there are no issues, I will commit it.
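The idea of the patch can be tried in isolation with a sketch like the one below. `ReadMacroBody` is a hypothetical helper written for this illustration, not the real `Tokenizer::ReadToEOL()`; as a simplification it does not treat a "//" inside a string literal specially.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not the real Tokenizer::ReadToEOL(): returns the macro
// body up to (but not including) a trailing "//" comment or the end of line.
// Simplification: a "//" inside a string literal is not treated specially.
std::string ReadMacroBody(const std::string& text, std::size_t& pos)
{
    std::string body;
    while (pos < text.size() && text[pos] != '\n')
    {
        // stop before the "//" so the caller can record the macro Token
        // first and let the comment be handled afterwards
        if (text[pos] == '/' && pos + 1 < text.size() && text[pos + 1] == '/')
            break;
        body += text[pos++];
    }
    return body;
}

void DemoReadMacroBody()
{
    const std::string src = "0   //!< not used\n";
    std::size_t pos = 0;
    const std::string body = ReadMacroBody(src, pos);
    assert(body == "0   ");                 // comment is not part of the body
    assert(src.compare(pos, 2, "//") == 0); // cursor rests on the comment start
}
```

Because the cursor is left on the first "/" of the comment, the comment is only processed after the caller has had the chance to record the macro Token.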

     
  • ollydbg

    ollydbg - 2015-06-03
    • assigned_to: ollydbg
     
  • White-Tiger

    White-Tiger - 2015-06-03

    Not that happy about that patch.. as it only handles C++ style comments and thus isn't a full solution to the problem. Just a mere "better than nothing"

    while I tried to understand what the parser is doing (thanks to your explanation), I've found that there are more cases than just defines... such as:

    
    
    int GotNoDoxygenComments()
    {
        return 1;
    }
    
    int DoxyTestFunc() /**< 1st */
    { /**< 2nd */
        return 2;
    } /**< 3rd */
    

    The 1st and 2nd comment will be added to GotNoDoxygenComments()
    and DoxyTestFunc() is missing them...
    And I guess the only valid doxygen comment would be "1st"... while I guess it's also ok to do

    int DoxyTestFunc() { /**< text */
    

    So it's basically the same issue that exists for defines, you'll try to fully parse the tokens before adding them... which means to also parse the function block after function definition.

    from my understanding, the parser should add the token once it hits either ";={", that is a char that ends the definition.
    in case of "{", the parser would have to go "backward", ignoring spaces and comments and start parsing comments from there on
    For #define this would be anything after the word following "define", so basically on first space hit.

    Another solution would be if the parser got something like "TokenFound" so that comments can be processed (/*** */ onces will be added to the newly found token, /**< */ onces will be added to the previous token, but preferably only if they are on the same line)

     
    • ollydbg

      ollydbg - 2015-06-03

      Hi, thanks for the suggestion.

      int GotNoDoxygenComments()
      {
          return 1;
      }
      
      int DoxyTestFunc() /**< 1st */
      { /**< 2nd */
          return 2;
      } /**< 3rd */
      

      > The 1st and 2nd comment will be added to GotNoDoxygenComments()
      > and DoxyTestFunc() is missing them...
      > And I guess the only valid doxygen comment would be "1st"... while I guess it's also ok to do

      int DoxyTestFunc() { /**< text */
      

      First, I think people rarely write Doxygen text like the samples above, although it is valid code. I just read the code that handles functions, and I see that the function body is normally skipped by

      SkipBlock(); // skip  to matching }
      

      But SkipBlock() internally calls SkipComment(), so Doxygen comments are still collected. However, the function Token is only added after SkipBlock(), which means the collected documentation is added to the previous function Token (in your case, the GotNoDoxygenComments function Token).

      > from my understanding, the parser should add the token once it hits either ";={", that is a char that ends the definition.
      > in case of "{", the parser would have to go "backward", ignoring spaces and comments and start parsing comments from there on.

      I'm not sure, but for a parser, "going backward" is quite hard to implement, because we may have "backward buffer replacement" when handling macro replacement, so we lose all the backward code context. Thus this method is hard to implement in CC's current implementation.

      > Another solution would be if the parser got something like "TokenFound" so that comments can be processed (/*** */ onces will be added to the newly found token, /**< */ onces will be added to the previous token, but preferably only if they are on the same line)

      You mean the code below:

              if (lineToAppend >= 0) // we have document after the token place
              {
                  if (m_LastTokenIdx != -1)
                      m_TokenTree->AppendDocumentation(m_LastTokenIdx, m_NextTokenDoc + doc);
      
                  m_NextTokenDoc.clear();
              }
      

      So the above association of an "append" kind of comment with a Token should only happen when the previous Token and the comment are on the same line?

      Thanks.

       
      • White-Tiger

        White-Tiger - 2015-06-03

        weird... I wrote "onces" instead of "ones" in my previous reply xD Damn it..

        > So the above association of an "append" kind of comment with a Token should only happen when the previous Token and the comment are on the same line?

        Yes, I've meant that.. but seems like it's not as easy as I thought... The only uses I've seen so far suggest that they only apply to a "token" directly in front of them.. so I assumed a new line wouldn't cut it. Yet those comments can continue on the next line: http://www.stack.nl/~dimitri/doxygen/manual/docblocks.html#memberdoc
        So that doesn't really help either..

         
  • ollydbg

    ollydbg - 2015-06-04

    Yes, I see that the linked page (Doxygen Manual: Documenting the code) has some sample code like:

    int var; //!< Detailed description after the member
             //!<
    

    So, the comment can span two lines after the var definition.
    Not sure how Doxygen's parser handles that pattern...
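One possible way to reconcile the "same line only" idea with the two-line sample is sketched below. This is purely an assumption for discussion: `TrailingDocRule` is a hypothetical helper, not part of CC, and it does not claim to match how Doxygen itself decides. It accepts a trailing comment if it sits on the Token's line, or on the line directly after a comment that was already accepted.

```cpp
#include <cassert>

// Hypothetical bookkeeping, not part of the CC parser: decides whether a
// trailing "//!<" comment on commentLine may be appended to the last Token.
struct TrailingDocRule
{
    int tokenLine = -1;        // line on which the last Token was recorded
    int lastAcceptedLine = -1; // line of the last comment accepted for it

    void OnToken(int line) { tokenLine = line; lastAcceptedLine = -1; }

    bool Accept(int commentLine)
    {
        const bool sameLine  = (commentLine == tokenLine);
        const bool continues = (lastAcceptedLine != -1 &&
                                commentLine == lastAcceptedLine + 1);
        if (sameLine || continues)
        {
            lastAcceptedLine = commentLine;
            return true;
        }
        return false;
    }
};

void DemoTrailingDocRule()
{
    TrailingDocRule rule;
    rule.OnToken(10);                        // int var;           (line 10)
    const bool first  = rule.Accept(10);     // //!< Detailed ...  (same line)
    const bool second = rule.Accept(11);     // //!<               (next line)
    const bool stray  = rule.Accept(13);     // after a gap: not attached
    assert(first && second && !stray);
}
```

A rule like this would still need the Token to be recorded before its trailing comment is seen, so it does not replace the ordering fix discussed earlier in the thread.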

     
  • ollydbg

    ollydbg - 2015-06-05

    My patch is committed to trunk now (r10321), and I think we can close this ticket, though there are still many things in CC that need improvement.

     
  • ollydbg

    ollydbg - 2015-06-05
    • status: open --> fixed
     
  • ollydbg

    ollydbg - 2015-08-24
    • labels: CodeCompletion --> CodeCompletion, doxygen
     
  • ollydbg

    ollydbg - 2015-08-24
     
