Code::Blocks / Tickets / #41 CC can`t parse defines with Doxygen single-line comment

ollydbg - 2014-09-02

I can confirm this bug.

ollydbg - 2014-09-02

labels: --> CodeCompletion

wxString Tokenizer::ReadToEOL(bool nestBraces, bool stripUnneeded)
{
    if (stripUnneeded)
    {
        TRACE(_T("%s : line=%u, CurrentChar='%c', PreviousChar='%c', NextChar='%c', nestBrace(%d)"),
              wxString(__PRETTY_FUNCTION__, wxConvUTF8).wc_str(), m_LineNumber, CurrentChar(),
              PreviousChar(), NextChar(), nestBraces ? 1 : 0);

        static const size_t maxBufferLen = 4094;
        wxChar buffer[maxBufferLen + 2];
        wxChar* p = buffer;
        wxString str;

        // loop all the physical lines in reading macro definition
        for (;;)
        {
            // this while statement end up in a physical EOL '\n'
            while (NotEOF() && CurrentChar() != _T('\n'))
            {
                while (SkipComment())
                    ;

                const wxChar ch = CurrentChar();
                if (ch == _T('\n'))
                    break;

With doc reading enabled, I see that CurrentChar() is not \n after SkipComment() returns. With doc reading disabled, CurrentChar() is \n, which works correctly.

ollydbg - 2014-09-02

The following patch should read the doc correctly, but note that the Token and its document are still not synchronized correctly, because the macro definition token is added only after the whole comment has been read.

parse-doc-fix-v1.patch

ollydbg - 2014-09-03

I found some issues in how comments are handled, especially that C and C++ comments are handled differently.

When a C comment is handled, the stop char is here:

/* xxx yyy zzz */
                 ^
                 m_TokenIndex points to the char after the closing "/"

When a C++ comment is handled, the stop char is "\n".

For reference, C comments are preferred in a big macro definition that spans several lines, see: http://complete-concrete-concise.com/programming/c/how-to-add-comments-to-macros
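
To make the difference concrete, here is a minimal sketch of the two stop positions. SkipCommentAt is a hypothetical helper written for this illustration, not a function of the Code::Blocks Tokenizer, and it ignores details such as backslash line continuations.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, NOT the Code::Blocks Tokenizer API.
// Given the index of the '/' that opens a comment, return the index at which
// a scanner would stop after skipping that comment.
std::size_t SkipCommentAt(const std::string& text, std::size_t pos)
{
    assert(pos + 1 < text.size() && text[pos] == '/');

    if (text[pos + 1] == '*')
    {
        // C comment: stop on the char right after the closing "*/".
        const std::size_t close = text.find("*/", pos + 2);
        return close == std::string::npos ? text.size() : close + 2;
    }
    if (text[pos + 1] == '/')
    {
        // C++ comment: stop on the '\n' itself, without consuming it.
        const std::size_t eol = text.find('\n', pos + 2);
        return eol == std::string::npos ? text.size() : eol;
    }
    return pos; // not a comment, nothing skipped
}
```

For a C comment the returned index is one past the closing "/", while for a C++ comment it is the index of the "\n", so a caller that expects to land on "\n" only gets that behaviour in the second case.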

Alatar - 2014-09-03

Thanks! Looks like this patch is working for me.

I know about the problem with single-line comments in multi-line macros, but these are single-line macros, and all tested compilers handle this syntax fine.

ollydbg - 2014-09-04

FYI: I committed a partial fix for this issue in rev9905.

Morten MacFly - 2015-02-07

Type: --> Undefined

White-Tiger - 2015-06-02

Since Doxygen comment parsing is also part of CC, another issue still exists: if Doxygen comments are to be displayed (CC->Documentation->Parse documentation), each define shows the line comment that belongs to the next define.
That is, TYPE_NOT_USED shows "comment1",
TYPE_1 shows "comment2",
TYPE_3 doesn't show anything as no other define/comment follows...

ollydbg - 2015-06-02

Hi White-Tiger, as I said in previous posts (and in the log message of commit 9905), the token and comment synchronization issue is not solved yet. I can't find a way to solve it, because comment parsing and token parsing are basically independent, so they don't know much about each other.

White-Tiger - 2015-06-02

well... I've only seen this issue with defines, other types such as enums, variables, function declarations etc., all work fine..
So what's so special about defines? Why are they handled differently?

ollydbg - 2015-06-02

For example, here is the code:

int a; //!< description of a

Here, the parser first finds the Token "int a;" and stores it in the TokenTree.
Then it finds the comment "description of a" and detects the "//!<" marker, which means this is a Doxygen comment that should be appended to the previous variable. In this case the following snippet runs (inside SkipComment()):

if (lineToAppend >= 0) // we have document after the token place
{
    if (m_LastTokenIdx != -1)
        m_TokenTree->AppendDocumentation(m_LastTokenIdx, m_NextTokenDoc + doc);
    m_NextTokenDoc.clear();
}

Note that m_LastTokenIdx points to Token "int a;".

Now, in OP's example:

#define TYPE_NOT_USED 0 //!< not used
#define TYPE_1 1 //!< comment1
#define TYPE_2 2 //!< comment2
#define TYPE_3 3 //!< comment3

The macro definition Token "TYPE_NOT_USED" is recorded AFTER the "not used" comment has been read, so that "not used" text is not attached to the Token "TYPE_NOT_USED". Next, when "comment1" is read, m_LastTokenIdx points to Token "TYPE_NOT_USED" (Token TYPE_1 is not recorded yet), so "comment1" attaches to Token "TYPE_NOT_USED".

The solution could be: stop the parser and record the macro Token as soon as the parser sees a C++ style comment.
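
The ordering problem can be reproduced with a toy model. This is only a sketch: ToyParser, OnTrailingDoc and OnToken are hypothetical names for this illustration, and only the m_LastTokenIdx idea is taken from the snippet above. It replays the four defines in both orders, comment first (current macro behaviour as described) and token first (the proposed solution).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: none of this is the real Code::Blocks parser code.
struct ToyToken
{
    std::string name;
    std::string doc;
};

class ToyParser
{
public:
    // A trailing "//!<" comment is appended to whichever token was
    // recorded last, if there is one.
    void OnTrailingDoc(const std::string& text)
    {
        if (m_LastTokenIdx != -1)
            m_Tokens[m_LastTokenIdx].doc += text;
    }

    // A token is recorded and becomes the "last token".
    void OnToken(const std::string& name)
    {
        m_Tokens.push_back(ToyToken{name, ""});
        m_LastTokenIdx = static_cast<int>(m_Tokens.size()) - 1;
    }

    const ToyToken& At(std::size_t i) const
    {
        assert(i < m_Tokens.size());
        return m_Tokens[i];
    }

private:
    std::vector<ToyToken> m_Tokens;
    int m_LastTokenIdx = -1;
};

// Comment is consumed while reading the macro body, so it arrives BEFORE
// the macro token is recorded.
ToyParser ParseMacrosCommentFirst()
{
    ToyParser p;
    p.OnTrailingDoc("not used"); p.OnToken("TYPE_NOT_USED");
    p.OnTrailingDoc("comment1"); p.OnToken("TYPE_1");
    p.OnTrailingDoc("comment2"); p.OnToken("TYPE_2");
    p.OnTrailingDoc("comment3"); p.OnToken("TYPE_3");
    return p;
}

// Reading stops at the "//", the token is recorded, then the comment is read.
ToyParser ParseMacrosTokenFirst()
{
    ToyParser p;
    p.OnToken("TYPE_NOT_USED"); p.OnTrailingDoc("not used");
    p.OnToken("TYPE_1");        p.OnTrailingDoc("comment1");
    p.OnToken("TYPE_2");        p.OnTrailingDoc("comment2");
    p.OnToken("TYPE_3");        p.OnTrailingDoc("comment3");
    return p;
}
```

In the comment-first order every comment lands on the previous define and the last define ends up with no documentation, which matches the symptom reported earlier in this ticket.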

Guys, here is a patch to solve this issue. I just stop reading the macro definition when the parser sees a C++ style comment.

 src/plugins/codecompletion/parser/tokenizer.cpp | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/plugins/codecompletion/parser/tokenizer.cpp b/src/plugins/codecompletion/parser/tokenizer.cpp
index 73125f1..ff748a1 100644
--- a/src/plugins/codecompletion/parser/tokenizer.cpp
+++ b/src/plugins/codecompletion/parser/tokenizer.cpp
@@ -463,6 +463,13 @@ wxString Tokenizer::ReadToEOL(bool nestBraces, bool stripUnneeded)
             // this while statement end up in one physical EOL '\n'
             while (NotEOF() && CurrentChar() != _T('\n'))
             {
+
+                // a macro definition has ending C++ comments, we should stop the parsing before
+                // the "//" chars, so that the doxygen document can be added correctly, see
+                // discussion https://sourceforge.net/p/codeblocks/tickets/41/
+                if(CurrentChar() == _T('/') && NextChar() == _T('/'))
+                    break;
+
                 // Note that SkipComment() function won't skip the '\n' after comments
                 while (SkipComment())
                     ;

I will leave the patch here for several days so you can test it. If there are no issues, I will commit it.
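
For anyone who wants to see the effect of the new check in isolation, here is a small stand-alone sketch. ReadMacroBody is a hypothetical stand-in working on std::string, not the real ReadToEOL(). It assumes no "//" appears inside a string literal in the macro body, which this sketch does not try to detect.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the patched loop, NOT the real ReadToEOL().
// Reads from pos up to (not including) the first "//" or '\n' and leaves pos
// on that stop char, so a later step can still see the comment.
std::string ReadMacroBody(const std::string& src, std::size_t& pos)
{
    assert(pos <= src.size());

    std::string body;
    while (pos < src.size() && src[pos] != '\n')
    {
        // Same test as in the patch: stop before a C++ comment.
        if (src[pos] == '/' && pos + 1 < src.size() && src[pos + 1] == '/')
            break;
        body += src[pos++];
    }

    // Trim the blanks that were sitting in front of the comment.
    while (!body.empty() && (body.back() == ' ' || body.back() == '\t'))
        body.pop_back();
    return body;
}
```

Because pos is left on the first "/", the macro Token can be recorded before the comment is read, and the comment handling then sees the "//!<" text with the right Token already in place.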

ollydbg - 2015-06-03

assigned_to: ollydbg

White-Tiger - 2015-06-03

Not that happy about that patch.. as it only handles C++ style comments and thus isn't a full solution to the problem. Just a mere "better than nothing"

while I tried to understand what the parser is doing (thanks to your explanation), I've found that there are more cases than just defines... such as:

int GotNoDoxygenComments() {
    return 1;
}
int DoxyTestFunc() /**< 1st */
{ /**< 2nd */
    return 2;
} /**< 3rd */

The 1st and 2nd comment will be added to GotNoDoxygenComments()
and DoxyTestFunc() is missing them...
And I guess the only valid doxygen comment would be "1st"... while I guess it's also ok to do

int DoxyTestFunc() { /**< text */

So it's basically the same issue that exists for defines, you'll try to fully parse the tokens before adding them... which means to also parse the function block after function definition.

from my understanding, the parser should add the token once it hits either ";={", that is a char that ends the definition.
in case of "{", the parser would have to go "backward", ignoring spaces and comments and start parsing comments from there on
For #define this would be anything after the word following "define", so basically on first space hit.

Another solution would be if the parser got something like "TokenFound" so that comments can be processed (/*** */ onces will be added to the newly found token, /**< */ onces will be added to the previous token, but preferably only if they are on the same line)
- ollydbg - 2015-06-03
  
  Hi, thanks for the suggestion.
  
  int GotNoDoxygenComments() {
      return 1;
  }
  int DoxyTestFunc() /**< 1st */
  { /**< 2nd */
      return 2;
  } /**< 3rd */
  
  The 1st and 2nd comment will be added to GotNoDoxygenComments()
  and DoxyTestFunc() is missing them...
  And I guess the only valid doxygen comment would be "1st"... while I guess it's also ok to do
  
  int DoxyTestFunc() { /**< text */
  
  First, I think people rarely write Doxygen text like the samples above, although it is valid code. I just read the code that handles functions, and I see that the function body is normally skipped by
  
  SkipBlock(); // skip to matching }
  
  But SkipBlock() internally calls SkipComment(), so the Doxygen comments are still collected. The function Token is only added after SkipBlock(), which means the collected Doxygen document is added to the previous function Token (in your case, the GotNoDoxygenComments function Token).
  
  from my understanding, the parser should add the token once it hits either ";={", that is a char that ends the definition.
  in case of "{", the parser would have to go "backward", ignoring spaces and comments and start parsing comments from there on.
  
  I'm not sure, but for a parser, "going backward" is quite hard to implement, because we may have "backward buffer replacement" when handling macro replacement, so we lose all the backward code context. Thus this method is hard to implement in CC's current implementation.
  
  Another solution would be if the parser got something like "TokenFound" so that comments can be processed (/*** */ onces will be added to the newly found token, /**< */ onces will be added to the previous token, but preferably only if they are on the same line)
  
  You mean the code below:
  
  if (lineToAppend >= 0) // we have document after the token place
  {
      if (m_LastTokenIdx != -1)
          m_TokenTree->AppendDocumentation(m_LastTokenIdx, m_NextTokenDoc + doc);
      m_NextTokenDoc.clear();
  }
  
  The above "append kind comment" to "Token" association should only happen when the previous Token and the comment are on the same line?
  
  Thanks.
  
  - White-Tiger - 2015-06-03
    
    weird... I wrote "onces" instead of "ones" in my previous reply xD Damn it..
    
    The above "append kind comment" to "Token" association should only happen when the previous Token and the comment are on the same line?
    
    Yes, I've meant that.. but seems like it's not as easy as I thought... The only uses I've seen so far suggest that they only apply to a "token" directly in front of them.. so I assumed a new line wouldn't cut it. Yet those comments can continue on the next line: http://www.stack.nl/~dimitri/doxygen/manual/docblocks.html#memberdoc
    So that doesn't really help either..
    
ollydbg - 2015-06-04

Yes, I see that the linked page (Doxygen Manual: Documenting the code) has some sample code like:

int var; //!< Detailed description after the member
         //!<

So the comment can span two lines after the var definition.
I'm not sure how Doxygen's parser handles that pattern...
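
One possible rule for this pattern, sketched below, is to take the "//!<" text on the declaration line and then keep collecting from each directly following line that contains nothing but a "//!<" comment. This is purely an assumption made for illustration: it is not how Doxygen is known to work, and CollectMemberDoc is a hypothetical function, not part of CC.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical rule, NOT taken from Doxygen or Code::Blocks.
// Starting at lines[first] (the declaration line), join the "//!<" text on
// that line with the text of each directly following line that holds only a
// "//!<" comment.
std::string CollectMemberDoc(const std::vector<std::string>& lines, std::size_t first)
{
    assert(first < lines.size());

    const std::string marker = "//!<";
    std::string doc;
    for (std::size_t i = first; i < lines.size(); ++i)
    {
        const std::size_t at = lines[i].find(marker);
        if (at == std::string::npos)
            break; // no trailing doc on this line, stop collecting

        // A continuation line may only have blanks in front of the marker.
        if (i > first && lines[i].find_first_not_of(" \t") != at)
            break;

        const std::size_t start = lines[i].find_first_not_of(" \t", at + marker.size());
        if (start != std::string::npos)
        {
            if (!doc.empty())
                doc += ' ';
            doc += lines[i].substr(start);
        }
    }
    return doc;
}
```

With this rule a line such as "int other; //!< ..." ends the collection, because it has code in front of the marker and therefore starts the documentation of a new member.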

ollydbg - 2015-06-05

My patch is now committed in trunk (r10321), and I think we can close this ticket, though there are still many things in CC that need improvement.

ollydbg - 2015-06-05

status: open --> fixed

ollydbg - 2015-08-24

labels: CodeCompletion --> CodeCompletion, doxygen

ollydbg - 2015-08-24

A related bug report: incorrect defines parsing with doxygen block comment
