C preprocessor highlighting bug if a variable is not defined.
NOTE:
If A is not defined, A is set to zero in #if expressions, i.e. #if f(A) defaults to #if f(0).
Tested against SciTE 3.7.5, wscite on Windows 10.
Examples:
//define A // A is nowhere defined!
#if defined A
"This is not highlighted" // correct
#endif
#if defined A && A > 0
"This is highlighted" // wrong
#endif
#if defined A && A
"This is highlighted" // wrong
#endif
#if A
"This is not highlighted" // correct
#endif
#if A == 0 // equivalent to 0 == 0
"This is highlighted" // correct
#endif
#if A > 0 // equivalent to 0 > 0
"This is highlighted" // wrong
#endif
Neil, here a patch on the issue and a little bit beyond covering the below where steps are chopped to commit pieces:
!sSeparate question after focussing on the parser feature only: Is there a way to feed a list of ex-ante defines (like those defined by the compiler itself) into the lexer?
Hope the patch is helpful after it passed several checks here. Just let me know.
Thanks,
/J.
Updated patch (including commits of the previous patch):
In my post here: Lexcpp: allow white space when reading hte macro definition - Google Groups
I'm trying adding macro definition to scintilla, but it looks like scintilla does not support such definition
While for scintilla, you should gives a string like "mydef(x,y)=x + y"
Because scintilla will cut the definition by spaces.
There are 8 change sets now so this is difficult to evaluate. Some appear wrong. For example, the second change set only remembers the first token as the value of a prreprocessor definition so may give bad results with a multi-token definition such as
Change set 7 is doing something with \r\n but its unclear what since no example is shown.
The C++ lexer has added style metadata for 4.0.0 (releasing in 2 days) so the line numbers no longer match up.
The 5th set of keywords are used to predefine macros. Keyword lists are first split by space so that means the macros themselves can not contain spaces.
Not sure how you usually deal with updates, since I thought it would be good to apply the usual practice to chop things into logical tiny pieces. If this is not feasible for you, simply take the whole patch and run tests (I haven't found a test suite in the git repository I am afraid to say) against the new patch version.
True - thanks. Here I was a bit too restrictive regarding the definition string of #defines if provided. The latest patch attached remedies this.
Well, I think no example needs to be shown, since this deals with the dangling line end issue differently handled on the various platforms: On Windows, e.g., the line end is marked with two characters '\r' '\n' and the patch simply swallows the additional '\r'.
I am not sure if I am following. If you are referring to the starting point of the patches, then this is SciTe 3.7.5 as given in the first posting. Since everything is hosted here on SF and the github is just a mirror, I was hesitating posting pull-requests there, so that I better leave it with you to merge the patches into the code which should be an easy but manual exercise; with pull-requests it would certainly have been easier.
... umm. Probably this refers to my question on how to feed predefined keywords into the lexer - I'll have a look at that shortly.Does that include simply keywords or rather a keyword dictionary, i.e. pairs (keyword, definition string) like in the lexer file here?
Separately, attached an updated patch as always against SciTe 3.7.5 (incl. the remedy for the issue you raised - thanks again) and I am including a test cases showing false positives and negatives (against ScitTe 3.7.5) which are all well captured by the latest patch together with many others more (file attached too):
I would be great if this could find its way into scintilla (release date postponed to Aug 16, 2016, if I am not mistaken), since it appears that the CPP lexer would be far more useful if less buggy in terms of false positives and negatives.
BTW: I do not get any email alert from SF on the postings although I am subscribed to this thread. I happened to see your reply today. So sorry for the late reply.
The chunks are semi-independent making it difficult to understand what is required to fix the bug, as reported, and what is being added for other reasons. For example the line end change does not appear to be addressing the bug. Segmenting a change set may be appropriate if it enhances understandability but this set obscures more than it enlightens.
The 'EOL treatment' chunk changes nothing with "\r\n": line endings. Its a forward loop which terminates early when a '\r' is found. If you have an example where the patch changes behaviour then show it.
The 'index guard' chunk may be checking without need since the conditions leading to the erasures should prevent the possibility of failure. If there are examples where failure may occur then that is interesting but the code should not be made more complex without need.
Changes should use the same style of coding as the original file. Indentation of this file uses tabs and there is no space after '('.
The Scintilla repository is also on SF at https://sourceforge.net/p/scintilla/code/ci/default/tree/ . Since 4.0.0 is a complex major release, it has taken more time then most releases and the repository has diverged further from the last release making it more difficult to reconcile change sets. Continue using patch files - the pull request mechanism is unpleasant and not monitored.
Before a release, the project is frozen and only regressions are fixed. All other changes are delayed.
Keywords 5 is a set of NAME{=VALUE} preprocessor definitions - the example in SciTE's cpp.properties is
In the example, there is
This does not crash for me - there is a limit to evaluation of 100 iterations that should stop excessive recursion.
3 changes seem simple:
1. #define A -> #define A 1
2. defined A removes A before replacing defined with value
3. Not found identifier is replaced with 0
Great - here some comments on both your email response and your patch (based on the brand new version 4.0.0):
Last edit: Jannick 2017-08-18
If there are any problems with the 3 change change set I posted above, I'd much prefer to fix them before committing instead of committing and then fixing the problem with another change. Including regressions in history can complicate bug bisection and cause problems for downstreams that are cherry-picking change sets.
If tokens.size() = i + 2 then I think the invocation is malformed and should be abandoned since there should be a ')' to balance the '('.
It may be safer to give up on this invocation with:
There are lexer tests run by scintilla/tests/lexTests.py with example files in scintilla/tests/examples. The example for C++ is x.cxx and its styled form is in x.cxx.styled. If a changed lexer disagrees with x.cxx.styled, a new file x.cxx.new contains the new styled form. On Windows, a SciLexer.DLL should be in scintilla/bin and it must be 32-bit for a 32-bit build of Python and 64-bit for a 64-bit Python. On Linux this test requires a Qt/PySide build of ScintillaEditPy.
Committed 3 change sets as [89e440], [e4cfae], and [f5aeba].
Related
Commit: [89e440]
Commit: [e4cfae]
Commit: [f5aeba]