Hi,
In Lua, the colon syntax is used for defining methods, that is, functions that have an implicit extra parameter self. For instance:
a:b()
is the same as
a.b(self)
where self is actually "a". So this is quite commonly used.
We would like to define keywords such as "a:b", but they are being ignored in Scintilla. Notice that "a.b" works just fine.
Thanks in Advance,
Antonio Scuri
I will look into it, timeframe by end of next week. If anyone else wants to patch this please go ahead.
This sounds reasonable, but there are one or two details I want to consider. For example: if keyword matching did not find anything for "a:b" then we should try to match "a". Currently the lexer does not check "a" after an unmatched "a.b". Should we test for the "a"? Is this extra behaviour very desirable or a minor win?
I made a few tests here while trying to figure out what was going on,
and here are some results:
after setting:
keywords = a a.b a:b
a => gets colorized
a.b => both words get colorized, the "." is also colorized in the same color
a:b => only "a" gets colorized, the ":" is colorized in the operators color
a.c => no words get colorized, as explained by khman.
a.a => same as a.c
Thanks,
Scuri
2017-06-29 23:02 GMT-03:00 Kein-Hong Man khman@users.sf.net:
Related Bugs: #1952
See: https://sourceforge.net/p/scintilla/code/ci/default/tree/lexers/LexLua.cxx
Identifier handling is in lines 180-217. It has been unchanged for a long time I think, I only worked around it when adding stuff like labels.
Currently, it lexes greedily, including the '.' char, then attempts to match the whole thing with a number of keyword lists. I was thinking, if "a.b" matches nothing, we can try "a" as well. Of course, all this won't be as good as truly smart editors, but it may be better than just trying to match "a.b" (and "a:b" in the future.)
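The fallback idea above can be sketched roughly as follows. This is only an illustration, not the code in LexLua.cxx: the function name LongestKeywordPrefix and the use of std::set for the keyword list are assumptions made for the example, and the ".." concatenation operator is not handled here.

```cpp
#include <set>
#include <string>

// Hypothetical helper (not from LexLua.cxx): returns the longest prefix of
// `chain`, cut at '.' or ':' boundaries, that is present in `keywords`.
// Returns an empty string when no prefix matches.
std::string LongestKeywordPrefix(const std::string &chain,
                                 const std::set<std::string> &keywords) {
    std::string candidate = chain;
    while (!candidate.empty()) {
        if (keywords.count(candidate))
            return candidate;
        // Drop the last ".name" or ":name" segment and try the shorter chain.
        const std::string::size_type sep = candidate.find_last_of(".:");
        if (sep == std::string::npos)
            break;
        candidate.erase(sep);
    }
    return std::string();
}
```

With keywords "a a.b a:b", this would match all of "a.b" and "a:b", and fall back to "a" for "a.c".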
Thanks for pointing that out. If I do this:
// if (!(setWord.Contains(sc.ch) || sc.ch == '.') || sc.Match('.', '.')) {
if (!(setWord.Contains(sc.ch))) {
Then it behaves as expected (the period "." is not special anymore):
keywords = a a.b a:b
a => gets colorized
a.b => only "a" gets colorized, the "." is colorized in the operators color
(in a way this is more interesting than everything in keyword color)
a:b => same as a.b
a.c => same as a.b (notice that a.c is not at keywords)
a.a => both get colorized, the "." is colorized in the operators color
In all cases the "." is now colorized in the operators color.
So if I define b as a keyword too, then it will be colorized:
keywords = a a.b a:b b
a => gets colorized
a.b => both get colorized, the "." is colorized in the operators color
a:b => same as a.b (interesting!)
But "b" will also be colorized when alone or in a table with another
name, like "c.b" (in this case b gets colorized, but not c). Which is
actually interesting, because when you create a string in Lua you can do:
s = "test"
l1 = string.len(s) -- or
l2 = s:len()
Today, only string.len will get colorized. With that "if" simplification,
s:len also gets colorized in "len". Which is very interesting if you are
creating an OO approach to create objects that can call methods. But,
again, with just that modification "len" will be colorized when used alone
too. The "ideal" would be to colorize len only when it is a method (after
":") or a table field (after "."). This approach will also keep the
operators "." and ":" with the operator color.
For instance, if keywords for methods and fields were specified in a
different keyword set, those keywords would be colorized only when they
appear after a method or field separator (":" or "."). It would also remove
the need for all those duplicated names in keywords, like
"string string.byte string.char string.dump ".
I don't know if the "ideal" is possible in the short term. But we have some
options.
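As a rough sketch of that "ideal", assuming two separate keyword sets: the names StyleChain, Style, Token, globals and members below are all made up for the illustration and are not part of Scintilla. A word is given the member style only when it follows "." or ":", and the separators themselves keep the operator style.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical style classes for the sketch (not Scintilla's SCE_LUA_* values).
enum class Style { Default, Operator, Global, Member };

struct Token {
    std::string text;
    Style style;
};

// Splits an identifier chain such as "s:len" at '.' and ':' and styles each
// piece. Words after a separator are looked up only in `members`; the first
// word is looked up only in `globals`.
std::vector<Token> StyleChain(const std::string &chain,
                              const std::set<std::string> &globals,
                              const std::set<std::string> &members) {
    std::vector<Token> out;
    bool afterSeparator = false;
    std::string::size_type pos = 0;
    while (pos < chain.size()) {
        if (chain[pos] == '.' || chain[pos] == ':') {
            out.push_back({std::string(1, chain[pos]), Style::Operator});
            afterSeparator = true;
            ++pos;
            continue;
        }
        std::string::size_type end = chain.find_first_of(".:", pos);
        if (end == std::string::npos)
            end = chain.size();
        const std::string word = chain.substr(pos, end - pos);
        Style style = Style::Default;
        if (afterSeparator) {
            if (members.count(word))
                style = Style::Member;
        } else if (globals.count(word)) {
            style = Style::Global;
        }
        out.push_back({word, style});
        afterSeparator = false;
        pos = end;
    }
    return out;
}
```

With globals = "string" and members = "len", the "len" in "s:len" and "string.len" gets the member style, while a lone "len" stays in the default style.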
Thanks,
Scuri
2017-06-30 1:34 GMT-03:00 Kein-Hong Man khman@users.sf.net:
Here is a patch. It does matching for identifier chains with dots and colons. It also does partial matching and breaks off in the case of keywords. Details inside. LexLua and test file attached. Other non-related behaviour checked, unchanged.
In the original LexLua line 324, there is a block of code that sets word styles after the main lexer loop. I deleted it, nothing seems broken when typing out keywords at the end of the file. Maybe not needed anymore? It looks like the commit for that block of code is this: https://sourceforge.net/p/scintilla/code/ci/54c153882b1673dc140d5ccae3a5ecd36b019872/
With
It's generally better not to reserve(100) unless it's common for identifiers to be long. std::string uses the small-string optimization, which doesn't perform a heap allocation until the string grows larger than 15 chars (in a 64-bit MSVC build; this differs by library/word size).
There is one warning from cppcheck that
'c' doesn't survive the loop iteration so it can be declared inside the loop. It's a style issue that you can decide either way.
The removal of the check for keyword at end appears OK - I haven't been able to provoke bad styling.
Overthinking C++ strings. I'd get my knuckles rapped by Stroustrup (I wish). Reallocs one way or the other. Deleted them.
I'd go with avoiding warnings from cppcheck. No warnings is less stressful. :-)
I've pinged Paul K to see if he wants to weigh in on anything in this ticket, since he seems to be a party interested in improving the Lua lexer for his users.
Hi,
I tested the patch and it seems to be working. I'm still using 3.6.6, so
I just removed the LexicalClass part at the end.
All keywords are declared separated, for instance:
"table concat insert remove sort "
Is that what's expected in your patch?
What other flexibility does your patch provide when declaring keywords?
Can methods have different colors than regular functions?
Thanks,
Scuri
2018-03-06 22:16 GMT-03:00 Kein-Hong Man khman@users.sourceforge.net:
Related Bugs: #1952
By 'keywords' I mean 'function', 'return', 'goto', etc. These cannot be mixed with normal identifiers. 'table' is a normal identifier. Hence, table.goto is illegal, table["goto"] is not illegal. Only the first keyword style has special behaviour restrictions, we reserve this for keywords. All other identifiers can mix, I think. This is what the patch implements. See the test cases. Try it out and see if you want to tweak it further.
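To illustrate the "break off at keywords" rule, here is a minimal sketch. It is not the committed patch: the function name ChainBeforeReserved and the std::set of reserved words are assumptions for the example. It returns the part of a chain that may be matched as a single unit, stopping before any segment that is a reserved word.

```cpp
#include <set>
#include <string>

// Hypothetical helper (not the LexLua.cxx implementation): returns the prefix
// of `text` that forms one identifier chain. A reserved word never joins a
// chain, so "table.goto" yields "table" and "return.x" yields "return".
std::string ChainBeforeReserved(const std::string &text,
                                const std::set<std::string> &reserved) {
    std::string::size_type end = text.find_first_of(".:");
    if (end == std::string::npos)
        return text;                      // single word, nothing to chain
    if (reserved.count(text.substr(0, end)))
        return text.substr(0, end);       // a reserved word stands alone
    while (end < text.size()) {
        const std::string::size_type start = end + 1;
        std::string::size_type next = text.find_first_of(".:", start);
        if (next == std::string::npos)
            next = text.size();
        const std::string word = text.substr(start, next - start);
        // Stop before the separator on an empty segment (e.g. "..")
        // or on a reserved word.
        if (word.empty() || reserved.count(word))
            break;
        end = next;
    }
    return text.substr(0, end);
}
```

The returned chain could then be looked up in the non-reserved keyword lists, with partial matching applied if the whole chain is not found.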
Committed fix as [8fb85a].