In my company we use Scintilla as a text editor for source code. We want to add automatic completion and make the lexer more accurate, for example by recognizing user-defined variables and typedefs and by correctly getting context from included files.
I'm starting development and LSP seems to be 'the right' solution.
I know this can be done in the application by managing Scintilla notifications, however I am asking for collaboration if there are other people who want to integrate that support into Scintilla itself.
I don't know at the moment how to do it, perhaps as a new lexer or, better, as an extension of the existing lexer interface. I don't know the details of Scintilla's implementation well, and I need help in this area.
There is a Scintilla-based open source (GPL) IDE that supports this: https://wiki.codelite.org/pmwiki.php/Main/LanguageServer
It is not an option for us:
My question is: are you (the Scintilla developers) interested in adding support for LSP to Scintilla? If yes, we can collaborate.
Last edit: Marco Lazzarotto 2019-12-09
Supporting LSP is a large feature and it's likely that some of that support will have to go into applications. It's unlikely to be a good fit completely inside Scintilla. A companion library to Scintilla may be a better approach, with some corresponding work inside Scintilla. The lexer interface is designed just for lexing, so it could be used to convey LSP lexing functionality into Scintilla, but that won't cover other LSP features.
Scintilla's lexing functions are currently being separated out into a new Lexilla library.
For coloring, I have the following idea:
Let the application register a hook to classify the identifier at a specific position. The application can use its symbol table/manager/database or whatever (clangd, libclang, ctags, etc.) to look up the kind of identifier (class, interface, structure, union, function, enumeration, enumerator, constant, macro, argument, class/instance field, local variable, etc.).
This enables better coloring for scoped identifiers (argument, local variable, etc.) with only small changes in Scintilla.
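A minimal sketch of what such a hook could look like, assuming a callback type and style numbers invented purely for illustration (`IdentKind`, `ClassifyHook`, `IdentifierStyler` and `MakeTableHook` are hypothetical names, not Scintilla APIs):

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical identifier kinds; a real list would follow the symbol source.
enum class IdentKind { Unknown, Class, Function, Macro, Argument, LocalVariable };

// Hypothetical hook type: given a document position and the identifier text,
// the application answers with the kind of identifier found there.
using ClassifyHook = std::function<IdentKind(int position, const std::string &name)>;

// Stand-in for the lexer side. Style numbers are arbitrary illustration values.
class IdentifierStyler {
public:
    static constexpr int styleIdentifier = 11;
    static constexpr int styleClass = 40;
    static constexpr int styleFunction = 41;
    static constexpr int styleMacro = 42;
    static constexpr int styleArgument = 43;
    static constexpr int styleLocal = 44;

    void SetHook(ClassifyHook hook) { hook_ = std::move(hook); }

    // Falls back to the plain identifier style when no hook is registered
    // or the application does not know the identifier.
    int StyleFor(int position, const std::string &name) const {
        const IdentKind kind = hook_ ? hook_(position, name) : IdentKind::Unknown;
        switch (kind) {
        case IdentKind::Class: return styleClass;
        case IdentKind::Function: return styleFunction;
        case IdentKind::Macro: return styleMacro;
        case IdentKind::Argument: return styleArgument;
        case IdentKind::LocalVariable: return styleLocal;
        case IdentKind::Unknown: break;
        }
        return styleIdentifier;
    }

private:
    ClassifyHook hook_;
};

// Example application-side hook backed by a simple name table. A real one
// might consult clangd, libclang or ctags and would also use the position.
ClassifyHook MakeTableHook(std::unordered_map<std::string, IdentKind> table) {
    return [table = std::move(table)](int, const std::string &name) {
        const auto it = table.find(name);
        return it == table.end() ? IdentKind::Unknown : it->second;
    };
}
```

The table-based hook ignores the position, so it cannot distinguish scopes; a scope-aware symbol source would need the position argument as well.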
@zufuliu: This is what I meant :-)
@nyamatongwe: I agree that LSP support should be placed in a separate and optional library. The work inside Scintilla could be:
1) Commands (SCI_LSP_INIT*) to define, start and initialize the LSP server.
2) Commands (SCI_LSP_SET*) to set up the desired handling of information from the server: for example, handle completion/documentation in Scintilla itself, or pass it as a notification to be handled by the application.
3) Give the lexers an interface to improve coloring.
To maintain editor speed and to handle the I/O with the server I suppose the LSP client should run in a separate thread and communicate with Scintilla using an asynchronous interface.
Since Scintilla (as far as I know) does not have a message loop of its own, this is my biggest doubt about how to implement LSP inside Scintilla rather than entirely at the application level.
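One possible shape for that asynchronous interface is sketched below, assuming a worker thread that owns the slow server round trip and a response queue that the owner of the event loop drains whenever it chooses (for example from an idle handler). `AsyncClient`, `Post` and `Drain` are hypothetical names, and the handler stands in for real server I/O:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class AsyncClient {
public:
    using Handler = std::function<std::string(const std::string &)>;

    explicit AsyncClient(Handler handler)
        : handler_(std::move(handler)), worker_([this] { Run(); }) {}

    ~AsyncClient() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        wake_.notify_one();
        worker_.join();
    }

    // Called on the editor thread; queues the request and returns at once.
    void Post(std::string request) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            requests_.push(std::move(request));
        }
        wake_.notify_one();
    }

    // Called on the editor thread; returns the responses received so far.
    std::vector<std::string> Drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::string> out;
        out.swap(responses_);
        return out;
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stop_ || !requests_.empty(); });
            if (requests_.empty())
                return;  // stop requested and nothing left to process
            std::string request = std::move(requests_.front());
            requests_.pop();
            lock.unlock();
            std::string response = handler_(request);  // slow round trip, lock not held
            lock.lock();
            responses_.push_back(std::move(response));
        }
    }

    Handler handler_;
    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::string> requests_;
    std::vector<std::string> responses_;
    bool stop_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

Because the editor side only ever calls `Post` and `Drain`, it does not need a message loop of its own; whoever owns the loop decides when to drain.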
Since the application owns the event loop, it will be simpler to start implementation in a specific application then move functionality into Scintilla as it is understood. The scope of this feature is not defined and is difficult to define without experience.
Providing direct callbacks from lexing makes it more difficult to move lexing onto another thread. One of my long-term goals is to isolate lexing into a restrictive environment which can be easily run in a separate thread or process.
With LSP, it's likely that classifying an identifier will require a quite slow cross-process call. Another approach may be to perform lexing as a streaming operation with multiple stages. The basic lexer stage determines most syntax including where identifiers occur, then a second stage patches identifiers into their classes with a batched LSP call.
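The two-stage idea can be illustrated with a toy sketch, assuming a trivial identifier scanner for stage one and a single batched classifier callback for stage two. All names here (`LexIdentifiers`, `PatchIdentifiers`, `BatchClassifier`) and the style values are hypothetical; the callback stands in for one LSP round trip:

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

constexpr char styleDefault = 0;
constexpr char styleIdentifier = 1;

struct IdentRange { std::size_t start; std::size_t length; };

inline bool IsWordChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Stage 1: cheap local pass that styles text and records where identifiers are.
std::vector<IdentRange> LexIdentifiers(const std::string &text, std::vector<char> &styles) {
    styles.assign(text.size(), styleDefault);
    std::vector<IdentRange> found;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char ch = static_cast<unsigned char>(text[i]);
        if (std::isalpha(ch) || ch == '_') {
            const std::size_t start = i;
            while (i < text.size() && IsWordChar(text[i])) i++;
            found.push_back({start, i - start});
            for (std::size_t j = start; j < i; j++) styles[j] = styleIdentifier;
        } else if (std::isdigit(ch)) {
            while (i < text.size() && IsWordChar(text[i])) i++;  // skip numbers like 10u
        } else {
            i++;
        }
    }
    return found;
}

// One call receives every distinct name and returns a style for those it knows.
using BatchClassifier =
    std::function<std::map<std::string, char>(const std::set<std::string> &)>;

// Stage 2: a single batched lookup, then patch the style bytes of known names.
void PatchIdentifiers(const std::string &text, const std::vector<IdentRange> &idents,
                      const BatchClassifier &classify, std::vector<char> &styles) {
    std::set<std::string> names;
    for (const IdentRange &r : idents) names.insert(text.substr(r.start, r.length));
    const std::map<std::string, char> classes = classify(names);  // one round trip
    for (const IdentRange &r : idents) {
        const auto it = classes.find(text.substr(r.start, r.length));
        if (it == classes.end()) continue;  // unknown names keep the stage 1 style
        for (std::size_t j = r.start; j < r.start + r.length; j++) styles[j] = it->second;
    }
}
```

Stage one never waits on the server, so the text is always displayed with at least basic styling; stage two can run later or on another thread.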
My primary goal is to get autocompletion; the other important one is to improve syntax highlighting.
I am going to implement the basic protocol to let the server know about the edited state of the document. After that I hope to know more about the problems and possible solutions.
I would like to change the lexer code as little as possible, actually the strategy I have in mind is:
Last edit: Marco Lazzarotto 2019-12-11
Asynchronous coloring could possibly be implemented by adding a notification, carrying the document range that was styled, sent after Scintilla's lexer has finished coloring (in LexInterface::Colourise). After receiving this notification, the application can query its LSP server, symbol database, etc. to find identifier information in this range, then call a new API to change each identifier's style.
Scintilla's single 'layer' of styling could be generalised into multiple layers each with an independent endStyled and corresponding styling update mechanism for each layer.
This would be useful, for example, to perform spell check highlighting (and other linting tasks) over the document. Some layers may be dependent on others with, for example, spell checking having different rules for identifiers and comments.
One way of handling post-lexer identifier classification is to change the style bytes for the identifier ranges, another technique is to add indicators over the ranges.
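A rough sketch of the layered idea, assuming each layer keeps its own style bytes and its own endStyled position and that any text change pulls every layer's endStyled back to the change position. `StyleLayer`, `LayeredStyles` and their methods are hypothetical names, not existing Scintilla classes:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct StyleLayer {
    std::string name;
    std::size_t endStyled = 0;  // styling before this position is up to date
    std::vector<char> styles;
};

class LayeredStyles {
public:
    // Adds a layer sized to the current document and returns its index.
    std::size_t AddLayer(const std::string &name) {
        StyleLayer layer;
        layer.name = name;
        layer.styles.assign(length_, 0);
        layers_.push_back(std::move(layer));
        return layers_.size() - 1;
    }

    // A text change at `position` invalidates every layer from there onwards.
    // Bytes after `position` are stale until the layer restyles them.
    void TextModified(std::size_t position, std::size_t newLength) {
        length_ = newLength;
        for (StyleLayer &layer : layers_) {
            layer.styles.resize(newLength, 0);
            layer.endStyled = std::min(layer.endStyled, position);
        }
    }

    // One layer's styler fills from its endStyled up to `end`;
    // the other layers are not touched.
    void StyleUpTo(std::size_t layer, std::size_t end, char style) {
        StyleLayer &l = layers_.at(layer);
        end = std::min(end, length_);
        for (std::size_t i = l.endStyled; i < end; i++) l.styles[i] = style;
        l.endStyled = std::max(l.endStyled, end);
    }

    std::size_t EndStyled(std::size_t layer) const { return layers_.at(layer).endStyled; }

private:
    std::size_t length_ = 0;
    std::vector<StyleLayer> layers_;
};
```

In this sketch a slow layer such as spell checking can lag behind the lexer layer without blocking it, since each layer advances its own endStyled. Dependencies between layers are not modelled here.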
Based on LSP's Document Symbols Request, only one call is needed to get all symbol info in the current document.
The procedure might be:
NotifyLexerFinished(startPos, startPos + length) to notify the application that the current range has finished lexing. The application records/merges this range in its notification handler, then applies style patches if applicable. Because of possible backtracking, NotifyLexerFinished needs to be called inside each individual lexer (this can be improved by keeping the initial position for LexAccessor::StartAt, then moving the call into StyleContext::Complete and LexerSimple::Lex, or by adding a method in LexAccessor).
I am working on it with some interruptions due to other work; in the days after Christmas I would like to publish some preliminary work.
Where is the best place to do it?
It may be possible to 'add attachment' on this issue with an archive or patch of the changes but you may not be permissioned for that. Try it and see.
You could publish a patch on a web site or source code control server somewhere and show its URL here.
Here it is:
https://gitlab.com/marcolazzarotto/scintilla-lsp
I am used to programming with Qt. There is some work to do on other platforms to implement the Process class (launch an external process and communicate with it through stdin/stdout).
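For anyone porting the Process class: the LSP base protocol frames each JSON-RPC message on stdin/stdout with a Content-Length header followed by a blank line. The helpers below are a minimal, platform-neutral sketch of that framing and are not taken from the repository above; `FrameMessage` and `TakeMessage` are hypothetical names, and the parser assumes the header name is spelled exactly as shown and is well formed:

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Wraps a JSON payload in the base-protocol header before writing to stdin.
std::string FrameMessage(const std::string &json) {
    return "Content-Length: " + std::to_string(json.size()) + "\r\n\r\n" + json;
}

// Tries to remove one complete message body from the front of `buffer`,
// which accumulates bytes read from the server's stdout. Returns nothing
// while the header or body is still incomplete, leaving `buffer` untouched.
std::optional<std::string> TakeMessage(std::string &buffer) {
    const std::string key = "Content-Length: ";
    const std::size_t headerEnd = buffer.find("\r\n\r\n");
    if (headerEnd == std::string::npos)
        return std::nullopt;  // header not fully received yet
    const std::size_t keyPos = buffer.find(key);
    if (keyPos == std::string::npos || keyPos > headerEnd)
        return std::nullopt;  // sketch only: a real client should report this
    // std::stoul stops at the first non-digit, so other header fields that
    // follow on later lines are ignored. It throws if no digits are present.
    const std::size_t length = std::stoul(buffer.substr(keyPos + key.size()));
    const std::size_t bodyStart = headerEnd + 4;
    if (buffer.size() - bodyStart < length)
        return std::nullopt;  // body not fully received yet
    std::string body = buffer.substr(bodyStart, length);
    buffer.erase(0, bodyStart + length);
    return body;
}
```

Keeping the framing separate from the process handling means only the raw read and write calls need per-platform code.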
Thanks for the implementation - it is quite interesting.
The automatic completion behaves reasonably when there is good contextual information like a visible struct definition when typing '.' after a variable name.
I suppose the use of annotations for warnings is just a demonstration and this will eventually flow through to the application. It's a bit noisy when typing code, particularly the 3 line warnings. There also appears to be a limit of 20 warnings but I couldn't find a literal 20 in the code so don't know where that is coming from. For large documents, there could be performance issues with many warnings so there may be a need to have the application decide how to pace requests.

SC_MOD_INSERTTEXT occurs in many contexts and will occur multiple times when there are multiple selections. The SCN_CHARADDED notification is more commonly used for automatic completion.
It would help here to send SCNotification messages (including SCN_CHARADDED) to the LSP code instead of using DocWatcher. This would also help isolate the LSP code from Scintilla. I feel that the implementation is currently too tightly enmeshed inside Scintilla and a looser coupling would be better.
Yes, I used annotations just to see the results. The server configuration can be improved to not have 3 lines of code (capabilities/textDocument/publishDiagnostics/relatedInformation = true) and visualization should be delegated to the application.
I will also improve the way LSP server will be configured.
I added support for signature help and plan to add support for textDocument/semanticHighlighting: if it is well supported by the server, this could make a lexer almost useless.
To decouple LSP from Scintilla I imagine writing a client function to be called from NotifyParent() like this:
I will do it in the next few days.
I think it would be better to call onNotification from the application as that allows the application to control the process.
Here are some modifications to make the inter-process communications work with Win32 from SciTE. The patch uses polling and can be attached to SciTE's on-idle polling code. This isn't optimal and should likely be replaced with IO completion ports or a separate thread for LSP communication. However, it works well enough for testing.
In LSP.patch, a PolIO call is made by the application and travels down through multiple layers to where the subprocess IO handles are available.
In ProcessWin.patch, the IO handles are either read or peeked as required by reading code.
I applied the patch to the code, and pushed some new work on gitlab.
The LSP code now doesn't modify the core Scintilla: work in Qt starts from scintilla/qt/ScintillaEditBase/ScintillaEditBase.h
The semantic highlighting is interesting but seems to go wrong sometimes, particularly on lines without any identifiers. The decoding appears OK, and the highlights are good the first time an example is pasted, so I think the most likely problem is that the server's copy of the text gets out-of-sync.
For lexers that support sub-styles like cpp and python, semantic highlighting could be integrated quite tidily by allocating a substyle of identifier for each of the scopes returned at startup as they are all types of identifier and are currently styled as identifier.
Here is an example C++ file that exercises each of the scopes available from clang:
Thank you for the feedback!
I am now working on integrating LSP into our application and will investigate the semantic highlighting as soon as possible.
I found some small bugs in the Qt Scintilla implementation. Would you like me to send some patches?
I found a signal name in qt/ScintillaEditBase/ScintillaEditBase.{h, cpp} that is the same as a function in qt/ScintillaEdit/ScintillaEdit.{h, cpp}
Marco Lazzarotto
Last edit: Marco Lazzarotto 2020-03-10
When there is a problem then open a new issue on the bug tracker and attach a patch if you have one.
While the name 'zoom' is the same, there is an argument to the signal so the signatures are different and it doesn't appear to cause a failure. Are you seeing problems because of this clash?
I had problems connecting the signal to a slot, but when I specify the right class (ScintillaEditBase instead of ScintillaEdit) everything works…
I reverted the change 😊
Last edit: Marco Lazzarotto 2020-03-10
Good. Changing the name would have been a break in compatibility with client code.
The alternative of changing the ScintillaEdit::zoom call is also tricky as it's automatically generated. There'd probably have to be an exceptions list of APIs that are exposed with different names (SCI_GETZOOM -> getZoom instead of zoom).