
#1330 Support for Language Server Protocol

Milestone: Completed
Status: open
Owner: nobody
Labels: scintilla (299)
Priority: 5
Updated: 2020-02-17
Created: 2019-12-09
Private: No

In my company we use Scintilla as a text editor for source code and we want to add automatic completion and make the lexer do a more accurate job, for example to recognize user-defined variables and typedef and to correctly get context from included files.
I'm starting development and LSP seems to be 'the right' solution.
I know this can be done in the application by managing Scintilla notifications, however I am asking for collaboration if there are other people who want to integrate that support into Scintilla itself.
I don't know at the moment how to do it, perhaps as a new lexer or better as an extension of the actual lexer interface: I don't know the details of Scintilla's implementation well, and I need help in this field.

Discussion

  • Zufu Liu

    Zufu Liu - 2019-12-09

    There is a Scintilla-based open source (GPL) IDE that supports this: https://wiki.codelite.org/pmwiki.php/Main/LanguageServer

     
    • Marco Lazzarotto

      It is not an option for us:

      1. Integrating CodeLite into our environment would be a nightmare (we use Qt for our development environment, CodeLite uses wxWidgets, and it is a far bigger project than we need).
      2. Its version of Scintilla is very outdated (maybe it was forked several years ago?).
      3. The license (GPL) is unfortunately not applicable to our case.

      My question is: are you (the Scintilla developers) interested in adding support for LSP to Scintilla? If yes, we can collaborate.

       

      Last edit: Marco Lazzarotto 2019-12-09
  • Neil Hodgson

    Neil Hodgson - 2019-12-09
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
     In my company we use Scintilla as a text editor for source code and we want to add automatic completion and make the lexer do a more accurate job, for example to recognize user-defined variables and typedef and to correctly get context from included files.
    -I'm starting development and [LSP] (https://langserver.org/) seems to be 'the right' solution.
    +I'm starting development and [LSP](https://langserver.org/) seems to be 'the right' solution.
     I know this can be done in the application by managing Scintilla notifications, however I am **asking for collaboration** if there are other people who want to integrate that support into Scintilla itself.
     I don't know at the moment how to do it, perhaps as a new lexer or better as an extension of the actual lexer interface: I don't know the details of Scintilla's implementation well, and I need help in this field.
    
     
  • Neil Hodgson

    Neil Hodgson - 2019-12-09

    Supporting LSP is a large feature and it's likely that some of that support will have to go into applications. It's unlikely to be a good fit completely inside Scintilla. A companion library to Scintilla may be a better approach, with some corresponding work inside Scintilla. The lexer interface is designed just for lexing, so it could be used to convey LSP lexing functionality into Scintilla, but that won't cover other LSP features.

    Scintilla's lexing functions are currently being separated out into a new Lexilla library.

     
  • Zufu Liu

    Zufu Liu - 2019-12-10

    For coloring, I have the following idea:
    let the application register a hook to classify the identifier at a specific position. The application can use its symbol table/manager/database or whatever (clangd, libclang, ctags, etc.) to look up the kind of identifier (class, interface, structure, union, function, enumeration, enumerator, constant, macro, argument, class/instance field, local variable, etc.).
    This enables better coloring for scoped identifiers (argument, local variable, etc.) with only small changes in Scintilla.

    struct Sci_ClassifyIdentifierRequest {
        Sci_NotifyHeader nmhdr;
        Sci_Position currentLine; // can be omitted
        Sci_Position startPos;
        Sci_Position endPos;
    };
    
    // return a style value in [0, 0xff] for identifier in range [startPos, endPos),
    // return negative value if application can't classify the identifier.
    // if returned style is not predefined by current lexer (in Scintilla.iface),
    // it's suggested to use large unused value to reduce conflicts with further evolution of the lexer.
    typedef int (SCI_METHOD *Sci_ClassifyIdentifierHook)(Sci_ClassifyIdentifierRequest *request);
    
    // changes for LexCPP:
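    case SCE_C_IDENTIFIER: {
        // braces keep the subStyle declaration local to this case
        int subStyle = classifierIdentifiers.ValueFor(s);
        if (subStyle < 0) {
            // not a known sub-style word: ask the application (proposed API)
            subStyle = pAccess->ClassifyIdentifier(sc.currentLine, styler.GetStartSegment(), sc.currentPos);
        }
        if (subStyle >= 0) {
            sc.ChangeState(subStyle|activitySet);
        }
    }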
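
A minimal sketch of what the application side of such a hook might look like, assuming the application keeps its own copy of the text and a name-to-style table. `ClassifyRequest` and `SymbolClassifier` are hypothetical names for illustration only; they are not part of Scintilla.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the proposed request: the identifier occupies
// [startPos, endPos) in the document text.
struct ClassifyRequest {
    std::ptrdiff_t startPos;
    std::ptrdiff_t endPos;
};

// Hypothetical application-side classifier backed by a simple symbol table.
class SymbolClassifier {
public:
    explicit SymbolClassifier(std::string text) : text_(std::move(text)) {}

    void AddSymbol(const std::string &name, int style) { styles_[name] = style; }

    // Returns a style in [0, 0xff], or -1 when the identifier is unknown
    // or the requested range is invalid.
    int Classify(const ClassifyRequest &request) const {
        if (request.startPos < 0 || request.endPos <= request.startPos ||
            static_cast<std::size_t>(request.endPos) > text_.size()) {
            return -1;
        }
        const std::size_t start = static_cast<std::size_t>(request.startPos);
        const std::size_t length = static_cast<std::size_t>(request.endPos - request.startPos);
        const auto it = styles_.find(text_.substr(start, length));
        return it == styles_.end() ? -1 : it->second;
    }

private:
    std::string text_;
    std::unordered_map<std::string, int> styles_;
};
```

A real implementation would consult clangd, ctags or a similar source instead of a fixed table, which is where the latency concerns discussed in this thread come in.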
    
     
    • Marco Lazzarotto

      @zufuliu: This is what I meant :-)
      @nyamatongwe: I agree that LSP support should be placed in a separate, optional library. The work inside Scintilla could be:
      1) Commands (SCI_LSP_INIT*) to define, start and initialize the LSP server.
      2) Commands (SCI_LSP_SET*) to set up the desired handling of information from the server: for example, handling completion/documentation in Scintilla itself, or sending a notification to be handled by the application.
      3) Give the lexers an interface to improve coloring.

      To maintain editor speed and to handle the I/O with the server, I suppose the LSP client should run in a separate thread and communicate with Scintilla through an asynchronous interface.
      Since Scintilla (as far as I know) does not have a message loop of its own, this is my biggest doubt about implementing LSP inside Scintilla rather than entirely at the application level.

       
      • Neil Hodgson

        Neil Hodgson - 2019-12-11

        Since the application owns the event loop, it will be simpler to start implementation in a specific application then move functionality into Scintilla as it is understood. The scope of this feature is not defined and is difficult to define without experience.

         
    • Neil Hodgson

      Neil Hodgson - 2019-12-11

      Providing direct callbacks from lexing makes it more difficult to move lexing onto another thread. One of my long-term goals is to isolate lexing into a restrictive environment which can be easily run in a separate thread or process.

      With LSP, it's likely that classifying an identifier will require a quite slow cross-process call. Another approach may be to perform lexing as a streaming operation with multiple stages. The basic lexer stage determines most syntax including where identifiers occur, then a second stage patches identifiers into their classes with a batched LSP call.
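
The two-stage idea can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the "lexer" recognises nothing but identifiers, the style values are arbitrary, and `BatchClassifier` is a hypothetical callback standing in for a single batched request to a language server.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

constexpr unsigned char kStyleDefault = 0;     // illustrative value
constexpr unsigned char kStyleIdentifier = 11; // illustrative value

struct IdentifierRange {
    std::size_t start; // inclusive
    std::size_t end;   // exclusive
};

// Stage 1: give every identifier a base style and remember where it is.
std::vector<IdentifierRange> BaseLex(const std::string &text, std::vector<unsigned char> &styles) {
    styles.assign(text.size(), kStyleDefault);
    std::vector<IdentifierRange> found;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char ch = static_cast<unsigned char>(text[i]);
        if (std::isalpha(ch) || ch == '_') {
            const std::size_t start = i;
            while (i < text.size() &&
                   (std::isalnum(static_cast<unsigned char>(text[i])) || text[i] == '_')) {
                i++;
            }
            std::fill(styles.begin() + static_cast<std::ptrdiff_t>(start),
                      styles.begin() + static_cast<std::ptrdiff_t>(i), kStyleIdentifier);
            found.push_back({start, i});
        } else {
            i++;
        }
    }
    return found;
}

// Hypothetical batched call: one style (or -1 for "unknown") per name.
using BatchClassifier = std::function<std::vector<int>(const std::vector<std::string> &)>;

// Stage 2: classify all identifiers in one round trip and patch their styles.
void PatchIdentifiers(const std::string &text, const std::vector<IdentifierRange> &ranges,
                      const BatchClassifier &classify, std::vector<unsigned char> &styles) {
    std::vector<std::string> names;
    for (const IdentifierRange &r : ranges) {
        names.push_back(text.substr(r.start, r.end - r.start));
    }
    const std::vector<int> result = classify(names);
    for (std::size_t n = 0; n < ranges.size() && n < result.size(); n++) {
        if (result[n] >= 0) {
            std::fill(styles.begin() + static_cast<std::ptrdiff_t>(ranges[n].start),
                      styles.begin() + static_cast<std::ptrdiff_t>(ranges[n].end),
                      static_cast<unsigned char>(result[n]));
        }
    }
}
```

Because stage 1 never waits on the server, it stays suitable for running in an isolated thread or process; only stage 2 pays the cross-process cost, and it pays it once per batch.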

       
      • Marco Lazzarotto

        My primary goal is to get autocompletion; the other important one is to improve syntax highlighting.
        I am going to implement the basic protocol to let the server know about the edited state of the document; after that I hope to have more knowledge of the problems and possible solutions.

        I would like to change the lexer code as little as possible. The strategy I have in mind is:

        1. When the lexer hits an identifier that it does not know, it calls some function to queue a request to LSP.
        2. When the LSP handles the request and gets a positive answer, it calls some function of the lexer that chooses the right style and applies it to the document.
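
One way to picture the queue in this strategy is sketched here (C++17). `ClassificationQueue` and its methods are hypothetical names. The sketch also assumes that requests touching text at or after an edit position should be dropped, since an answer for them could be applied to the wrong characters.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <optional>
#include <string>

struct PendingIdentifier {
    int id;            // request id, echoed back with the answer
    std::size_t start; // document position of the identifier
    std::string name;
};

class ClassificationQueue {
public:
    // Step 1: the lexer queues an identifier it cannot classify.
    int Enqueue(std::size_t start, const std::string &name) {
        pending_.push_back({nextId_, start, name});
        return nextId_++;
    }

    // The document changed at 'position': drop requests that extend past it.
    void InvalidateFrom(std::size_t position) {
        pending_.erase(std::remove_if(pending_.begin(), pending_.end(),
                           [position](const PendingIdentifier &p) {
                               return p.start + p.name.size() > position;
                           }),
                       pending_.end());
    }

    // Step 2: an answer arrived; return the range to restyle if still valid.
    std::optional<PendingIdentifier> Resolve(int id) {
        const auto it = std::find_if(pending_.begin(), pending_.end(),
            [id](const PendingIdentifier &p) { return p.id == id; });
        if (it == pending_.end()) {
            return std::nullopt;
        }
        PendingIdentifier found = *it;
        pending_.erase(it);
        return found;
    }

    std::size_t Size() const { return pending_.size(); }

private:
    std::deque<PendingIdentifier> pending_;
    int nextId_ = 1;
};
```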
         

        Last edit: Marco Lazzarotto 2019-12-11
  • Zufu Liu

    Zufu Liu - 2019-12-10

    Asynchronous coloring could be implemented by adding a notification that reports the document range that has been styled once Scintilla's lexer has finished coloring (in LexInterface::Colourise). After receiving this notification, the application can query its LSP server, symbol database, etc. to find identifier information for this range, then call a new API to change each identifier's style.

    // notify application that document in [start, end] has finished coloring.
    void NotifyLexerFinished(Sci_Position start, Sci_Position end);
    // new API to change styles in range to specific style.
    void SetStyle(Sci_CharacterRange *range, int style);
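
On the application side, a handler for such a notification would need to remember which ranges have been lexed. A minimal sketch, assuming half-open ranges [start, end) rather than the closed range in the comment above; `LexedRanges` is a hypothetical helper, not an existing Scintilla class.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct Range {
    std::ptrdiff_t start; // inclusive
    std::ptrdiff_t end;   // exclusive
};

class LexedRanges {
public:
    // Record a range reported as lexed, merging it with any range it
    // overlaps or touches so the list stays short and sorted.
    void Add(Range added) {
        std::vector<Range> merged;
        for (const Range &r : ranges_) {
            if (r.end < added.start || r.start > added.end) {
                merged.push_back(r); // disjoint: keep as-is
            } else {
                added.start = std::min(added.start, r.start);
                added.end = std::max(added.end, r.end);
            }
        }
        merged.push_back(added);
        std::sort(merged.begin(), merged.end(),
                  [](const Range &a, const Range &b) { return a.start < b.start; });
        ranges_ = std::move(merged);
    }

    // True when the whole query range has already been lexed.
    bool Covers(Range query) const {
        return std::any_of(ranges_.begin(), ranges_.end(), [query](const Range &r) {
            return r.start <= query.start && query.end <= r.end;
        });
    }

    std::size_t Count() const { return ranges_.size(); }

private:
    std::vector<Range> ranges_;
};
```

The application could check `Covers` before patching a symbol's style, and defer the patch until a later notification when the symbol lies outside the lexed ranges.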
    
     
    • Neil Hodgson

      Neil Hodgson - 2019-12-11

      Scintilla's single 'layer' of styling could be generalised into multiple layers each with an independent endStyled and corresponding styling update mechanism for each layer.

      This would be useful, for example, to perform spell check highlighting (and other linting tasks) over the document. Some layers may be dependent on others with, for example, spell checking having different rules for identifiers and comments.

      One way of handling post-lexer identifier classification is to change the style bytes for the identifier ranges, another technique is to add indicators over the ranges.
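
To make the layering idea concrete, here is a rough sketch in which each layer owns its style bytes and its own `endStyled` marker. It is a thought experiment only: the document length is fixed, style 0 is treated as "no opinion", and `LayeredStyles` is a hypothetical class that does not reflect how Scintilla stores styles.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct StyleLayer {
    std::vector<unsigned char> styles;
    std::size_t endStyled = 0; // text before this position is up to date
};

class LayeredStyles {
public:
    LayeredStyles(std::size_t layerCount, std::size_t length) : layers_(layerCount) {
        for (StyleLayer &layer : layers_) {
            layer.styles.assign(length, 0);
        }
    }

    // Style [start, end) on one layer; endStyled only advances when the
    // new range is contiguous with what was already styled.
    void SetStyles(std::size_t layer, std::size_t start, std::size_t end, unsigned char style) {
        StyleLayer &l = layers_.at(layer);
        end = std::min(end, l.styles.size());
        if (start >= end) {
            return;
        }
        std::fill(l.styles.begin() + static_cast<std::ptrdiff_t>(start),
                  l.styles.begin() + static_cast<std::ptrdiff_t>(end), style);
        if (start <= l.endStyled) {
            l.endStyled = std::max(l.endStyled, end);
        }
    }

    // A change at 'position' makes every layer stale from that point on.
    void InvalidateFrom(std::size_t position) {
        for (StyleLayer &l : layers_) {
            l.endStyled = std::min(l.endStyled, position);
        }
    }

    std::size_t EndStyled(std::size_t layer) const { return layers_.at(layer).endStyled; }

    // The highest layer with a non-zero style wins.
    unsigned char StyleAt(std::size_t position) const {
        for (auto it = layers_.rbegin(); it != layers_.rend(); ++it) {
            if (it->styles.at(position) != 0) {
                return it->styles.at(position);
            }
        }
        return 0;
    }

private:
    std::vector<StyleLayer> layers_;
};
```

Each layer can then be restyled on its own schedule: the lexer layer synchronously, a semantic or spell-check layer whenever its slower source responds.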

       
  • Zufu Liu

    Zufu Liu - 2019-12-11

    Based on LSP's Document Symbols Request, only one call is needed to get all symbol info for the current document.
    The procedure might be:

    1. The application calls the LSP server to get all symbol info before or after loading the document into Scintilla.
    2. On the SCN_MODIFIED notification, the application calls the LSP server again to update the symbols for the current document.
    3. After the application gets symbols from the LSP server, it waits for the base lexer (e.g. LexCPP) to finish lexing, or checks whether there are ranges that have already been lexed. For these finished ranges, the application patches the symbols in them to their expected styles by calling SetStyle.
    4. In LexerCPP::Lex(), after calling sc.Complete(), call NotifyLexerFinished(startPos, startPos + length) to notify the application that the current range has finished lexing. The application records/merges this range in its notification handler, then applies style patches if applicable. Because of possible backtracking, NotifyLexerFinished needs to be called inside each individual lexer (this can be improved by keeping the initial position for LexAccessor::StartAt and then moving the call into StyleContext::Complete and LexerSimple::Lex, or by adding a method to LexAccessor).
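
For steps 1 and 2, the request itself is small. The sketch below builds a `textDocument/documentSymbol` request with the `Content-Length` framing of the LSP base protocol. The helper names are hypothetical, and the URI is assumed to need no JSON escaping.

```cpp
#include <string>

// Wrap a JSON body in the LSP base-protocol header.
std::string FrameMessage(const std::string &json) {
    return "Content-Length: " + std::to_string(json.size()) + "\r\n\r\n" + json;
}

// Build a framed textDocument/documentSymbol request.
// Assumption: 'uri' contains no characters that need JSON escaping.
std::string DocumentSymbolRequest(int id, const std::string &uri) {
    const std::string body =
        "{\"jsonrpc\":\"2.0\",\"id\":" + std::to_string(id) +
        ",\"method\":\"textDocument/documentSymbol\""
        ",\"params\":{\"textDocument\":{\"uri\":\"" + uri + "\"}}}";
    return FrameMessage(body);
}
```

For example, `DocumentSymbolRequest(1, "file:///a.cpp")` yields the bytes to write to the server's stdin; the response arrives later with the same `id`.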
     
  • Marco Lazzarotto

    I am working on it, with some interruptions due to other work; in the days after Christmas I would like to publish some preliminary work.
    Where is the best place to do it?

     
    • Neil Hodgson

      Neil Hodgson - 2019-12-23

      It may be possible to 'add attachment' on this issue with an archive or patch of the changes but you may not be permissioned for that. Try it and see.

      You could publish a patch on a web site or source code control server somewhere and show its URL here.

       
      • Marco Lazzarotto

        Here it is:
        https://gitlab.com/marcolazzarotto/scintilla-lsp
        I am used to programming with Qt; there is some work to do on other platforms to implement the class Process (launch an external process and communicate with it through stdin/stdout).

         
  • Neil Hodgson

    Neil Hodgson - 2020-01-02

    Thanks for the implementation - it is quite interesting.

    The automatic completion behaves reasonably when there is good contextual information, like a visible struct definition when typing '.' after a variable name.

    I suppose the use of annotations for warnings is just a demonstration and this will eventually flow through to the application. It's a bit noisy when typing code, particularly the 3 line warnings. There also appears to be a limit of 20 warnings but I couldn't find a literal 20 in the code so don't know where that is coming from. For large documents, there could be performance issues with many warnings so there may be a need to have the application decide how to pace requests.

    SC_MOD_INSERTTEXT occurs in many contexts and will occur multiple times when there are multiple selections. The SCN_CHARADDED notification is more commonly used for automatic completion.

    It would help here to send SCNotification messages (including SCN_CHARADDED) to the LSP code instead of using DocWatcher. This would also help isolate the LSP code from Scintilla. I feel that the implementation is currently too tightly enmeshed inside Scintilla and a looser coupling would be better.
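
A loosely coupled client along these lines might look like the sketch below. The numeric values mirror `SCN_CHARADDED`, `SCN_MODIFIED`, `SC_MOD_INSERTTEXT` and `SC_MOD_DELETETEXT` from Scintilla.h; `NotificationSketch` is a cut-down stand-in for `SCNotification`, and the set of trigger characters is an assumption for illustration.

```cpp
#include <functional>
#include <utility>

constexpr unsigned int kScnCharAdded = 2001; // SCN_CHARADDED
constexpr unsigned int kScnModified = 2008;  // SCN_MODIFIED
constexpr int kModInsertText = 0x1;          // SC_MOD_INSERTTEXT
constexpr int kModDeleteText = 0x2;          // SC_MOD_DELETETEXT

// Cut-down stand-in for SCNotification with only the fields used here.
struct NotificationSketch {
    unsigned int code;
    int ch;
    int modificationType;
};

class LspClientSketch {
public:
    explicit LspClientSketch(std::function<void()> requestCompletion)
        : requestCompletion_(std::move(requestCompletion)) {}

    // Called with each notification; needs no access to Scintilla internals.
    void OnNotification(const NotificationSketch &n) {
        if (n.code == kScnModified) {
            // Text edits bump the version used for document sync.
            if (n.modificationType & (kModInsertText | kModDeleteText)) {
                documentVersion_++;
            }
        } else if (n.code == kScnCharAdded && IsTrigger(n.ch)) {
            // Typed characters, not generic insertions, drive completion.
            requestCompletion_();
        }
    }

    int DocumentVersion() const { return documentVersion_; }

private:
    // Assumed trigger characters; a server normally advertises its own.
    static bool IsTrigger(int ch) { return ch == '.' || ch == '>' || ch == ':'; }

    std::function<void()> requestCompletion_;
    int documentVersion_ = 0;
};
```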

     
    • Marco Lazzarotto

      Yes, I used annotations just to see the results. The server configuration can be improved to avoid the 3-line warnings (capabilities/textDocument/publishDiagnostics/relatedInformation = true), and visualization should be delegated to the application.
      I will also improve the way the LSP server is configured.

      I added support for signature help and plan to add support for textDocument/semanticHighlighting: if well supported by the server, this could make a lexer almost useless.

      To decouple LSP from Scintilla, I plan to write a client function to be called from NotifyParent() like this:

      void ScintillaQt::NotifyParent(SCNotification scn)
      {
          [...]
          lspClient->onNotification(scn);
      }
      

      I will do it in the next few days.

       
      • Neil Hodgson

        Neil Hodgson - 2020-01-08

        I think it would be better to call onNotification from the application as that allows the application to control the process.

         
  • Neil Hodgson

    Neil Hodgson - 2020-01-24

    Here are some modifications to make the inter-process communications work with Win32 from SciTE. The patch uses polling and can be attached to SciTE's on-idle polling code. This isn't optimal and should likely be replaced with IO completion ports or a separate thread for LSP communication. However, it works well enough for testing.

    In LSP.patch, a PolIO call is made by the application and travels down through multiple layers to where the subprocess IO handles are available.

    In ProcessWin.patch, the IO handles are either read or peeked as required by reading code.
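
With polling, each read may deliver only part of a message, so the reading code has to buffer bytes until a complete frame is available. A platform-independent sketch of that buffering (C++17), assuming the standard LSP `Content-Length` header; `MessageReader` is a hypothetical name and is not taken from either patch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <optional>
#include <string>

class MessageReader {
public:
    // Append whatever bytes the latest poll returned (possibly partial).
    void Feed(const std::string &bytes) { buffer_ += bytes; }

    // Return the next complete JSON body, or nothing if more bytes are needed.
    std::optional<std::string> Next() {
        const std::size_t headerEnd = buffer_.find("\r\n\r\n");
        if (headerEnd == std::string::npos) {
            return std::nullopt; // header not complete yet
        }
        const std::string key = "Content-Length: ";
        const std::size_t keyPos = buffer_.find(key);
        if (keyPos == std::string::npos || keyPos > headerEnd) {
            buffer_.erase(0, headerEnd + 4); // malformed header: discard it
            return std::nullopt;
        }
        // strtoul stops at the '\r' that ends the header line.
        const std::size_t length =
            std::strtoul(buffer_.c_str() + keyPos + key.size(), nullptr, 10);
        const std::size_t bodyStart = headerEnd + 4;
        if (buffer_.size() < bodyStart + length) {
            return std::nullopt; // body not complete yet
        }
        std::string body = buffer_.substr(bodyStart, length);
        buffer_.erase(0, bodyStart + length);
        return body;
    }

private:
    std::string buffer_;
};
```

An on-idle handler would call `Feed` with the bytes it managed to read and then loop on `Next` until it returns nothing.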

     
    • Marco Lazzarotto

      I applied the patch to the code, and pushed some new work on gitlab.
      The LSP code now doesn't modify the core Scintilla: work in Qt starts from scintilla/qt/ScintillaEditBase/ScintillaEditBase.h

       
      • Neil Hodgson

        Neil Hodgson - 2020-02-08

        The semantic highlighting is interesting but seems to go wrong sometimes, particularly on lines without any identifiers. The decoding appears OK, and the highlights are good the first time an example is pasted, so I think the most likely problem is that the server's copy of the text gets out-of-sync.
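
One way to guard against this kind of drift is to version every change and discard results computed against older text. The sketch below assumes full-text synchronisation (each `textDocument/didChange` carries the whole document) and assumes highlighting results identify the document version they were computed for; `SyncedDocument` and `EscapeJson` are hypothetical helpers.

```cpp
#include <cstdio>
#include <string>
#include <utility>

// Escape text for use inside a JSON string literal.
std::string EscapeJson(const std::string &s) {
    std::string out;
    for (const char c : s) {
        switch (c) {
        case '"': out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        default:
            if (static_cast<unsigned char>(c) < 0x20) {
                char buf[8];
                std::snprintf(buf, sizeof buf, "\\u%04x",
                              static_cast<unsigned>(static_cast<unsigned char>(c)));
                out += buf;
            } else {
                out += c;
            }
        }
    }
    return out;
}

class SyncedDocument {
public:
    explicit SyncedDocument(std::string uri) : uri_(std::move(uri)) {}

    // Build the JSON body of a full-text didChange notification and
    // advance the version the server will associate with this text.
    std::string DidChange(const std::string &fullText) {
        ++version_;
        return "{\"jsonrpc\":\"2.0\",\"method\":\"textDocument/didChange\",\"params\":{"
               "\"textDocument\":{\"uri\":\"" + EscapeJson(uri_) +
               "\",\"version\":" + std::to_string(version_) + "},"
               "\"contentChanges\":[{\"text\":\"" + EscapeJson(fullText) + "\"}]}}";
    }

    // Highlights computed for any other version should be ignored.
    bool IsCurrent(int resultVersion) const { return resultVersion == version_; }

private:
    std::string uri_;
    int version_ = 0;
};
```

Sending the full text on every edit is wasteful for large files, but it removes incremental-range arithmetic as a source of mismatch while debugging.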

        For lexers that support sub-styles like cpp and python, semantic highlighting could be integrated quite tidily by allocating a substyle of identifier for each of the scopes returned at startup as they are all types of identifier and are currently styled as identifier.
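
The allocation step could be sketched like this. The `allocate` callback stands in for sending `SCI_ALLOCATESUBSTYLES(styleBase, numberStyles)`, which returns the first substyle of the allocated block; treating a negative result as failure is an assumption, and the scope names in the usage note are only illustrative.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Stand-in for SCI_ALLOCATESUBSTYLES: returns the first style number of a
// contiguous block of 'count' substyles (assumed negative on failure).
using SubStyleAllocator = std::function<int(int styleBase, int count)>;

// Give each scope reported by the server its own substyle of identifier.
std::map<std::string, int> MapScopesToSubStyles(const std::vector<std::string> &scopes,
                                                int identifierStyle,
                                                const SubStyleAllocator &allocate) {
    std::map<std::string, int> styleForScope;
    if (scopes.empty()) {
        return styleForScope;
    }
    const int first = allocate(identifierStyle, static_cast<int>(scopes.size()));
    if (first < 0) {
        return styleForScope; // no room: leave identifiers in the base style
    }
    for (std::size_t i = 0; i < scopes.size(); i++) {
        styleForScope[scopes[i]] = first + static_cast<int>(i);
    }
    return styleForScope;
}
```

With scopes such as `entity.name.function.cpp` and `variable.other.cpp`, the resulting map gives the style to apply to each highlighted token, and the application can then set colours per substyle.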

        Here is an example C++ file that exercises each of the scopes available from clang:

        #include <stdio.h>
        
        namespace Ted {
        
        enum class Destiny { one, two };
        
        }
        
        template <class TyleEntity>
        class Tyle {
                TyleEntity unity;
                TyleEntity Returnee() {
                        return unity;
                }
        };
        
        int f() {
                Ted::Destiny ddd = Ted::Destiny::one;
                Tyle<char> tr3;
        
                // Read from x.txt
                FILE *fp = fopen("x.txt", "rb");
                char x[100];
                fread(x, 100, 1, fp);
                fclose(fp);
                return 0;
        }
        
         
        • Marco Lazzarotto

          Thank you for the feedback!

          I am now working on integrating LSP into our application; I will investigate the semantic highlighting as soon as possible.
          I found some little bugs in Qt Scintilla implementation, would you like that I send some patches?

          I found a signal name in qt/ScintillaEditBase/ScintillaEditBase.{h, cpp} that is the same as a function in qt/ScintillaEdit/ScintillaEdit.{h, cpp}

          Marco Lazzarotto

           

          Last edit: Marco Lazzarotto 2020-03-10
          • Neil Hodgson

            Neil Hodgson - 2020-02-12

            I found some little bugs in Qt Scintilla implementation, would you like that I send some patches?

            When there is a problem then open a new issue on the bug tracker and attach a patch if you have one.

            I found a signal name in qt/ScintillaEditBase/ScintillaEditBase.{h, cpp} that is the same as a function in qt/ScintillaEdit/ScintillaEdit.{h, cpp}

            While the name 'zoom' is the same, there is an argument to the signal so the signatures are different and it doesn't appear to cause a failure. Are you seeing problems because of this clash?

             
            • Marco Lazzarotto

              I had problems with connecting the signal to a slot, but specifying the right class (ScintillaEditBase instead of ScintillaEdit) all works…
              I reverted the change 😊

               

              Last edit: Marco Lazzarotto 2020-03-10
              • Neil Hodgson

                Neil Hodgson - 2020-02-14

                I had problems with connecting the signal to a slot, but specifying the right class (ScintillaEditBase instead of ScintillaEdit) all works…

                Good. Changing the name would have been a break in compatibility with client code.

                The alternative of changing the ScintillaEdit::zoom call is also tricky as it's automatically generated. There'd probably have to be an exceptions list of APIs that are exposed with different names (SCI_GETZOOM -> getZoom instead of zoom).
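
An exceptions list of this kind amounts to a lookup that overrides the default naming rule. The sketch below is purely illustrative: the default rule shown (drop a leading "Get", lower-case the first letter) is an assumption inferred from the zoom example and is not taken from the actual generator.

```cpp
#include <cctype>
#include <map>
#include <string>

// Assumed default rule: "GetZoom" -> "zoom", "SetZoom" -> "setZoom".
std::string DefaultMethodName(const std::string &feature) {
    std::string name = (feature.size() > 3 && feature.rfind("Get", 0) == 0)
                           ? feature.substr(3)
                           : feature;
    if (!name.empty()) {
        name[0] = static_cast<char>(std::tolower(static_cast<unsigned char>(name[0])));
    }
    return name;
}

// Names in the exceptions list win over the default rule.
std::string MethodName(const std::string &feature,
                       const std::map<std::string, std::string> &exceptions) {
    const auto it = exceptions.find(feature);
    return it != exceptions.end() ? it->second : DefaultMethodName(feature);
}
```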

                 