
#932 Visual Prolog lexer

Completed
closed
Scintilla (358)
5
2012-06-01
2012-05-02
No

I have created a Visual Prolog (http://www.visual-prolog.com) lexer.

Discussion

  • Thomas Linder Puls

    Scintilla.iface will need a language number (120 should of course be the next in line):

    val SCLEX_VISUALPROLOG=120

    And some lexical states (I do not know what the "lex" line means, so the one I have written may be wrong):

    # Lexical states for SCLEX_VISUALPROLOG
    lex VisualProlog=SCLEX_VISUALPROLOG SCE_VISUALPROLOG_
    val SCE_VISUALPROLOG_DEFAULT=0
    val SCE_VISUALPROLOG_KEYMAJOR=1
    val SCE_VISUALPROLOG_KEYMINOR=2
    val SCE_VISUALPROLOG_KEYDIRECTIVE=3
    val SCE_VISUALPROLOG_COMMENT=4
    val SCE_VISUALPROLOG_COMMENT2=5
    val SCE_VISUALPROLOG_COMMENT3=6
    val SCE_VISUALPROLOG_COMMENTLINE=7
    val SCE_VISUALPROLOG_COMMENTKEY=8
    val SCE_VISUALPROLOG_COMMENTKEYERROR=9
    val SCE_VISUALPROLOG_IDENTIFIER=10
    val SCE_VISUALPROLOG_VARIABLE=11
    val SCE_VISUALPROLOG_ANONYMOUS=12
    val SCE_VISUALPROLOG_NUMBER=13
    val SCE_VISUALPROLOG_STRING=14
    val SCE_VISUALPROLOG_STRINGVERBATIM=15
    val SCE_VISUALPROLOG_STRINGEOLOPEN=16
    val SCE_VISUALPROLOG_CHARACTER=17
    val SCE_VISUALPROLOG_OPERATOR=18
    val SCE_VISUALPROLOG_STRINGESCAPE=19
    val SCE_VISUALPROLOG_STRINGESCAPEERROR=20
    val SCE_VISUALPROLOG_STRINGVERBATIMSPECIAL=21
    val SCE_VISUALPROLOG_STRINGVERBATIMEOL=22

     
  • Eric Promislow

    Eric Promislow - 2012-05-03

    Grrr. I had written a pile of notes that got lost to the browser textarea bitpile.

    Short form then:

    1. Why not a Prolog lexer with Visual Prolog extensions? From what I can gather, most of the differences
    would be in the set of keywords, and that's the responsibility of the host editor.

    2. I would use underscores to separate words in the state names. The older names in Scintilla.iface don't,
    but many of the newer ones do. UNSEPARATEDCAPITALIZEDWORDSARANNOYINGTOREAD.

    3. From looking at the VP web page, Verbatim strings have to end on the same line. If they do, then
    STRINGVERBATIMEOL should end at line-start. If they don't, there's no need for STRINGVERBATIMEOL

    4. I would rename ..._COMMENT to ..._COMMENTBLOCK to better differentiate it from COMMENTLINE.

    5. What's with the special case handling of "/*" in a comment-block? Is this to color instances of "/*" in
    block comments? Why bother? The C lexer doesn't. In any case, the reason for COMMENT2 and
    COMMENT3 isn't clear.

    6. Character recognition looks wrong. Shouldn't it require either a \ucccc \c or c, and complain if more
    than one character value is surrounded by single-quotes?
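    The check described in point 6 could be sketched roughly as below. This is only an illustration: the enum, the function name, and the assumption that the only escapes are \uXXXX (four hex digits) and a backslash followed by one character are all hypothetical and not taken from the submitted lexer or from the Visual Prolog language definition.

```cpp
#include <cctype>
#include <string>

// Hypothetical result categories for a single-quoted character literal.
enum class CharLiteral { Ok, Empty, TooMany, EscapeError, Unterminated };

// Classify a complete literal such as 'a', '\n' or '\u0041'.
// Assumptions: one byte per character, and the only escape forms are
// \uXXXX (four hex digits) or a backslash followed by a single character.
// Which characters are legal after the backslash is not checked here.
CharLiteral ClassifyCharLiteral(const std::string &text) {
    if (text.size() < 2 || text.front() != '\'' || text.back() != '\'')
        return CharLiteral::Unterminated;
    const std::string body = text.substr(1, text.size() - 2);
    if (body.empty())
        return CharLiteral::Empty;
    if (body[0] != '\\')
        return body.size() == 1 ? CharLiteral::Ok : CharLiteral::TooMany;
    if (body.size() == 1)
        return CharLiteral::Unterminated;  // the closing quote was itself escaped
    if (body[1] == 'u') {
        if (body.size() < 6)
            return CharLiteral::EscapeError;
        for (std::string::size_type i = 2; i < 6; ++i) {
            if (!std::isxdigit(static_cast<unsigned char>(body[i])))
                return CharLiteral::EscapeError;
        }
        return body.size() == 6 ? CharLiteral::Ok : CharLiteral::TooMany;
    }
    return body.size() == 2 ? CharLiteral::Ok : CharLiteral::TooMany;
}
```

    A real lexer would make the same decisions incrementally while styling, rather than on a complete string, but the cases to distinguish are the same.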

     
  • Neil Hodgson

    Neil Hodgson - 2012-05-03

    There are some unnecessary elements: GetRestOfLine, After, and stylePrev are defined but not used.

     
  • Neil Hodgson

    Neil Hodgson - 2012-05-03
    • assigned_to: nobody --> nyamatongwe
    • labels: --> Scintilla
     
  • Neil Hodgson

    Neil Hodgson - 2012-05-03

    The purpose of the 'lex' statement is to make it easier for applications to derive features like a style-setting dialog: the first identifier 'VisualProlog' is a more readable name to use in the UI, the second (SCLEX_VISUALPROLOG) ties this to a particular lexer and the third (SCE_VISUALPROLOG_) is a prefix used to find all the styles used by the lexer.
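    As a rough sketch of how an application might consume this, the code below splits a 'lex' line into its three parts and then collects every 'val' name that starts with the style prefix. The struct and function names are hypothetical, and the parsing assumes the simple whitespace-separated layout shown in this thread; it is not Scintilla's own tooling.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the three parts of a "lex" statement.
struct LexDecl {
    std::string uiName;       // readable name for the UI, e.g. "VisualProlog"
    std::string lexerId;      // e.g. "SCLEX_VISUALPROLOG"
    std::string stylePrefix;  // e.g. "SCE_VISUALPROLOG_"
};

// Parse a line of the form "lex Name=SCLEX_ID PREFIX_".
// Returns false if the line is not a well-formed lex statement.
bool ParseLexLine(const std::string &line, LexDecl &out) {
    std::istringstream in(line);
    std::string keyword, assignment, prefix;
    if (!(in >> keyword >> assignment >> prefix) || keyword != "lex")
        return false;
    const std::string::size_type eq = assignment.find('=');
    if (eq == std::string::npos)
        return false;
    out.uiName = assignment.substr(0, eq);
    out.lexerId = assignment.substr(eq + 1);
    out.stylePrefix = prefix;
    return true;
}

// Collect the names from "val NAME=number" lines whose NAME starts with prefix.
std::vector<std::string> StylesWithPrefix(const std::vector<std::string> &lines,
                                          const std::string &prefix) {
    std::vector<std::string> names;
    for (const std::string &line : lines) {
        std::istringstream in(line);
        std::string keyword, assignment;
        if (!(in >> keyword >> assignment) || keyword != "val")
            continue;
        const std::string name = assignment.substr(0, assignment.find('='));
        if (name.compare(0, prefix.size(), prefix) == 0)
            names.push_back(name);
    }
    return names;
}
```

    A style-setting dialog could then show uiName as the language label and list one entry per collected style name.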

     
  • Thomas Linder Puls

    Thank you for looking into the details of this.

    I am not sure how to spot unused elements (using Visual Studio), but there is of course no reason to let unused elements float around. And thank you for the 'lex' explanation. It has no influence on our own usage of this lexer.

    Regarding "ericp" issues:

    1. My concern is Visual Prolog, as we are going to use Scintilla in our IDE. I guess that by and large this lexer can be used for ISO/Edinburgh Prolog as well if you refrain from defining any major and minor keywords at all.

    2. I have no objection to adding underscores; I just used the style that seems to be used in the file already.

    3. Verbatim string literals can span several lines, where the line breaks are treated literally, and we would therefore like to change the background color of the line breaks.

    Perhaps you can point me to the place where you have seen that they had to terminate on the same line, so that we can correct and/or clarify it.

    4. It makes no difference to me whether it is _COMMENT or _COMMENTBLOCK or COMMENT_BLOCK or _BLOCK_COMMENT for that matter.

    5. Visual Prolog allows nesting of comments (which I actually also think that C/C++ does), but the C lexer does not handle this very well (i.e. at all). This lexer handles it correctly with up to two levels of nesting. We do not really need different styles, but we need different states in the lexer state machine to handle this.

    6. I guess you are right, but it is not really that important since characters are only used very rarely in Visual Prolog. Perhaps someday I will improve it in this respect.

     
  • Thomas Linder Puls

    By the way should I put the lists of major, minor and doc keywords anywhere?

    I find it a bit strange that keywords are not preset in the lexers; why should everyone have to deal with that?

    Likewise I find it a bit strange that styles (i.e. token colors) are not preset.

    You can always choose to override if you think it should be different.

     
  • Neil Hodgson

    Neil Hodgson - 2012-05-06

    Visual C++ will find some unused elements on warning level 4 which is the level used in the make files distributed with Scintilla.

    Another line that needs changing is that the destructor for LexerVisualProlog should be virtual. This was a recent change: http://scintilla.hg.sourceforge.net/hgweb/scintilla/scintilla/rev/8156d1e9e74b

    If the language is the same or quite similar syntactically as other Prolog implementations apart from keywords then it is more likely that it will receive attention, bug reports and fixes if it is aimed at all versions of Prolog and named Prolog instead of Visual Prolog. Specialization diffuses effort which is a contributor to why the C++/C/Java/C#/... lexer is so much better than the 6 lexers for Basic variants.

    Standard C/C++ does not allow nesting comments although some implementations do. The Lua lexer implements nested comments and strings using line state.
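    The line-state idea can be sketched independently of Scintilla as follows: each line is scanned starting from the nesting depth left by the previous line, and the depth at the end of the line is what gets stored as that line's state. The function name is hypothetical, and the sketch deliberately ignores strings and line comments, which a real lexer would have to skip. In an actual lexer the depth would presumably be stored and fetched through the accessor's SetLineState and GetLineState calls.

```cpp
#include <string>

// Scan one line of text, starting at the block-comment nesting depth that the
// previous line ended with, and return the depth at the end of this line.
// A depth of 0 means "outside any block comment".
// Simplification: "/*" and "*/" inside strings or line comments are not skipped.
int CommentDepthAfterLine(const std::string &line, int depthAtStart) {
    int depth = depthAtStart;
    for (std::string::size_type i = 0; i + 1 < line.size(); ++i) {
        if (line[i] == '/' && line[i + 1] == '*') {
            ++depth;  // opening delimiter: one level deeper
            ++i;      // consume the second character of the delimiter
        } else if (line[i] == '*' && line[i + 1] == '/' && depth > 0) {
            --depth;  // closing delimiter: one level shallower
            ++i;
        }
    }
    return depth;
}
```

    Because the depth is an integer rather than a fixed set of lexer states, the number of nesting levels is not limited to two.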

    Keywords commonly differ between uses of a lexer in several cases. Multiple languages often reuse a lexer. Different versions of a language add and remove keywords. Users may have different ideas of keywords: True and False are boolean literals in Python 2, not keywords, but users sometimes want them treated as keywords. While including a base set of keywords inside the lexer may be of some benefit, it's likely to be inadequate for many users and increase churn and maintenance effort.

    Colours differ widely between applications and users. Symbolic data like defining that SCE_VISUALPROLOG_COMMENTLINE is a comment would help applications choose default colours in line with their themes but that idea didn't gain traction.

     
  • Thomas Linder Puls

    Thank you for the input. I will spend some time on updating according to your suggestions and then I will send the updated lexer.

    I was considering a line state to deal with the comment nesting, but I didn't think there was something in that direction. I will look at the Lua lexer to see how it is done there.

    I fully appreciate that people may want to use different colors and keywords (especially for their favorite languages). But I can't see why that should prevent having (non-empty) defaults, so that a lexer can be used effortlessly in a default mode.

     
  • Neil Hodgson

    Neil Hodgson - 2012-05-07

    If you want to push changes to embed colours and keywords in lexer source then propose that on the mailing list and try to find a consensus on the features and API.

    If there's a good implementation and some traction, I won't block it.

     
  • Thomas Linder Puls

    I have updated the Visual Prolog lexer following the suggestions here.

    I have also considered whether it is a good or bad idea to use it for ISO/Edinburgh Prolog as well. Some things are in common, but I believe that there are substantial/essential differences for single quotes (which are used for different purposes), string literals and comments. I therefore think it is wiser to say that this lexer is for Visual Prolog, but that it may produce a relatively good result for sound ISO Prolog if colors are set suitably and keywords (of which there are none in ISO Prolog) are "null".

     
  • Thomas Linder Puls

    Once again I forgot the scintilla.iface:

    # Lexical states for SCLEX_VISUALPROLOG
    lex VISUALPROLOG=SCLEX_VISUALPROLOG SCE_VISUALPROLOG_
    val SCE_VISUALPROLOG_DEFAULT=0
    val SCE_VISUALPROLOG_KEY_MAJOR=1
    val SCE_VISUALPROLOG_KEY_MINOR=2
    val SCE_VISUALPROLOG_KEY_DIRECTIVE=3
    val SCE_VISUALPROLOG_COMMENT_BLOCK=4
    val SCE_VISUALPROLOG_COMMENT_LINE=5
    val SCE_VISUALPROLOG_COMMENT_KEY=6
    val SCE_VISUALPROLOG_COMMENT_KEY_ERROR=7
    val SCE_VISUALPROLOG_IDENTIFIER=8
    val SCE_VISUALPROLOG_VARIABLE=9
    val SCE_VISUALPROLOG_ANONYMOUS=10
    val SCE_VISUALPROLOG_NUMBER=11
    val SCE_VISUALPROLOG_OPERATOR=12
    val SCE_VISUALPROLOG_CHARACTER=13
    val SCE_VISUALPROLOG_CHARACTER_TOO_MANY=14
    val SCE_VISUALPROLOG_CHARACTER_ESCAPE_ERROR=15
    val SCE_VISUALPROLOG_STRING=16
    val SCE_VISUALPROLOG_STRING_ESCAPE=17
    val SCE_VISUALPROLOG_STRING_ESCAPE_ERROR=18
    val SCE_VISUALPROLOG_STRING_EOL_OPEN=19
    val SCE_VISUALPROLOG_STRING_VERBATIM=20
    val SCE_VISUALPROLOG_STRING_VERBATIM_SPECIAL=21
    val SCE_VISUALPROLOG_STRING_VERBATIM_EOL=22

     
  • Neil Hodgson

    Neil Hodgson - 2012-05-09

    The destructor ~LexerVisualProlog() should be virtual.
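    As general C++ background, and not a description of Scintilla's own classes, the sketch below shows what a virtual destructor guarantees: when an object is destroyed through a pointer to its base class, the derived destructor still runs. BaseLexer, DerivedLexer and DestroyThroughBase are hypothetical names used only for this illustration.

```cpp
#include <memory>

// Hypothetical stand-ins, not Scintilla's real interfaces.
struct BaseLexer {
    virtual ~BaseLexer() {}  // virtual, so deletion through BaseLexer* is safe
    virtual void Lex() = 0;
};

struct DerivedLexer : BaseLexer {
    explicit DerivedLexer(int &destroyedCount) : destroyed(destroyedCount) {}
    virtual ~DerivedLexer() { ++destroyed; }  // records that cleanup happened
    void Lex() override {}
    int &destroyed;
};

// Create a DerivedLexer, destroy it through a base-class pointer, and report
// how many times the derived destructor ran.
int DestroyThroughBase() {
    int destroyed = 0;
    {
        std::unique_ptr<BaseLexer> lexer(new DerivedLexer(destroyed));
        lexer->Lex();
    }  // lexer goes out of scope here; delete is called on a BaseLexer*
    return destroyed;
}
```

    Without the virtual keyword on the base destructor, deleting through the base pointer would be undefined behaviour, and compilers commonly warn about classes that have virtual functions but a non-virtual destructor.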

    The name of the lexer in the lex statement should be mixed case as in your first version "VisualProlog" as this is meant to be used for UI elements. For consistency with other lexers, the name in the LexerModule object should be all lower-case "visualprolog".

     
  • Thomas Linder Puls

    OK, I think I got it right now (the file is updated)

    lex VisualProlog=SCLEX_VISUALPROLOG SCE_VISUALPROLOG_

     
  • Neil Hodgson

    Neil Hodgson - 2012-05-09
    • milestone: --> Completed
     
  • Neil Hodgson

    Neil Hodgson - 2012-05-09

    Committed.

     
  • Thomas Linder Puls

    Excellent, thank you.

     
  • Neil Hodgson

    Neil Hodgson - 2012-06-01
    • status: open --> closed
     
