#1555 Bugs in REBOL lexer code

Bug
open
nobody
None
5
2013-12-01
2013-11-21
Cyphre
No

I have found several bugs/issues when using the REBOL script colorizing:

  1. Colorizing fails when the script has no REBOL header, which is a valid kind of script. The lexer should colorize scripts without a REBOL header.
  2. Square-bracket colorizing fails on the character sequence [#" "]. However, #" " is the valid notation for a space of the char! datatype in REBOL.
  3. Square-bracket colorizing fails on the character sequences:
    [#"^^"]
    and
    ["^^"]
    Both are valid REBOL expressions. The first is an escaped ^ as a char! value. The second is an escaped ^ character inside a string.
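
For illustration only (this is not taken from the attached lexer file): the sequences in items 2 and 3 suggest that a scanner has to treat the caret as an escape that consumes the following character, so that neither ^^ nor ^" ends the literal early. A minimal standalone sketch, assuming the text is held in a std::string; the function name FindClosingQuote is hypothetical and it ignores REBOL rules such as strings ending at a line break:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Scintilla.
// Returns the index of the closing quote of a REBOL "..." literal that opens
// at text[open], or std::string::npos if the literal is unterminated.
// A caret (^) escapes the character after it, so ^^ is a literal caret and
// ^" is a literal quote; neither may terminate the literal.
std::size_t FindClosingQuote(const std::string &text, std::size_t open) {
    assert(open < text.size() && text[open] == '"');
    for (std::size_t i = open + 1; i < text.size(); ++i) {
        if (text[i] == '^') {
            ++i;  // skip the escaped character, whatever it is
        } else if (text[i] == '"') {
            return i;
        }
    }
    return std::string::npos;
}
```

With this rule the closing bracket in [#"^^"] and ["^^"] is seen outside the literal, which is what bracket colorizing needs. A char! literal can reuse the same scan by starting at the quote that follows the # sign.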

Below I'm attaching the REBOL lexer file, which contains fixes for all three bugs mentioned above. Feel free to review it and include it in the next Scintilla release.

1 Attachments

Discussion

  • Neil Hodgson

    Neil Hodgson - 2013-11-22

    This documentation says that "every script must have a header":
    http://www.rebol.com/docs/core23/rebolcore-5.html#section-2
    The point here appears to be that there can be free-form preface documentation before the script starts and this is highlighted in SCE_REBOL_PREFACE. The change appears to remove the whole purpose of the SCE_REBOL_PREFACE style.

     
  • Cyphre

    Cyphre - 2013-11-22

    This documentation says that "every script must have a header"

    Yes, that's true for scripts that are executed directly. But since REBOL code can also behave as data, it is quite common in the real world for REBOL files to be loaded without a header. As an example, see this file: https://github.com/rebolsource/rebol-test/blob/master/core-tests.r
    In this case we had to add the "REBOL []" string as a commented line at the beginning of the file to trigger the Scintilla lexer; otherwise the syntax highlighting is not applied.
    I've also discussed this with other REBOL programmers and they all think making the REBOL header optional is a good choice.
    I also think this change doesn't harm anything. It just guarantees that every *.r file will be colorized properly.

     
  • Cyphre

    Cyphre - 2013-11-22

    The change appears to remove the whole purpose of the SCE_REBOL_PREFACE style.

    I had been thinking the same way, but then I realized there is one case where SCE_REBOL_PREFACE can be useful: when the REBOL script is executed in CGI mode on a server.
    In that case the SCE_REBOL_PREFACE style can be used for colorizing the shebang line, like:

    #!/usr/local/bin/rebol
    REBOL [
        title: "my CGI script"
    ]
    
    print "hello world"
    
     
    Last edit: Cyphre 2013-11-26
  • Cyphre

    Cyphre - 2013-11-22

    One more update regarding the REBOL lexer issues: I just found another problem. This sequence is colorized incorrectly:
    [%filename]

    Since there seems to be no way to edit the bug text or attachments, I'm attaching a new version of LexRebol.cxx to this message. It fixes this problem and includes the fixes above as well.
    Please let me know if you would prefer that I create a separate bug for this one.

     
  • Pascal Hurni

    Pascal Hurni - 2013-11-25

    Hi folks,

    Thanx for enhancing this lexer, I'm okay with all your changes. Could you maybe update the history in the header so that we have a quick way to check the lexer revision?

    @Neil: Thanx for the ping, and happy committing.

    Regards

     
  • Cyphre

    Cyphre - 2013-11-26

    Hi Pascal, thanks for checking here.
    I have reviewed the SCE_REBOL_PREFACE feature, though, and Neil is right here. With the current changes the preface is not properly detected.
    To make it work, we need to add one more pass in the lexer that detects whether the script contains a REBOL header before any colorizing starts.
    My idea was to add one more FOR loop with a separate StyleContext instance that detects the REBOL header. Something like:

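        if (startPos == 0) {
            // Pre-scan the document for a REBOL header: the word "rebol"
            // (any case), optional spaces or tabs, then '['.
            bool hasHeader = false;

            StyleContext scPre(startPos, length, initStyle, styler);
            for (; scPre.More() && !hasHeader; scPre.Forward()) {
                if (scPre.MatchIgnoreCase("rebol")) {
                    int i = 5;    // length of "rebol"
                    // Default char 0 stops the skip at the end of the document.
                    while (IsASpaceOrTab(styler.SafeGetCharAt(scPre.currentPos + i, 0)))
                        i++;
                    hasHeader = (styler.SafeGetCharAt(scPre.currentPos + i, 0) == '[');
                }
            }
            // Set the initial state for the main StyleContext: start in the
            // preface only when a header was found somewhere in the document.
            sc.SetState(hasHeader ? SCE_REBOL_PREFACE : SCE_REBOL_DEFAULT);
        }

        // The main colorizing loop will follow from here.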
    

    I'm not very experienced with the Scintilla lexer API and its features, so I'd like to ask whether you know of a better way to do this.
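
For trying the idea outside Scintilla, the same pre-scan can be written against a plain std::string. This is only a sketch under that assumption: HasRebolHeader is a hypothetical name, and like the loop above it accepts the word "rebol" anywhere in the text, not only at the start of a line.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Scintilla.
// Returns true if text contains the word "rebol" (any case) followed by
// optional spaces or tabs and then '['.
bool HasRebolHeader(const std::string &text) {
    static const std::string word = "rebol";
    // At least one character must follow the word to hold the '['.
    for (std::size_t pos = 0; pos + word.size() < text.size(); ++pos) {
        bool match = true;
        for (std::size_t k = 0; k < word.size(); ++k) {
            if (std::tolower(static_cast<unsigned char>(text[pos + k])) != word[k]) {
                match = false;
                break;
            }
        }
        if (!match)
            continue;
        std::size_t i = pos + word.size();
        while (i < text.size() && (text[i] == ' ' || text[i] == '\t'))
            ++i;
        if (i < text.size() && text[i] == '[')
            return true;
    }
    return false;
}
```

In the CGI example earlier in the thread, the "rebol" at the end of the shebang line is followed by a line break, so it is rejected, and the scan only succeeds at the "REBOL [" on the next line. The scan is linear in the document size and has to read the whole text when no header exists.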

     
    Last edit: Cyphre 2013-11-26
    • Neil Hodgson

      Neil Hodgson - 2013-11-26

      If there are very large REBOL files then the scan could be slow.

      Another technique that could be used is for there to be a user settable property determining if the initial style should be SCE_REBOL_PREFACE or SCE_REBOL_DEFAULT. For an example of this, see the Python lexer and its code for properties like lexer.python.literals.binary.
      https://sourceforge.net/p/scintilla/code/ci/default/tree/lexers/LexPython.cxx
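
      As a rough standalone sketch of the property approach (not the code from LexPython.cxx): the property name lexer.rebol.header.required and the enum below are hypothetical stand-ins, and the properties are modelled as a plain map so the example compiles without Scintilla.

```cpp
#include <map>
#include <string>

// Local stand-ins for SCE_REBOL_DEFAULT and SCE_REBOL_PREFACE so that the
// sketch compiles on its own; a real lexer would use the SciLexer.h values.
enum RebolStyle { kDefault, kPreface };

// Hypothetical property name, assumed only for this illustration.
const char kHeaderProperty[] = "lexer.rebol.header.required";

// Chooses the style the lexer starts in at position 0.
// Unset or any value other than "0": keep the existing behaviour and start
// in the preface. "0": treat the file as header-less and start in default.
RebolStyle InitialStyle(const std::map<std::string, std::string> &props) {
    const auto it = props.find(kHeaderProperty);
    if (it == props.end())
        return kPreface;
    return it->second == "0" ? kDefault : kPreface;
}
```

      Compared with scanning the document, this costs nothing on large files, but the user has to set the property for header-less files.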

       
  • Neil Hodgson

    Neil Hodgson - 2013-12-01

    If it's going to take time to implement all the changes, it may be an idea to bundle the changes that are most certain (the square-bracket fixes) into a patch that can be applied quickly, then follow up with the other changes later.

     
