

#584 D lexer update

Milestone: Completed
Status: closed
Owner: Neil Hodgson
Labels: Scintilla (355)
Priority: 4
Updated: 2009-07-03
Created: 2009-05-19
Creator: maXmo
Private: No

Fixed nasty comment which was highlighted wrong by viewvc.
Added support for unicode chars in identifiers as per D spec.
Added 3 extra keyword groups.
Strings are multiline in D.
Slightly more careful number parsing: don't parse 0..2 as number, parse decimal and hex floats.
Support for two types of wysiwyg strings.
Some support for hex strings (no escape sequences).

Check if it compiles and works.
Example file: http://dsource.org/projects/phobos/browser/trunk/tools/rdmd.d
I'll post a more specialized test case later.
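
The number-parsing item above (do not treat `0..2` as one number, but accept decimal and hex floats) could be sketched roughly as follows. This is not code from the patch: `ScanNumber` is a hypothetical helper, and it assumes a simplified rule that a `.` belongs to the number only when a digit follows it, so the `..` range operator is left alone. Suffixes such as `L`, `u` or `f` are ignored.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the patch): length of the numeric literal
// starting at text[pos], or 0 if no literal starts there.
std::size_t ScanNumber(const std::string &text, std::size_t pos) {
    const std::size_t n = text.size();
    std::size_t i = pos;
    if (i >= n || !std::isdigit(static_cast<unsigned char>(text[i])))
        return 0;

    // A leading 0x / 0X switches to hexadecimal digits.
    bool hex = false;
    if (text[i] == '0' && i + 1 < n && (text[i + 1] == 'x' || text[i + 1] == 'X')) {
        hex = true;
        i += 2;
    }
    auto isBaseDigit = [hex](char c) {
        const unsigned char u = static_cast<unsigned char>(c);
        return hex ? std::isxdigit(u) != 0 : std::isdigit(u) != 0;
    };
    // D allows '_' as a digit separator inside numbers.
    auto skipDigits = [&](std::size_t from) {
        while (from < n && (isBaseDigit(text[from]) || text[from] == '_'))
            ++from;
        return from;
    };

    i = skipDigits(i);

    // Fraction: assumed rule, take '.' only when a digit follows it,
    // so "0..2" stops after the first '0'.
    if (i + 1 < n && text[i] == '.' && isBaseDigit(text[i + 1]))
        i = skipDigits(i + 1);

    // Exponent: e/E for decimal literals, p/P for hex floats.
    if (i < n) {
        const char c = text[i];
        const bool expMark = hex ? (c == 'p' || c == 'P') : (c == 'e' || c == 'E');
        std::size_t j = i + 1;
        if (expMark && j < n && (text[j] == '+' || text[j] == '-'))
            ++j;
        if (expMark && j < n && std::isdigit(static_cast<unsigned char>(text[j]))) {
            while (j < n && (std::isdigit(static_cast<unsigned char>(text[j])) || text[j] == '_'))
                ++j;
            i = j;
        }
    }
    return i - pos;
}
```

With this rule, `0..2` yields a one-character number, then the `..` operator, then another one-character number, while `1.5e-3` and `0x1.8p3` are each consumed as a single literal.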

Discussion

1 2 3 .. 6 > >> (Page 1 of 6)
  • Neil Hodgson
    2009-05-19

    Committed to CVS.

     
  • Neil Hodgson
    2009-05-19

    • milestone: --> 897169
    • priority: 5 --> 4
    • assigned_to: nobody --> nyamatongwe
     
  • Vincent Thorn
    2009-05-20

    Hi, I'm a newbie to the project. Question: what is the reason for adding 3 more groups? Weren't the existing 4 enough? (plus 5 different strings, numbers, etc.)

    Yesterday I rewrote this lexer completely, hope you'll find mine more usable.

     
  • maXmo
    2009-05-20

    Revision 1.3 supported only 3 usable keyword groups.
    I usually use different highlighting for statements, types and attributes (and D has a good set of attributes), plus a separate red style for casts and one for some platform types. Different string styles are needed only so that the parser knows what the current context is and how it can end.

     
  • maXmo
    2009-05-20

    I see no need for a complete rewrite either.

     
  • Vincent Thorn
    2009-05-20

    Well, if you're happy with just "strings" and 6 sorts of keywords, OK (I have the opposite preferences: 6 types of strings, and keywords matter less to me). I was disappointed with the ugly copy of the C++ parser converted to D. See the description I prepared for the new lexer:

    What's done:
    1. All the latest keywords, including special symbols (like __TIMESTAMP__, etc.)
    2. Full support for normal, verbatim (WYSIWYG), hex, delimited and token strings.
    3. Support (highlighting) for one-char escape sequences inside normal strings.
    4. Limited support for numbers, including underscores and all prefixes/special chars. There is no semantic pass, so any mix of valid characters is allowed.
    5. All comments are supported, except nested /++/: they are not highlighted properly if nested.

    What is in plans (in priority order):
    1. Full support for escape sequences, like \&blah;, \x0000 or \000
    2. Custom folding
    3. Nested comments
    4. Operator validation (like ==, !=, >>>=, !<>=, etc.)
    5. Number validation
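
For the nested `/+ +/` comments mentioned in both lists above, one common approach is a depth counter. The sketch below is only an illustration of that idea, not code from either lexer: `SkipNestingComment` is a hypothetical helper that works on a whole string. A real Scintilla lexer works incrementally, so it would presumably need to keep the depth between lines, for example in the per-line state.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given pos at the "/+" that opens a nesting comment,
// returns the index just past the matching "+/". Returns text.size() if the
// comment is unterminated, and pos unchanged if no "/+" starts at pos.
std::size_t SkipNestingComment(const std::string &text, std::size_t pos) {
    const std::size_t n = text.size();
    if (pos + 1 >= n || text[pos] != '/' || text[pos + 1] != '+')
        return pos;

    int depth = 0;
    std::size_t i = pos;
    while (i + 1 < n) {
        if (text[i] == '/' && text[i + 1] == '+') {
            ++depth;            // a nested opener increases the depth
            i += 2;
        } else if (text[i] == '+' && text[i + 1] == '/') {
            --depth;            // a closer decreases it
            i += 2;
            if (depth == 0)
                return i;       // outermost comment closed here
        } else {
            ++i;
        }
    }
    return n;                   // ran out of text: unterminated comment
}
```

For the input `/+ a /+ b +/ c +/ x`, the inner `+/` only brings the depth back to 1, so the comment ends after the second `+/`.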

     
  • Neil Hodgson
    2009-05-20

    • milestone: 897169 -->
     
  • Neil Hodgson
    2009-05-20

    It's really up to the people who use D to work out what to do. For now, I have reverted to the previous version while this is being discussed.

     
  • Vincent Thorn
    2009-05-20

    OK, I downloaded maXmo's latest improvements and will try to merge them with my code. I'm sure you'll like the new lexer! See example: http://i44.tinypic.com/29z4cok.png
    (sorry for the colors, I didn't adjust anything)

     
  • 1. Ridiculous. It's not the lexer's job to know the latest keywords, especially "spec symbols".
    2. The q{} string was a misdesign and will probably be dropped, as it clashes with delegate literals.
    4. I would like to see exponent parsing no worse than in my code (though it still has two minor bugs for this matter).
    5. It's a regression.

     