#1049 Add new lexer for DMIS

Group: Committed
Status: closed
Priority: 5
Updated: 2014-08-21
Created: 2014-04-18
Creator: StarFire
Private: No

DMIS is a language for coordinate measuring machines (CMMs); see http://www.dmis.org

The attached patch implements lexing and a few folding points for DMIS. It also contains two (currently empty) wordlists for unsupported commands. Manufacturers may or may not use these to indicate which commands they support (the DMIS standard is huge, about 600 pages, and a manufacturer usually does not support all commands).

Also attached are all changed files. The patch is a diff against 5626e70cd337.

1 Attachments

Discussion

  • StarFire

    StarFire - 2014-04-18

    Changed file 1: .hgignore

     
  • StarFire

    StarFire - 2014-04-18

    Changed files 2 & 3: SciLexer.iface, SciLexer.h

     
  • StarFire

    StarFire - 2014-04-18

    New file: LexDMIS.cxx

     
  • StarFire

    StarFire - 2014-04-18

    Changed file 4: Catalogue.cxx

     
  • StarFire

    StarFire - 2014-04-18

    Changed file 5: win32\scintilla.mak

     
  • Neil Hodgson

    Neil Hodgson - 2014-04-18

    There is no need to attach the files that are automatically generated as I'll run LexGen.py so just include the lexer source and the Scintilla.iface.

    Keyword lists should normally be externally supplied to the lexer instead of being hard-coded. This allows users to update for new versions of the standard without waiting for the lexer to be updated and to progress downstream to their application. They may also have local style rules to avoid certain keywords or be using an archaic version of the standard.

    There is a memory leak in DescribeWordListSets(), which can be called by the application multiple times. Build the string as a member on the lexer object and delete it in the destructor.
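
A minimal sketch of that idea, under stated assumptions: the class name `WordListDescriber` and the three set names are hypothetical and not taken from the patch. It also swaps the manual delete for a `std::string` member, which the destructor releases automatically. The description is built once in the constructor, so repeated calls return the same buffer and allocate nothing.

```cpp
#include <string>

// Hypothetical stand-in for the lexer class; only the part relevant to
// DescribeWordListSets() is shown.
class WordListDescriber {
public:
    WordListDescriber() {
        // Illustrative set names (assumptions), separated by newlines.
        static const char *const names[] = {
            "Major keywords", "Minor keywords", "Unsupported major keywords"
        };
        for (const char *name : names) {
            if (!description.empty())
                description += '\n';
            description += name;
        }
    }

    // Safe to call any number of times: no allocation happens here.
    const char *DescribeWordListSets() const {
        return description.c_str();
    }

private:
    std::string description;  // released automatically by the destructor
};
```

Because the pointer comes from a member, the application must not free it, and it stays valid for the lifetime of the object.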

    There are a number of issues found by static checkers (try using cppcheck):
    scintilla\lexers\LexDMIS.cxx:206: style: The scope of the variable 'tmpStr' can be reduced.
    It is only ever used in SCE_DMIS_KEYWORD when a non-word character is seen, so it's wasteful to set tmpStr for every character.

    scintilla\lexers\LexDMIS.cxx:309: style: The scope of the variable 'style' can be reduced.
    scintilla\lexers\LexDMIS.cxx:310: style: The scope of the variable 'ch' can be reduced.
    scintilla\lexers\LexDMIS.cxx:311: style: The scope of the variable 'atEOL' can be reduced.
    scintilla\lexers\LexDMIS.cxx:312: style: The scope of the variable 'noFoldPos' can be reduced.

    In general, declare and initialize variables when needed instead of at the top of the function as this minimizes the chance they will be misused or stale.
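
To illustrate the scoping advice, here is a small sketch that is not code from the patch. The function `CountContinuations` is hypothetical, and treating `$` at the end of a line as a continuation marker is an assumption made only for this example.

```cpp
#include <cstddef>
#include <string>

// Count lines that end in a '$' continuation marker (illustrative only).
std::size_t CountContinuations(const std::string &text) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] != '\n')
            continue;
        // Declared here, where it is first needed, rather than at the top
        // of the function, so it can never carry a value from an earlier line.
        const bool continued = i > 0 && text[i - 1] == '$';
        if (continued)
            ++count;
    }
    return count;
}
```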

    scintilla\lexers\LexDMIS.cxx:112: warning: Member variable 'LexerDMIS::m_isIFLine' is not initialized in the constructor.

    Actually, m_isIFLine does not appear to be something that should be remembered between calls to the lexer, so it should not be a member of the lexer object. Do not assume that the lexer always moves forward, as it will be called to relex lines that have been modified.
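
One way to keep such a flag out of the lexer object is to recompute it from the text whenever it is needed. The sketch below is an assumption about how that could look, not the approach used in the patch: `LineStartsWithIf` is a hypothetical helper, and it works on a plain `std::string` rather than on Scintilla's document accessor.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: decide whether the line containing `pos` begins with
// the keyword IF, using only the document text and no remembered state.
bool LineStartsWithIf(const std::string &text, std::size_t pos) {
    // Walk back to the start of the line that contains pos.
    std::size_t lineStart = std::min(pos, text.size());
    while (lineStart > 0 && text[lineStart - 1] != '\n')
        --lineStart;
    // Skip leading blanks.
    std::size_t i = lineStart;
    while (i < text.size() && (text[i] == ' ' || text[i] == '\t'))
        ++i;
    // Match IF followed by a non-alphanumeric character or end of text.
    if (text.compare(i, 2, "IF") != 0)
        return false;
    const std::size_t after = i + 2;
    return after >= text.size() ||
           !std::isalnum(static_cast<unsigned char>(text[after]));
}
```

Since the answer depends only on the text, it is the same whether lexing starts at the top of the document or in the middle after an edit.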

     
  • Neil Hodgson

    Neil Hodgson - 2014-04-18

    Also:
    ..\lexers\LexDMIS.cxx(178) : warning C4267: '+=' : conversion from 'size_t' to 'int', possible loss of data

    Just use size_t for totalLen or cast the call to strlen() to int.
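
Both options can be sketched as follows. The function names and the array-of-strings signature are assumptions made for illustration; the real code around line 178 may be shaped differently.

```cpp
#include <cstddef>
#include <cstring>

// Option 1: accumulate in size_t, the type that strlen() returns.
std::size_t TotalLength(const char *const words[], std::size_t count) {
    std::size_t totalLen = 0;
    for (std::size_t i = 0; i < count; ++i)
        totalLen += std::strlen(words[i]);
    return totalLen;
}

// Option 2: keep an int total and make the narrowing explicit with a cast.
int TotalLengthAsInt(const char *const words[], std::size_t count) {
    int totalLen = 0;
    for (std::size_t i = 0; i < count; ++i)
        totalLen += static_cast<int>(std::strlen(words[i]));
    return totalLen;
}
```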

     
  • StarFire

    StarFire - 2014-04-19

    Except for the hardcoded wordlists, I have fixed all other issues (see attached files).

    Questions:
    - How do I supply the wordlists from the application (externally)? Do I use properties? Or something else?
    - What parameters do you use for cppcheck?

    The attached patch is diffed against a18b559a726d

     
    Last edit: StarFire 2014-04-19
  • Neil Hodgson

    Neil Hodgson - 2014-04-19

    Keywords can be set with http://www.scintilla.org/ScintillaDoc.html#SCI_SETKEYWORDS

    For cppcheck, the command line most used is
    cppcheck -j 8 --enable=all --suppressions scintilla/cppcheck.suppress --max-configs=100 -I scintilla/src -I scintilla/include -I scintilla/lexlib -I scintilla/qt/ScintillaEditBase --template=gcc --quiet scintilla
    There is a suppression file cppcheck.suppress in the scintilla root directory although that's for running against all of Scintilla which takes some time. I turn on all types of warning in cppcheck but only change the code in response if the code is better after the change. For example, cppcheck prefers initialization lists instead of initializing inside the constructor but I find long initializer lists ugly so only use initializer lists when there are few fields. I'll also run the cppcheck GUI with "Advanced | Show inconclusive errors" turned on but these are less likely to be useful.

    C strings need a 0 terminator and the length calculated in InitWordListSets doesn't take this into account so is 1 character too short.

    Dynamic memory allocation (new[]) is useful when the size may adapt to circumstances. The allocation of tmpStr is always for MAX_STR_LEN characters so, unless the plan is to change this to match the actual line length, it should just be a simple char array.
    char tmpStr[MAX_STR_LEN];
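
The terminator point can be illustrated with a short sketch. `JoinLines` is a hypothetical function, not InitWordListSets itself, and the newline separator is an assumption. Dynamic allocation is used here because the size really does depend on the input.

```cpp
#include <cstddef>
#include <cstring>

// Join `count` C strings with '\n' separators into a newly allocated buffer.
// The caller owns the result and must release it with delete[].
char *JoinLines(const char *const parts[], std::size_t count) {
    std::size_t totalLen = 0;
    for (std::size_t i = 0; i < count; ++i)
        totalLen += std::strlen(parts[i]);
    if (count > 1)
        totalLen += count - 1;   // one '\n' between neighbours
    totalLen += 1;               // room for the terminating '\0'

    char *joined = new char[totalLen];
    char *out = joined;
    for (std::size_t i = 0; i < count; ++i) {
        if (i > 0)
            *out++ = '\n';
        const std::size_t len = std::strlen(parts[i]);
        std::memcpy(out, parts[i], len);
        out += len;
    }
    *out = '\0';
    return joined;
}
```

Leaving out the `totalLen += 1` line would make the final `'\0'` write land one byte past the end of the buffer.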

     
  • StarFire

    StarFire - 2014-04-22

    I have fixed the array that was too short and made a simple char array instead of dynamically allocating the memory.

    I have also implemented support for SCI_SETKEYWORDS through LexerDMIS::WordListSet. I kept the hardcoded keyword lists and use them as defaults; I hope this is acceptable.
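
For readers unfamiliar with the lexer side of SCI_SETKEYWORDS, here is a rough, self-contained sketch of what a WordListSet handler does. It is not the code from the patch: `KeywordSets`, `kSetCount`, and `InList` are hypothetical, and a `std::set` stands in for Scintilla's own WordList class. The return convention (-1 for no change, otherwise a position to relex from) follows my understanding of the lexer interface and should be checked against the Scintilla headers.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical keyword holder showing the shape of a WordListSet handler.
class KeywordSets {
public:
    static const int kSetCount = 2;   // assumed number of keyword lists

    // Returns 0 when list n changed (relex from the start), -1 otherwise.
    int WordListSet(int n, const char *wl) {
        if (n < 0 || n >= kSetCount)
            return -1;
        // Split the space-separated list supplied by the application.
        std::set<std::string> parsed;
        std::istringstream in(wl ? wl : "");
        std::string word;
        while (in >> word)
            parsed.insert(word);
        if (parsed == sets[n])
            return -1;
        sets[n].swap(parsed);
        return 0;
    }

    bool InList(int n, const std::string &word) const {
        return n >= 0 && n < kSetCount && sets[n].count(word) != 0;
    }

private:
    std::set<std::string> sets[kSetCount];
};
```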

    The patch is diffed against changeset dc4e1d1fff60

     
    • Neil Hodgson

      Neil Hodgson - 2014-04-24

      Visual Studio's Code Analysis doesn't believe the expression new char[++totalLen] allocates enough memory for the memset. It's simpler to just move the increment out of the expression to one line earlier.

      I'd prefer not to include keyword lists inside lexers. This hasn't been done for any other lexer, it adds 4K to library size, having one mechanism means it is more likely to work, and including keyword lists may cause more code churn when the language gains more keywords.

       
  • StarFire

    StarFire - 2014-04-27

    I have removed the keyword lists from the lexer and also moved the increment onto a line of its own.

    The patch is diffed against 7484826e3e0a

     
  • Neil Hodgson

    Neil Hodgson - 2014-04-28

    Including your email address in the source code will publish it widely. It will be harvested leading to receipt of spam. So make sure you are OK with that address receiving more spam.

    The warning suppression pragmas should not be needed. 4706 doesn't occur and 4100 is easily avoided by removing parameter names when parameters are not used, as in LexCPP: PrivateCall(int, void *) instead of PrivateCall(int operation, void *pointer).
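
A small sketch of the unnamed-parameter technique follows. `LexerBase` and `SketchLexer` are hypothetical classes written for this example; only the PrivateCall signature comes from the comment above.

```cpp
// Hypothetical interface with a hook that most lexers do not need.
class LexerBase {
public:
    virtual ~LexerBase() {}
    virtual void *PrivateCall(int operation, void *pointer) = 0;
};

class SketchLexer : public LexerBase {
public:
    // Parameter names are omitted because the body ignores both arguments,
    // so the compiler has no unused-parameter (C4100) warning to report.
    void *PrivateCall(int, void *) override {
        return nullptr;
    }
};
```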

     
  • StarFire

    StarFire - 2014-04-28

    I have removed the pragmas that disabled the warnings and adapted the code instead. Concerning the e-mail address: it has been in use since 1995 and I usually get about 1000 spam messages per day...

    The patch is diffed against 7484826e3e0a

     
  • Neil Hodgson

    Neil Hodgson - 2014-04-28
    • labels: --> scintilla, lexer
    • assigned_to: Neil Hodgson
    • Group: Completed --> Committed
     
  • Neil Hodgson

    Neil Hodgson - 2014-05-22
    • status: open --> closed
     
