[Pyparsing] Get Better Error Messages - Prevent Backtracking
From: Eike W. <eik...@gm...> - 2008-04-13 23:16:40
I want to propose another way to get better error messages: prevent backtracking at certain points in the grammar. I have attached a simple example implementation that works with Pyparsing 1.4.5.

Often it is clear that, when a certain parser fails, there must be an error in the input. Backtracking in this situation is bad, because information about the location and the cause of the error might be lost.

A common situation is after a keyword. Imagine a programming language where you define variables like this:

    data a, b, c: Real;

After the 'data' keyword there must be a list of variable names, a ':' character, the name of a type and a ';' character. If this pattern does not appear after the 'data' keyword, there is an error in the program. (See lines 46, 47 in the example program.)

A usual parser for this statement would look like this:

    dataDef1 = Group(Keyword('data') + attrNameList + ':' + identifier + ';')

I propose the 'ErrStop' class, an additional 'ParserElement' that stops parsing when a parser given to it fails. It is used like this:

    dataDef2 = Group(Keyword('data') + ErrStop(attrNameList + ':' + identifier + ';'))

In the example program the 'data' statement is combined with two additional statements, 'foo1;' and 'foo2;', to form a programming language. The program output shows that using the 'ErrStop' class really preserves more information about the error (missing comma after position 15). Additionally, the parse result of a successful run is not altered.

The program output:

    Test regular parser:
    [['foo1', ';'], ['data', ['a', ',', 'a1', ',', 'b'], ':', 'Real', ';'], ['foo1', ';']]
    Expected end of text (at char 6), (line:1, col:7)

    Test parser with backtracking stop:
    [['foo1', ';'], ['data', ['a', ',', 'a1', ',', 'b'], ':', 'Real', ';'], ['foo1', ';']]
    Expected ":" (at char 17), (line:1, col:18)
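
For readers without the attachment, the idea can be sketched independently of Pyparsing. The following is not the attached implementation and does not use Pyparsing's API; it is a minimal stand-alone combinator sketch in which every name (token, seq, alt, many, err_stop, FatalParseError) is hypothetical. It assumes that an ordinary failure is an exception that alternatives and repetitions may swallow in order to backtrack, and that err_stop converts such a failure into a fatal one that they must pass through. The test input is also an assumption, since the original example program's input is not shown.

```python
import re

class ParseError(Exception):
    """Recoverable failure: callers may backtrack and try something else."""
    def __init__(self, msg, pos):
        super().__init__(f"{msg} (at char {pos})")
        self.msg, self.pos = msg, pos

class FatalParseError(ParseError):
    """Non-recoverable failure: alt() and many() must not swallow it."""

WS = re.compile(r"\s*")

def skip_ws(text, pos):
    return WS.match(text, pos).end()

def token(pattern, name):
    rx = re.compile(pattern)
    def parse(text, pos):
        pos = skip_ws(text, pos)
        m = rx.match(text, pos)
        if m is None:
            raise ParseError(f"Expected {name}", pos)
        return [m.group()], m.end()
    return parse

def seq(*parsers):
    def parse(text, pos):
        out = []
        for p in parsers:
            res, pos = p(text, pos)
            out += res
        return out, pos
    return parse

def alt(*parsers):
    def parse(text, pos):
        last = None
        for p in parsers:
            try:
                return p(text, pos)
            except FatalParseError:
                raise                  # no backtracking past an error stop
            except ParseError as e:
                last = e               # backtrack: try the next alternative
        raise last
    return parse

def many(parser):
    def parse(text, pos):
        out = []
        while True:
            try:
                res, pos = parser(text, pos)
            except FatalParseError:
                raise
            except ParseError:
                return out, pos        # backtrack: stop repeating here
            out += res
    return parse

def err_stop(parser):
    """Turn a recoverable failure of `parser` into a fatal one."""
    def parse(text, pos):
        try:
            return parser(text, pos)
        except FatalParseError:
            raise
        except ParseError as e:
            raise FatalParseError(e.msg, e.pos) from None
    return parse

def end_of_text(text, pos):
    pos = skip_ws(text, pos)
    if pos != len(text):
        raise ParseError("Expected end of text", pos)
    return [], pos

def first_error(parser, text):
    """Return the error message for `text`, or None if it parses."""
    try:
        parser(text, 0)
    except ParseError as e:
        return str(e)
    return None

ident = token(r"[A-Za-z_]\w*", "identifier")
semi = token(r";", '";"')
name_list = seq(ident, many(seq(token(r",", '","'), ident)))
body = seq(name_list, token(r":", '":"'), ident, semi)
data_kw = token(r"data\b", '"data"')
foo1 = seq(token(r"foo1\b", '"foo1"'), semi)
foo2 = seq(token(r"foo2\b", '"foo2"'), semi)

def program(data_def):
    return seq(many(alt(data_def, foo1, foo2)), end_of_text)

plain = program(seq(data_kw, body))             # like dataDef1
strict = program(seq(data_kw, err_stop(body)))  # like dataDef2

good = "foo1; data a, a1, b: Real; foo1;"
bad = "foo1; data a, a1 b: Real; foo1;"         # comma missing before 'b'
```

With this sketch, first_error(plain, bad) gives 'Expected end of text (at char 6)', because the failure inside the 'data' statement is discarded when the repetition backtracks, while first_error(strict, bad) gives 'Expected ":" (at char 17)'. Both parsers return the same token list for the valid input.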