Re: [Pyparsing] better parse error reporting
From: Paul M. <pt...@au...> - 2008-04-09 18:55:08
Gre7g and Martin - Thanks for the running thread on this "parse error capturing" patch. I confess I've not yet had time this week to read it or fully assess it.

I've tried for a long time to get more accurate error locations in pyparsing. The biggest problem I've had is in constructions like:

    expr1 = OneOrMore( expr2 ) + expr3

and the input text contains:

    valid_expr2 invalid_expr2 valid_expr2 valid_expr3

OneOrMore( expr2 ) succeeds after parsing the first valid_expr2, and moves forward to try to parse an expr3. Positioned at invalid_expr2, the parse of expr3 fails, and you get an exception like "expected expr3 at (locn of invalid_expr2)".

Or what is more likely occurring in many of these grammars is something like this:

    expr1 = OneOrMore( expr2 + expr3 + expr4 + expr5 + expr6 + expr7 + expr8 + expr9 )

and the input text contains:

    valid_expr2 ... valid_expr8 invalid_expr9

and the error says "expected expr1 at (locn of valid_expr2)" and you get this puzzled "huh?" look on your face.

There is some code in pyparsing that tries to find the furthest successful match, but the piecework nature of the grammar (each object does its own parsing in more or less isolation) means that I'm limited in how much state I can pass up the chain, and in the case of a partially successful OneOrMore(expr1) parse, I can't pass *anything* up. The only alternative at the moment is a global "here is the furthest I've gotten so far" variable, which I suspect is the mechanism behind Gre7g's patch.

When I get some time, it might be worth revisiting pyparsing's design to see if I can pass some form of state object from element to element, so that a more suitable parse error location could be captured in it.

-- Paul
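
To make the "state object passed from element to element" idea concrete, here is a minimal sketch in plain Python. It is not pyparsing code and does not reflect how pyparsing or Gre7g's patch actually works; ParseState, word, seq and one_or_more are hypothetical names invented for this illustration, and the input is assumed to be already split into tokens. Each element records its failures in the shared state, so the furthest failure survives even when a repetition backs off and returns a partial success.

```python
class ParseState:
    """Hypothetical shared state: remembers the furthest failure seen so far."""

    def __init__(self):
        self.furthest_loc = -1
        self.expected = None

    def record_failure(self, loc, expected):
        # Keep only the failure that got furthest into the input.
        if loc > self.furthest_loc:
            self.furthest_loc = loc
            self.expected = expected


def word(name):
    """Parser matching the single token `name`; returns the new location or None."""
    def parse(tokens, loc, state):
        if loc < len(tokens) and tokens[loc] == name:
            return loc + 1
        state.record_failure(loc, name)
        return None
    return parse


def seq(*parsers):
    """Parser matching each sub-parser in order."""
    def parse(tokens, loc, state):
        for p in parsers:
            loc = p(tokens, loc, state)
            if loc is None:
                return None
        return loc
    return parse


def one_or_more(parser):
    """Parser matching `parser` one or more times, stopping at the first failure."""
    def parse(tokens, loc, state):
        loc = parser(tokens, loc, state)
        if loc is None:
            return None
        while True:
            nxt = parser(tokens, loc, state)
            if nxt is None:
                return loc  # partial success; the failure is still in `state`
            loc = nxt
    return parse


# Analogous to: expr1 = OneOrMore( a + b + c ) + end
grammar = seq(one_or_more(seq(word("a"), word("b"), word("c"))), word("end"))

tokens = ["a", "b", "c", "a", "b", "x", "end"]
state = ParseState()
result = grammar(tokens, 0, state)

if result is None:
    print("expected %r at token %d" % (state.expected, state.furthest_loc))
```

Without the shared state, the last failure seen is "end" at token 3, which is the misleading report described above. With it, the error points at token 5, where "c" was expected inside the second repetition.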