[Pyparsing] More useful parse errors


Hi,

I am using pyparsing to parse CIM MOF files.  It is a rather lengthy grammar
with lots of instances of OneOrMore.  The top-level structure looks like this:

bnf = OneOrMore(rest_of_grammar) + StringEnd()

This raises an exception when there is a parse error, but the error is not
flagged at the point where the error occurs.  Since everyone here should be
familiar with Python grammar, I will provide an example of what would happen
if we had a pyparsing grammar for Python and attempted to parse something with
a syntax error:

class blah(object):
    pass

class error_says_here(object):
    def __init__(self):
        self.a = 1
        self.b = 2

    def error_really_here(self:
        pass

In this case the error would be flagged at error_says_here rather than at
error_really_here.
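
To make this concrete without the full MOF grammar, here is a minimal sketch
using a made-up toy grammar (named blocks containing "name = number;" lines).
The grammar and the input are invented for illustration only; the structure
mirrors the OneOrMore(...) + StringEnd() idiom above.  I would expect the
exception to point at the start of the second block, not at the line with the
missing value.

```python
from pyparsing import (OneOrMore, ParseException, StringEnd, Suppress,
                       Word, alphas, nums)

# Hypothetical toy grammar:  NAME { NAME = NUMBER ; ... }
name = Word(alphas)
assignment = name + Suppress("=") + Word(nums) + Suppress(";")
block = name + Suppress("{") + OneOrMore(assignment) + Suppress("}")
bnf = OneOrMore(block) + StringEnd()

# The real mistake is on line 6: "c = ;" has no value.
text = """\
first {
  a = 1;
}
second {
  b = 2;
  c = ;
}
"""

reported_line = None
try:
    bnf.parseString(text)
except ParseException as err:
    # lineno is computed from the location stored in the exception
    reported_line = err.lineno
    print("error reported on line", reported_line)
    print(err)
```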

Is there a way to work around this behavior?  I have not yet come across an
example of a grammar that looks like it does a really good job of error
handling.  Does the architecture of pyparsing make it easy to write a parser
for data that you know is syntactically correct, but difficult or impossible
to write a parser that gives useful error reports to the user in the face of
syntax errors?

A traditional parser knows every token that could legally come next, and if
it encounters anything else it flags an error at that location.  pyparsing,
in contrast, asks: will this parse?  No?  OK, will this parse?  No?  OK, will
this parse?  The result is that it can throw away large portions of valid
syntax and end up immediately after the last complete rest_of_grammar matched
by our OneOrMore(rest_of_grammar) + StringEnd() idiom, which is where the
error then gets reported.
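
The try-and-back-off behavior described here can be sketched with the
standard library alone.  To be clear, this is not pyparsing's code: every
name below (token, seq, one_or_more, Tracker, ParseFail) is invented for the
illustration, and the toy grammar is made up.  The sketch shows how the
repetition swallows the deep failure and rewinds, so the exception that
finally escapes comes from the end-of-input check.  It also records the
deepest failure seen along the way, which is one possible way to recover a
more useful location.

```python
import re

class ParseFail(Exception):
    def __init__(self, pos, expected):
        super().__init__("expected %s at offset %d" % (expected, pos))
        self.pos = pos
        self.expected = expected

class Tracker:
    """Remembers the deepest failure seen, even if it was caught later."""
    def __init__(self):
        self.deepest = None

    def note(self, fail):
        if self.deepest is None or fail.pos > self.deepest.pos:
            self.deepest = fail

tracker = Tracker()
_WS = re.compile(r"\s*")

def token(pattern, label):
    regex = re.compile(pattern)
    def parse(text, pos):
        pos = _WS.match(text, pos).end()   # skip leading whitespace
        m = regex.match(text, pos)
        if m is None:
            fail = ParseFail(pos, label)
            tracker.note(fail)             # record before anyone catches it
            raise fail
        return m.end()
    return parse

def seq(*parsers):
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
        return pos
    return parse

def one_or_more(parser):
    def parse(text, pos):
        pos = parser(text, pos)            # the first match is mandatory
        while True:
            try:
                pos = parser(text, pos)
            except ParseFail:
                return pos                 # swallow failure, keep last good item
    return parse

# Toy grammar:  NAME { NAME = NUMBER ; ... }
name = token(r"[A-Za-z]+", "a name")
assignment = seq(name, token(r"=", "'='"),
                 token(r"[0-9]+", "a number"), token(r";", "';'"))
block = seq(name, token(r"\{", "'{'"), one_or_more(assignment),
            token(r"\}", "'}'"))
bnf = seq(one_or_more(block), token(r"\Z", "end of input"))

text = "first {\n  a = 1;\n}\nsecond {\n  b = 2;\n  c = ;\n}\n"

def line_of(pos):
    return text.count("\n", 0, pos) + 1

reported = None
try:
    bnf(text, 0)
except ParseFail as fail:
    reported = fail

reported_line = line_of(reported.pos)        # where the raised error points
deepest_line = line_of(tracker.deepest.pos)  # where parsing really got stuck
print("reported on line", reported_line, "-", reported)
print("deepest on line", deepest_line, "-", tracker.deepest)
```

The raised exception points at "second", the start of the block that could
not be completed, while the tracker still holds the failure on the line that
is missing its value.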

It may be possible for pyparsing to precompute what the possible next tokens
are, which may have the side effect of making pyparsing more efficient since
it won't have so much trial and error.

-Chris