Please add better error reporting when raising exceptions due to parse failures on ZeroOrMore, OneOrMore,
|, and ^.
The reason is that these all take alternatives and will only report the first production in their underlying expression
as the source of the error.
Example:
import pyparsing as pp
rule1 = pp.Literal('x').setName('x') | pp.Literal('y').setName('y')
rule2 = pp.Literal('x').setName('x') ^ pp.Literal('y').setName('y')
rule1.parseString('a')
rule2.parseString('a')
Running the rules above yields
...expected x
whereas I would expect pyparsing to state
...expected one of x, y
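To make the behaviour I am asking for concrete, here is a minimal sketch that does not use pyparsing at all. The names Lit, FirstOf, ParseError and describe_failure are toy names invented for this illustration, not pyparsing APIs. The assumption is that an alternation should merge the expected names of every alternative that failed at the furthest position, rather than keep only the first one.

```python
class ParseError(Exception):
    """Toy parse failure: where it happened and which names were expected."""

    def __init__(self, loc, expected):
        self.loc = loc
        self.expected = list(expected)
        if len(self.expected) == 1:
            msg = "expected %s" % self.expected[0]
        else:
            msg = "expected one of %s" % ", ".join(self.expected)
        super().__init__("%s (at char %d)" % (msg, loc))


class Lit:
    """Toy literal matcher (hypothetical stand-in for a named literal)."""

    def __init__(self, text):
        self.text = text

    def parse(self, s, loc):
        if s.startswith(self.text, loc):
            return loc + len(self.text), [self.text]
        raise ParseError(loc, [self.text])


class FirstOf:
    """Toy alternation: the first alternative that matches wins."""

    def __init__(self, *alternatives):
        self.alternatives = alternatives

    def parse(self, s, loc):
        furthest, expected = loc, []
        for alt in self.alternatives:
            try:
                return alt.parse(s, loc)
            except ParseError as err:
                if err.loc > furthest:
                    # This alternative got further: its expectations replace the rest.
                    furthest, expected = err.loc, list(err.expected)
                elif err.loc == furthest:
                    # Same failure position: merge the expected names.
                    expected.extend(err.expected)
        raise ParseError(furthest, expected)


def describe_failure(rule, text):
    """Return the failure message for text, or None if the rule matches."""
    try:
        rule.parse(text, 0)
    except ParseError as err:
        return str(err)
    return None


rule = FirstOf(Lit("x"), Lit("y"))
print(describe_failure(rule, "a"))
```

Keeping only the failures at the furthest position is one possible design choice; it keeps the message short when one alternative clearly got further than the others.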
Here is a sample extension to OneOrMore to show what I mean; the same idea would apply to ^ and |.
class OneOrMore(pp.OneOrMore):
    """Repetition of one or more of the given expression.

    This variant of OneOrMore will report all of the parse elements instead of
    just the first on failure to parse the underlying expression.
    Also raises ParseException if the underlying expression fails."""

    def parseImpl(self, instring, loc, doActions=True):
        # must be at least one
        try:
            loc, tokens = self.expr._parse(instring, loc, doActions, callPreParse=False)
        except (pp.ParseException, IndexError) as e:
            if isinstance(e, pp.ParseException):
                if e.loc != 1:
                    raise e
                # collect the names of all alternatives of the wrapped expression
                expected = []
                for expr in self.expr.exprs:
                    expected.append(str(expr))
                # ParseException takes (pstr, loc, msg, elem), not just a message
                msg = 'expected one or more of %s.' % ', '.join(expected)
                raise pp.ParseException(instring, loc, msg, self)
            raise e
        try:
            hasIgnoreExprs = (len(self.ignoreExprs) > 0)
            while True:
                if hasIgnoreExprs:
                    preloc = self._skipIgnorables(instring, loc)
                else:
                    preloc = loc
                loc, tmptokens = self.expr._parse(instring, preloc, doActions)
                if tmptokens or tmptokens.keys():
                    tokens += tmptokens
        except (pp.ParseException, IndexError):
            # re-raise the failure instead of swallowing it
            raise
        return loc, tokens
In addition, ZeroOrMore swallows any parse errors, so the user never sees
what actually went wrong, which makes debugging very hard. Here is a proposed
extension to ZeroOrMore that reports the parse errors it encounters.
class ZeroOrMore(pp.ZeroOrMore):
    """Optional repetition of zero or more of the given expression.

    This variant of ZeroOrMore will report parse exceptions thrown by
    the underlying expression."""

    def parseImpl(self, instring, loc, doActions=True):
        tokens = []
        try:
            loc, tokens = self.expr._parse(instring, loc, doActions, callPreParse=False)
            hasIgnoreExprs = (len(self.ignoreExprs) > 0)
            while True:
                if hasIgnoreExprs:
                    preloc = self._skipIgnorables(instring, loc)
                else:
                    preloc = loc
                loc, tmptokens = self.expr._parse(instring, preloc, doActions)
                if tmptokens or tmptokens.keys():
                    tokens += tmptokens
        except (pp.ParseException, IndexError):
            raise
        return loc, tokens
Perhaps this exception handling could be made optional, for example by calling something like setReportParseErrors(True)?
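As a rough sketch of what such an opt-in switch could look like, here is a toy repetition helper that does not use pyparsing. ParseError, parse_pair, zero_or_more and the report_parse_errors flag are all hypothetical names for this illustration. It also assumes a slightly narrower rule than the class above: a failure is re-raised only when the element had already started to match, so a plain mismatch still ends the repetition quietly.

```python
class ParseError(Exception):
    """Toy parse failure carrying the position where matching stopped."""

    def __init__(self, loc, msg):
        super().__init__("%s (at char %d)" % (msg, loc))
        self.loc = loc


def parse_pair(text, loc):
    """Hypothetical element: the two-character sequence 'ab'."""
    if text.startswith("ab", loc):
        return loc + 2, "ab"
    if text.startswith("a", loc):
        # Partial match: the failure is one character into the element.
        raise ParseError(loc + 1, "expected b")
    raise ParseError(loc, "expected a")


def zero_or_more(parse_item, text, loc, report_parse_errors=False):
    """Repeat parse_item until it fails; optionally surface partial-match failures."""
    tokens = []
    while True:
        try:
            loc, tok = parse_item(text, loc)
        except ParseError as err:
            # Opt-in: re-raise only if the element consumed input before failing.
            if report_parse_errors and err.loc > loc:
                raise
            break
        tokens.append(tok)
    return loc, tokens


# Default behaviour: the trailing 'ax' is silently left unparsed.
print(zero_or_more(parse_pair, "ababax", 0))

# With the flag set, the same input reports where the last element broke off.
try:
    zero_or_more(parse_pair, "ababax", 0, report_parse_errors=True)
except ParseError as err:
    print(err)
```

With the flag off, existing grammars would behave as before, which is why I think an opt-in switch is the safer route.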
TIA
Late last year I enhanced these error messages. Are they closer to what you were looking for?