Re: [Pyparsing] Better way than operatorPrecedence to parse a regexp-like grammar?

You should start by trying to tune up your definition of term, since this
expression gets used a *lot* internally to the operatorPrecedence code.
Here are some comments/questions/suggestions on cleaning up term:
1. term includes 2 references to numericRange, why?
2. variable() returns an expression that is a MatchFirst of 10 different
characters.  Just have this method return Regex("[0-9#]"), it will evaluate
much faster.
3. numericRange tests for signedNumbers, and then for unsigned numbers.  But
your unsigned numbers would match the signedNumber expression, so the second
alternative test will never match.  Also, signedNumber the way you have
defined it would match a single "-" character, probably not desired.  Try
this:
signedNumber = Optional('-') + Word(nums)  # or even just Regex(r"-?\d+")
numericRange = (
    lbrack + Literal('#') +
    (signedNumber | '*').setResultsName('min') +
    Suppress(':') +
    (signedNumber | '*').setResultsName('max') +
    rbrack
    )
(I also removed the Combine - you might want Group instead.)
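
To make point 3 concrete, here is a minimal sketch of the suggested numericRange. It uses the Regex form of signedNumber so that each bound comes back as a single token. The lbrack/rbrack definitions, the matches() helper, and the sample input are assumptions for illustration, since the original definitions are not shown in this thread.

```python
from pyparsing import Literal, ParseException, Regex, Suppress

# Assumed definitions; the original lbrack/rbrack are not shown in the thread.
lbrack, rbrack = Suppress("["), Suppress("]")

# Requires at least one digit, so a lone "-" cannot match.
signedNumber = Regex(r"-?\d+")

numericRange = (
    lbrack + Literal("#")
    + (signedNumber | "*").setResultsName("min")
    + Suppress(":")
    + (signedNumber | "*").setResultsName("max")
    + rbrack
)


def matches(expr, text):
    """Hypothetical helper: True if expr parses the whole of text."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


result = numericRange.parseString("[#-3:*]")  # hypothetical sample input
print(result["min"], result["max"])
print(matches(signedNumber, "42"), matches(signedNumber, "-"))
```

Because the single pattern covers both signed and unsigned numbers, no separate unsigned alternative is needed.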

4. I streamlined repetition a bit from:

repetition = (
    ( plus + lbrace + Word(nums).setResultsName("count") + rbrace ) |
    ( plus + lbrace + Word(nums).setResultsName("minCount") + "," +
      Word(nums).setResultsName("maxCount") + rbrace ) |
    plus 
    )

to:
repetition = plus + Optional( lbrace + 
    ( ( Word(nums).setResultsName("minCount") + "," +
        Word(nums).setResultsName("maxCount") ) |
      Word(nums).setResultsName("count") )
    + rbrace 
    )

Which could look a little nicer as:
repetition = plus + Optional( lbrace + 
    ( ( Word(nums)("minCount")+","+ Word(nums)("maxCount") ) |
      Word(nums)("count") )
    + rbrace 
    )
(runs no faster, but I find it a little easier to read).  Since repetition
is the first precedence level, it gets used a lot, so any streamlining here
helps.
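
A quick way to check the streamlined repetition is to run it against the three forms it has to accept. This is only a sketch: the definitions of plus, lbrace and rbrace are assumptions (they are not shown in the thread), and the sample inputs are made up.

```python
from pyparsing import Literal, Optional, Suppress, Word, nums

# Assumed definitions; the thread does not show how these are declared.
plus = Literal("+")
lbrace, rbrace = Suppress("{"), Suppress("}")

repetition = plus + Optional(
    lbrace
    + ((Word(nums)("minCount") + "," + Word(nums)("maxCount"))
       | Word(nums)("count"))
    + rbrace
)

bare = repetition.parseString("+")         # plain "+"
exact = repetition.parseString("+{3}")     # exact count
ranged = repetition.parseString("+{2,5}")  # min,max range

print(bare.asList())
print(exact["count"])
print(ranged["minCount"], ranged["maxCount"])
```

The min,max alternative has to be tried before the plain count. With the order reversed, "+{2,5}" would match "2" as count and then fail on the comma.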

5. space = OneOrMore(White())
Really?  I doubt you are matching any of these at the moment, since you
aren't taking any steps to disable pyparsing's default behavior of skipping
whitespace.  But as the first alternative in the list of expressions in
term, it gets tested *many* times.

6. You might be able to reorder the options in term based on the likelihood
of occurrence in the input text.  Since this is a MatchFirst, testing for
more common options ahead of rarer ones will shortcut the rest of the tests,
with a performance win.
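
The effect of ordering in point 6 can be made visible by counting failed attempts. The sketch below attaches a fail action to each alternative and parses the same text with two orderings. The token names, the input, and the parse_counting_failures helper are all hypothetical; this is not your grammar, just an illustration of the MatchFirst behavior.

```python
from pyparsing import Literal, MatchFirst, OneOrMore


def parse_counting_failures(words, text):
    """Parse text with a MatchFirst over words, in the given order, and
    count how many times an individual alternative was tried and failed."""
    failures = [0]

    def on_fail(instring, loc, expr, err):
        failures[0] += 1

    alternatives = [Literal(w).setFailAction(on_fail) for w in words]
    parser = OneOrMore(MatchFirst(alternatives))
    tokens = parser.parseString(text).asList()
    return tokens, failures[0]


# Hypothetical input where "gamma" is by far the most common token.
text = "gamma gamma alpha gamma gamma gamma"

tokens_rare_first, fails_rare_first = parse_counting_failures(
    ["alpha", "beta", "gamma"], text)
tokens_common_first, fails_common_first = parse_counting_failures(
    ["gamma", "alpha", "beta"], text)

print(fails_rare_first, fails_common_first)
```

Both orderings produce the same tokens, but the common-first ordering gets there with fewer failed attempts. Reordering is only safe when no alternative can match the start of text meant for another one, since MatchFirst stops at the first alternative that succeeds.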

You might also define:
    integer = Word(nums)
And then use integer in all your related expressions, instead of repeating
Word(nums) all the time - this will make your code a little easier to read,
and the packratting will be a little more efficient, too.

I also have some comments on operatorPrecedence itself, but I'll wait until
you have gotten term to run a bit better before delving into it.  Just one
note: instead of (in your list of precedence definitions):
    (Empty(), 2, opAssoc.LEFT, self.handleSequence),
Try:
    (None, 2, opAssoc.LEFT, self.handleSequence),
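
Here is a minimal sketch of a precedence list that uses None for an implicit sequence operator. It is written with infixNotation, the name operatorPrecedence was later given in pyparsing. The one-character term and the postfix "+" level are assumptions standing in for your real grammar, and the parse action is left out to keep the example short.

```python
from pyparsing import Regex, infixNotation, opAssoc

# Hypothetical one-character term, standing in for the real term expression.
term = Regex(r"[a-z]")

expr = infixNotation(term, [
    ("+", 1, opAssoc.LEFT),   # postfix repetition
    (None, 2, opAssoc.LEFT),  # implicit sequence: adjacent operands, no operator
])

result = expr.parseString("ab+c", parseAll=True).asList()
print(result)
```

With None there is no operator expression to evaluate between operands, so that level simply collects adjacent operands into one group.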

-- Paul