Re: [Pyparsing] parsing an operator-precedence-based grammar with reasonable performance?

Paul McGuire wrote:
> As Gre7g L already posted, enabling packrat parsing can be a big boost
> (~1000X) to performance when using operatorPrecedence.
> 
> I just uploaded an example of parsing RE's
> (http://pyparsing.wikispaces.com/space/showimage/invRegex.py) that I thought
> was already on the Pyparsing wiki, so I'm sorry for not posting this sooner.
> This particular example is an RE inverter, returning a generator of all
> strings that would match the given RE (note: does not allow arbitrary
> repetition operators such as '*' or '+', otherwise it would just blow up),
> but it also uses opPrec.  Parsing "(foo(bar))" takes about 1/4 of a second -
> still a long time, but I hope not ridiculously so.  Maybe this example might
> shed some light on some alternative approaches to this problem.
> 
That is what I used as a base for my parser (it needed a lot of 
modification, since this matching language is more like a hybrid of globs 
and regexps). It was already on the wiki.

Using a newer version of Python fixes the problem. I had actually tried 
enabling packrat parsing before, and it didn't seem to do anything. There 
seems to be some kind of bug that causes packrat parsing to break on 
Python 2.3 (parsing actually takes longer with it enabled on 2.3). 2.4 and 
later don't seem to be affected.
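
For anyone wondering why packrat parsing matters so much for precedence
grammars, here is a minimal stdlib-only sketch. It is a toy model, not
pyparsing's code: the ToyParser class, the LEVELS operator table and the
single-character atoms are all hypothetical. It assumes each precedence
level first tries "operand OP rest" and, if that fails, backtracks and
parses the operand again, so the work multiplies with every level and
every nested parenthesis unless results are cached by (level, position).
In pyparsing itself the cache is switched on with
ParserElement.enablePackrat(), usually right after the import.

```python
# Toy illustration only: NOT pyparsing's implementation.
# All names here (LEVELS, ToyParser) are hypothetical.

LEVELS = ["|", "&", "+", "*"]  # assumed operators, lowest to highest precedence


class ToyParser:
    def __init__(self, text, packrat):
        self.text = text
        self.packrat = packrat  # True: memoize results per (level, position)
        self.cache = {}
        self.calls = 0          # number of parse attempts actually carried out

    def parse_level(self, level, pos):
        """Return the end position of a match starting at pos, or None."""
        key = (level, pos)
        if self.packrat and key in self.cache:
            return self.cache[key]  # cache hit: no re-parsing
        self.calls += 1
        result = self._parse_level(level, pos)
        if self.packrat:
            self.cache[key] = result
        return result

    def _parse_level(self, level, pos):
        if level == len(LEVELS):
            return self._parse_atom(pos)
        # First alternative: operand OP rest-of-this-level
        end = self.parse_level(level + 1, pos)
        if end is not None and self.text[end:end + 1] == LEVELS[level]:
            rest = self.parse_level(level, end + 1)
            if rest is not None:
                return rest
        # Second alternative: backtrack and parse the bare operand again
        return self.parse_level(level + 1, pos)

    def _parse_atom(self, pos):
        ch = self.text[pos:pos + 1]
        if ch.isalnum():
            return pos + 1
        if ch == "(":
            end = self.parse_level(0, pos + 1)
            if end is not None and self.text[end:end + 1] == ")":
                return end + 1
        return None


if __name__ == "__main__":
    sample = "(((a+b)*c))"
    for packrat in (False, True):
        parser = ToyParser(sample, packrat)
        matched = parser.parse_level(0, 0) == len(sample)
        print("packrat=%s matched=%s attempts=%d"
              % (packrat, matched, parser.calls))
```

Running it prints the attempt count with and without the cache. The gap
widens quickly as nesting depth or the number of precedence levels grows.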