I have several LaTeX tags that I'm transforming. This is working fine, but the speed is really suffering the more tags I add.
For example, I set up one production grammar for index tags and one for heading tags. Then, I open a file, join it together as a single string for the contents to parse, and do something like this:
s = indextag.transformString(contents)
s = headings.transformString(s)
and so on, once for every tag type for which I've written a production.
For a file of 5,000 lines it's taking 1.25 minutes.
Am I crazy for doing it this way? Should I chain the productions together like
generictag = indextag | heading
and call transformString on generictag?
Thanks, I hope this makes sense.
Yes, definitely combine these tags into a single expression, and call transformString only once.
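As a rough sketch of what the combined form could look like: the two productions below are hypothetical stand-ins (I'm guessing at \index{...} and \section{...} and at made-up output markup, since your actual grammar isn't posted), but the shape should carry over. Each production keeps its own parse action, and the combined expression makes a single pass over the input.

```python
from pyparsing import CharsNotIn, Literal, Suppress

# Hypothetical stand-ins for the poster's productions; the real tag
# names and replacement markup are assumptions made for illustration.
LBRACE, RBRACE = Suppress("{"), Suppress("}")
body = CharsNotIn("{}")

indextag = Suppress(Literal("\\index")) + LBRACE + body + RBRACE
indextag.setParseAction(lambda t: "<index>%s</index>" % t[0])

heading = Suppress(Literal("\\section")) + LBRACE + body + RBRACE
heading.setParseAction(lambda t: "<h1>%s</h1>" % t[0])

# One combined expression, one scan of the input.
generictag = indextag | heading

contents = "\\section{Intro} some text \\index{parsing} more text"
s = generictag.transformString(contents)

# The original approach, kept only for comparison: one full scan per tag type.
sequential = heading.transformString(indextag.transformString(contents))

print(s)
```

On this sample both approaches produce the same string. The difference is that the chained version rescans the whole text once per production, so its cost grows with every tag type you add.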
Next, look for ways to optimize the alternative matches. Do they all start with a backslash? If so, you can short-circuit the parser's checking of the alternatives using FollowedBy:
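from pyparsing import Literal, FollowedBy, oneOf
BSLASH = Literal("\\")
tagA = BSLASH + "a"
tagB = BSLASH + "b"
# naive: every alternative is tried at every character position
anyTag = tagA | tagB
# better: one cheap lookahead rejects positions that do not start with a backslash
betterAnyTag = FollowedBy("\\") + ( tagA | tagB )
# better still: match the backslash once, then choose among the tag names
evenBetterAnyTag = BSLASH + oneOf("a b")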
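If you want to convince yourself that the three forms are interchangeable before swapping one in, a quick check along these lines should do. The sample string is invented for the test, and this only checks that the matches agree. It does not measure speed, so time it against your real files.

```python
from pyparsing import FollowedBy, Literal, oneOf

BSLASH = Literal("\\")
tagA = BSLASH + "a"
tagB = BSLASH + "b"

anyTag = tagA | tagB
betterAnyTag = FollowedBy(BSLASH) + (tagA | tagB)
evenBetterAnyTag = BSLASH + oneOf("a b")

# Made-up sample: two tags that should match and one (\c) that should not.
sample = "x \\a y \\b z \\c"

def matches(expr):
    # Collect the tokens of every match scanString finds in the sample.
    return [tokens.asList() for tokens, start, end in expr.scanString(sample)]

results = [matches(e) for e in (anyTag, betterAnyTag, evenBetterAnyTag)]
print(results[0])
```

FollowedBy is a lookahead only, so it contributes no tokens of its own, which is why all three lists come out identical.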