Re: [Pyparsing] Speeding up a parse
From: <dav...@l-...> - 2008-09-24 23:28:14
One more thing: I also tried enablePackrat, but it had no discernible effect on the parse speed (though it did suck up a huge chunk of memory :) ).

> _____________________________________________
> From: Weber, David C @ Link
> Sent: Wednesday, September 24, 2008 5:55 PM
> To: pyp...@li...
> Subject: Speeding up a parse
>
> All,
>
> We've got a data file that we use for parsing "stuff". Presently,
> this file is 80K lines long, and it takes about 3.3 minutes to
> parse, which is an awfully long time to wait for something like
> this. There are 122 rules for parsing this file, and unfortunately
> the syntax of the data within is not very strict. This leads to
> constructs such as:
>
>     Interaction = \
>         Keyword("(Interaction") + \
>         INT_ID + \
>         INT_Name + \
>         INT_ISRType + \
>         OneOrMore(
>             INT_MOMInteraction |
>             INT_Description |
>             INT_DeliveryCategory |
>             INT_MessageOrdering |
>             INT_RoutingSpace
>         ) + \
>         ZeroOrMore(InteractionComponent) + \
>         ")";
>
> where the intent of the OneOrMore section is:
> 1.) All are optional
> 2.) They may appear in any order
>
> I've also tried Each([Optional(...), Optional(...)]), without much
> success in speeding things up.
>
> I'm pretty sure that these constructs are causing a significant
> amount of backtracking, but I'm not sure of the best way to go about
> cleaning up the grammar.
>
> Also, I tried using psyco to speed up the parse, but I'm making use
> of the "keepOriginalText" option within the setParseAction() call,
> so that I can get a copy of the original text within my parse
> action. This seems to break psyco, based on one of the imports that
> is done.
>
> So two things:
>
> 1.) Any grammar speed-up rules for the above?
> 2.) Any ideas for getting the original text while also making use of
>     psyco?
>
> Thanks
>
> --dw
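
For reference, the gap between the OneOrMore form quoted above and its stated intent (all optional, any order) can be seen in a small sketch. The three clause rules here are hypothetical stand-ins, since the real INT_* definitions are not shown in this thread. The sketch only illustrates what each form accepts and makes no claim about speed.

```python
from pyparsing import Each, Keyword, OneOrMore, Optional, ParseException, Word, alphas

# Hypothetical stand-ins for the INT_* sub-rules (the real ones are not shown).
desc = Keyword("desc") + Word(alphas)
order = Keyword("order") + Word(alphas)
space = Keyword("space") + Word(alphas)

# Form from the quoted grammar: at least one clause is required, repeats are allowed.
any_repeat = OneOrMore(desc | order | space)

# Each of Optionals: every clause is optional, any order, at most once each.
any_once = Each([Optional(desc), Optional(order), Optional(space)])


def accepts(expr, text):
    """Return True if expr matches the whole of text."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False
```

With these definitions, `accepts(any_repeat, "desc a desc b")` is true while `accepts(any_repeat, "")` is false, so the OneOrMore form does not make the clauses optional as a group. The Each form accepts both `"order b desc a"` and an empty clause list, and rejects a repeated clause.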