Re: [Pyparsing] Incremental parsing with no gaps between parsed ranges?
From: Daniel L. <dl...@gm...> - 2014-10-27 16:33:01
Thanks for the quick response, Paul.

With a 10 MiB input file of which no top-level element is longer than ~10 kB, it takes about 5 GiB of memory and 5 minutes before parseString() starts returning results.

I tried enablePackrat() and memory usage is somewhat higher but speed is not appreciably improved. Based on the docstring, I wouldn't expect enablePackrat() to make a big improvement, since every lengthy block in the grammar I'm trying to parse is introduced with a unique keyword, so I don't think there's much backtracking-and-reparsing.

So I think I'm going to have to do incremental parsing in order to get reasonably fast feedback from the parser. Do you have any suggestions for how to do this? I'm trying to figure out if there's a good way to do greedy consumption of trailing (whitespace|comments) at the end of each valid top-level element.

-Dan

On Mon, Oct 27, 2014 at 9:17 AM, <pt...@au...> wrote:
> Before you go too far down this path, try enabling packrat parsing, which
> should help both performance and memory footprint.
>
> Right after importing pyparsing, add this line:
>
> ParserElement.enablePackrat()
>
> -- Paul
>
> ---- Dan Lenski <dl...@gm...> wrote:
> > I'm using PyParsing to parse some rather large text files with a C-like
> > format (braces and semicolons and all that). PyParsing works just great,
> > but it is slow and consumes a very large amount of memory due to the
> > size of my files.
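
To make the "no gaps" idea concrete, here is a minimal sketch that uses only the standard library. It is not a pyparsing feature and not something proposed in this thread. It pre-splits the input into top-level chunks, greedily attaching trailing whitespace and comments to each chunk, so that the chunks tile the whole file. The comment syntax (`//` and `/* */`), double-quoted strings, and the rule that a top-level element ends at a `;` outside braces or at the `}` that closes its outermost brace are all assumptions about the C-like format; `iter_top_level` and `parse_incrementally` are hypothetical names.

```python
import re

# Assumed trivia: whitespace, // line comments, /* block */ comments.
TRIVIA = re.compile(r'(?:\s+|//[^\n]*|/\*.*?\*/)*', re.DOTALL)

# Tokens relevant to finding the end of an element. Comments and string
# literals are matched only so that braces/semicolons inside them are skipped.
TOKEN = re.compile(r'//[^\n]*|/\*.*?\*/|"(?:\\.|[^"\\])*"|[{};]', re.DOTALL)


def iter_top_level(text):
    """Yield (start, end) ranges of top-level elements.

    Each range is extended over any trailing trivia, so consecutive ranges
    touch: the end of one is the start of the next.
    """
    start = 0
    n = len(text)
    while start < n:
        # Leading trivia is only non-empty for the first chunk; it is
        # folded into that chunk so the ranges begin at offset 0.
        pos = TRIVIA.match(text, start).end()
        if pos == n:
            break  # nothing but trivia in the input
        depth = 0
        end = n  # an unterminated final element runs to end of input
        for m in TOKEN.finditer(text, pos):
            tok = m.group()
            if tok == '{':
                depth += 1
            elif tok == '}':
                depth -= 1
                if depth <= 0:
                    end = m.end()
                    break
            elif tok == ';' and depth == 0:
                end = m.end()
                break
        # Greedy consumption of trailing (whitespace|comments).
        end = TRIVIA.match(text, end).end()
        yield start, end
        start = end


def parse_incrementally(text, parse_chunk):
    """Lazily yield (start, end, result) for each top-level chunk.

    parse_chunk is any callable that parses one chunk of text.
    """
    for start, end in iter_top_level(text):
        yield start, end, parse_chunk(text[start:end])
```

With pyparsing, `parse_chunk` could be something like `lambda s: element.parseString(s, parseAll=True)`, where `element` stands for your own top-level expression with comments ignored. Each chunk is at most one element long, so results come back as soon as the first element has been parsed instead of after the whole file.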