Re: [Pyparsing] Incremental parsing with no gaps between parsed ranges?

Thanks for the quick response, Paul.

With a 10 MiB input file in which no top-level element is longer than ~10
kB, parseString() takes about 5 minutes and about 5 GiB of memory before it
returns any results. I tried enablePackrat(): memory usage is somewhat
higher and speed is not appreciably better. Judging from the docstring, I
wouldn't expect packrat parsing to help much here, since every lengthy
block in the grammar I'm trying to parse is introduced by a unique keyword,
so I don't think there's much backtracking and reparsing.

So I think I'm going to have to do incremental parsing in order to get
reasonably fast feedback from the parser. Do you have any suggestions for
how to do this? I'm trying to figure out if there's a good way to do greedy
consumption of trailing (whitespace|comments) at the end of each valid
top-level element.
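
To make the question concrete, here is a minimal sketch of one possible
approach, not a tested solution for the real grammar. It relies on
scanString(), which is a generator yielding (tokens, start, end) one match
at a time. The toy `element` grammar, the `unit` wrapper, and the
`iter_elements` helper are hypothetical names made up for illustration, and
the sketch assumes comments are C-style (/* ... */).

```python
from pyparsing import (
    Word, alphas, alphanums, Suppress, Group, ZeroOrMore, cStyleComment,
)

# Hypothetical stand-in for a real top-level element:  name { item; item; }
ident = Word(alphas, alphanums + "_")
element = Group(
    ident + Suppress("{") + ZeroOrMore(ident + Suppress(";")) + Suppress("}")
)

# Comments are matched explicitly so that each element's range swallows the
# comments that follow it (and any that precede the first element).
comments = Suppress(ZeroOrMore(cStyleComment))
unit = comments + element + comments


def iter_elements(text):
    """Lazily yield (tokens, start, end); each start equals the previous end."""
    prev_end = 0
    for tokens, start, end in unit.scanString(text):
        # scanString silently skips anything it cannot match, so reject
        # skipped text that is not pure whitespace.
        if text[prev_end:start].strip():
            raise ValueError("unparsed text at offset %d" % prev_end)
        # Report the range from the previous end, so whitespace between
        # elements is attributed to the element that follows it.
        yield tokens, prev_end, end
        prev_end = end
    # Anything left after the last element must be whitespace only.
    if text[prev_end:].strip():
        raise ValueError("unparsed text at offset %d" % prev_end)


sample = "/* header */\nfoo { a; b; }  /* trailing */\n\nbar { c; }\n"
for tokens, start, end in iter_elements(sample):
    print(tokens.asList(), start, end)
```

Because iter_elements is a generator, the first element is available as
soon as it has been matched, rather than after the whole file has been
parsed. Whitespace after the final element is checked but not assigned to
any range.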

-Dan

On Mon, Oct 27, 2014 at 9:17 AM, <pt...@au...> wrote:

> Before you go too far down this path, try enabling packrat parsing, which
> should help both performance and memory footprint.
>
> Right after importing pyparsing, add this line:
>
>     ParserElement.enablePackrat()
>
>
> -- Paul
>
>
> ---- Dan Lenski <dl...@gm...> wrote:
> > I'm using PyParsing to parse some rather large text files with a C-like
> > format (braces and semicolons and all that). PyParsing works just great,
> > but it is slow and consumes a very large amount of memory due to the
> > size of my files.