Thanks to everyone who has sent in e-mails and suggestions for enhancements
to pyparsing. Most notable is Chris Lesniewski-Laas' submission of a
"packrat" or memoizing performance enhancement. I've shipped this version
with packrat mode disabled by default, since it may have adverse effects on
some parsers that include parse actions. It is easily enabled by calling
ParserElement.enablePackrat() after importing pyparsing.
Other salient changes:
- new helper countedArray(expr)
- support added to the QuotedString class for asymmetric quoted strings
- ParserElements now support attachment of multiple parse actions
- some new examples:
. search query string parser
. Python's Grammar file EBNF parser
. sample of a configurable parser addressing various data formats
. example of a Roman numeral parser (and generator)
- added new docs directory, containing the text and examples from my
presentation at PyCon
Download the latest version of pyparsing at
Pyparsing is a pure-Python class library for quickly developing
recursive-descent parsers. Parser grammars are assembled directly in
the calling Python code, using classes such as Literal, Word,
OneOrMore, Optional, etc., combined with operators '+', '|', and '^'
for And, MatchFirst, and Or. No separate code-generation or external
files are required. Pyparsing can be used in many cases in place of
regular expressions, with shorter learning curve and greater
readability and maintainability. Pyparsing comes with a number of
parsing examples, including:
- "Hello, World!" (English and Korean)
- chemical formulas
- configuration file parser
- web page URL extractor
- 5-function arithmetic expression parser
- subset of CORBA IDL
- chess portable game notation
- simple SQL parser
- Mozilla calendar file parser
- EBNF parser/compiler
Here is the full change note for this release:
Version 1.4.2 - April, 2006
- Significant speedup from memoizing nested expressions (a technique
known as "packrat parsing"), thanks to Chris Lesniewski-Laas! Your
mileage may vary, but my Verilog parser almost doubled in speed to
over 600 lines/sec!
This speedup may break existing programs that use parse actions that
have side-effects. For this reason, packrat parsing is disabled when
you first import pyparsing. To activate the packrat feature, your
program must call the class method ParserElement.enablePackrat(). If
your program uses psyco to "compile as you go", you must call
enablePackrat before calling psyco.full(). If you do not do this,
Python will crash. For best results, call enablePackrat() immediately
after importing pyparsing.
- Added new helper method countedArray(expr), for defining patterns that
start with a leading integer to indicate the number of array elements,
followed by that many elements, matching the given expr parse
expression. For instance, this two-liner:
wordArray = countedArray(Word(alphas))
print wordArray.parseString("3 Practicality beats purity")
returns the parsed array of words:
['Practicality', 'beats', 'purity']
The leading token '3' is suppressed, although it is easily obtained
from the length of the returned array.
(Inspired by e-mail discussion with Ralf Vosseler.)
- Added support for attaching multiple parse actions to a single
ParserElement. (Suggested by Dan "Dang" Griffith - nice idea, Dan!)
- Added support for asymmetric quoting characters in the recently-added
QuotedString class. Now you can define your own quoted string syntax
like "<<This is a string in double angle brackets.>>". To define
this custom form of QuotedString, your code would define:
dblAngleQuotedString = QuotedString('<<',endQuoteChar='>>')
QuotedString also supports escaped quotes, escape character other
than '\', and multiline. (Submitted by Larry Maccherone - thanks!)
- Changed the default value returned internally by Optional, so that
None can be used as a default value. (Suggested by Steven Bethard -
I finally saw the light!)
- Added dump() method to ParseResults, to make it easier to list out
and diagnose values returned from calling parseString.
- A new example, a search query string parser, submitted by Steven
Mooij and Rudolph Froger - a very interesting application, thanks!
- Added an example that parses the BNF in Python's Grammar file, in
support of generating Python grammar documentation. (Suggested by
J H Stovall.)
- A new example, submitted by Tim Cera, of a flexible parser module,
using a simple config variable to adjust parsing for input formats
that have slight variations - thanks, Tim!
- Added an example for parsing Roman numerals, showing the capability
of parse actions to "compile" Roman numerals into their integer
values during parsing.
- Added a new docs directory, for additional documentation or help.
Currently, this includes the text and examples from my recent
presentation at PyCon.
- Fixed another typo in CaselessKeyword, thanks Stefan Behnel.
- Expanded oneOf to also accept tuples, not just lists. This really
should be sufficient...
- Added deprecation warnings when tuple is returned from a parse action.
Looking back, I see that I originally deprecated this feature in March,
2004, so I'm guessing people really shouldn't have been using this
feature - I'll drop it altogether in the next release, which will
allow users to return a tuple from a parse action (which is really
handy when trying to reconstuct tuples from a tuple string
Log in to post a comment.