Version 1.5.6 - June, 2011
- Cleanup of parse action normalizing code, to be more version-tolerant,
and robust in the face of future Python versions - much thanks to
Raymond Hettinger for this rewrite!
- Removal of exception caching, addressing a memory leak condition
in Python 3. Thanks to Michael Droettboom and the Cape Town PUG for
their analysis and work on this problem!
Another pyparsing release, this time with many enhancements to parse
actions. The major items are:
- simplified parse action interface; parse actions no longer must
take all three arguments consisting of the original parsed string,
the parsed location, and the parsed tokens; parse actions can now
be defined with simplified argument interfaces:
. no arguments
. just the parsed tokens
. just the parse location and parsed tokens
- REMOVED SUPPORT FOR PARSE ACTIONS THAT RETURN LOCATION AND TOKENS;
or looking at this another way, added support for parse actions to
return tuples; parse actions that previously returned loc,tokens
will now be interpreted to return the tuple (loc, tokens); this
impending change was announced over 2 years ago, with explicit
deprecation warnings in the previous release
- new troubleshooting helper decorator, traceParseAction
- new parse action helper class OnlyOnce, for parse actions that
should only be called one time; subsequent invocations of an
OnlyOnce-wrapped parse action will raise a ParseException
- new setFailAction, to attach a method to an expression to be called
when the expression is tried and fails (sort of an anti-parse action)
- fixed the attachment of multiple parse actions, by breaking out the
attempt at mind-reading in setParseAction; setParseAction now
reverts to its previous behavior, and addParseAction appends
new functions to the expression's list of parse actions
- some new examples:
. list string parser (reconstitutes a Python list from a string
representation), including lists that contain elements that are
lists, tuples, ints, reals, or quoted strings
. line number demonstration, using the pyparsing line, lineno and col functions
. listAllMatches example
. line break remover, for removing hard line breaks in word-wrapped
paragraphs with blank lines between paragraphs
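
The simplified parse action signatures and the setParseAction/addParseAction split described above can be sketched as follows. This is a minimal illustration, not code from the release: the function names and the `calls` list are hypothetical, and it assumes a pyparsing version that still accepts the camelCase method names (3.x releases also offer snake_case synonyms such as set_parse_action).

```python
from pyparsing import Word, nums

calls = []  # hypothetical log of which actions ran, for illustration only

def to_int(tokens):
    # "just the parsed tokens": convert the matched digits to an int
    return int(tokens[0])

def note_location(loc, tokens):
    # "just the parse location and parsed tokens"
    calls.append(("loc", loc, tokens[0]))

def note_match():
    # "no arguments"
    calls.append("matched")

integer = Word(nums)
integer.setParseAction(to_int)         # replaces any existing parse actions
integer.addParseAction(note_location)  # appended, runs after to_int
integer.addParseAction(note_match)     # appended, runs last

result = integer.parseString("42")
print(result[0], calls)
```

Because to_int returns a value, that value replaces the matched tokens, so the later actions see the integer rather than the original string. Actions that return nothing leave the tokens unchanged.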
The project home page for pyparsing has been converted to a Wiki, hosted at http://pyparsing.wikispaces.com/
Please come and visit the new pyparsing project site, and add your contributions to the public Tips and Documentation pages!
Thanks to everyone who has sent in e-mails and suggestions for enhancements
to pyparsing. Most notable is Chris Lesniewski-Laas' submission of a
"packrat" or memoizing performance enhancement. I've shipped this version
with packrat mode disabled by default, since it may have adverse effects on
some parsers that include parse actions. It is easily enabled by calling
ParserElement.enablePackrat() after importing pyparsing.
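
As a rough sketch of what enabling packrat mode looks like in practice, the snippet below turns it on right after the import and then parses with a small recursive grammar. The grammar itself is a made-up example, not one shipped with pyparsing. Parsers whose parse actions have side effects are the case where memoizing may cause surprises, so this example uses none.

```python
from pyparsing import (
    Forward, Group, ParserElement, Suppress, Word, ZeroOrMore, nums, oneOf
)

# Enable memoizing ("packrat") parsing once, right after importing pyparsing.
ParserElement.enablePackrat()

# Hypothetical grammar: integers combined with + and -, with nested parentheses.
expr = Forward()
atom = Word(nums) | Group(Suppress("(") + expr + Suppress(")"))
expr <<= atom + ZeroOrMore(oneOf("+ -") + atom)

parsed = expr.parseString("1 + (2 - 3)").asList()
print(parsed)
```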
I've posted the S5 HTML files and supporting source code - you can download it at
I know it's only been about 3 weeks since 1.4 was released, but I introduced
some minor but annoying "enhancements" - most significantly, I used a Python
2.4-only generator expression, which broke Python 2.3 compatibility.
This minor release also gives me a chance to quickly turn around some
suggestions/requests from early downloaders of 1.4b1 and 1.4.
Thanks again to everyone for their suggestions and feedback.
A detailed introduction to Pyparsing has been published at O'Reilly's ONLamp Python Developers web page. (http://www.onlamp.com/pub/a/python/2006/01/26/pyparsing.html)
The article includes several working examples, along with step-by-step descriptions of the programs.
Thanks to all those who ran the beta version of 1.4 through its paces! During the beta we:
- caught the "warning" vs. "warnings" typo on the improved error-checking code
- beefed up the exception trapping when using regexp's
- added the new QuotedString class, for defining custom quoted string definitions
Thanks to all who took time to download and test the latest version of pyparsing!
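
For readers who have not used QuotedString, here is a small sketch of the kind of custom quoting it is meant for. The two quoting styles shown (SQL-style doubled quotes and a `<< >>` delimiter pair) are illustrative choices, and the keyword names assume a pyparsing version that accepts the camelCase arguments.

```python
from pyparsing import QuotedString

# SQL-style string: single quotes, with '' as the escaped quote.
sql_string = QuotedString("'", escQuote="''")

# Different opening and closing delimiters.
bracketed = QuotedString("<<", endQuoteChar=">>")

sql_value = sql_string.parseString("'it''s here'")[0]
bracketed_value = bracketed.parseString("<<hello world>>")[0]
print(sql_value, bracketed_value)
```

By default the surrounding quote characters are stripped from the returned token.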
I've uploaded version 1.4 beta 1 of pyparsing to SourceForge. It represents a significant reimplementation of some of the core parsing classes. The major change is the conversion to use internally generated regular expressions, plus the addition of the Regex class for user-defined re's. My performance tests show a 30-40% improvement in parsing speed.
For those looking for additional performance enhancements, the oneOf helper method has been especially enhanced to generate a regular expression, instead of the previous list of Literals within a MatchFirst class. oneOf is especially useful for taking advantage of performance gains, since it is compatible with version 1.3.x programs. The Word class has also undergone similar enhancement, as have several built-ins for comments and quoted strings.
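
A short sketch of the two features mentioned above, Regex for a user-defined pattern and oneOf for a set of alternatives. The comparison grammar is an invented example. No timing figures are shown, since any speedup depends on the grammar and the pyparsing version in use.

```python
from pyparsing import Regex, Word, alphas, oneOf

# User-defined regular expression for a simple real number.
real = Regex(r"[+-]?\d+\.\d*")

# oneOf takes a space-separated string of alternatives; it orders them so
# that "<=" is not masked by the shorter "<".
comparison_op = oneOf("< <= > >= = !=")

comparison = Word(alphas) + comparison_op + real
tokens = comparison.parseString("x <= 3.14").asList()
print(tokens)
```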
I submitted two abstracts to PyCon (since it's practically in my back yard this year), hoping to get one accepted - and the committee accepted them both! Here are the submission abstracts:
1. Introduction to Pyparsing: An Object-oriented Easy-to-Use Toolkit for Building Recursive Descent Parsers
Intended audience: beginning/intermediate Python programmers
pyparsing is a pure-Python module, containing a class library for easily
creating recursive-descent parsers. pyparsing's syntax provides tools for
both simple tokenization and data structuring and interpretation. I will
give an overview of the basic features of pyparsing, and a *very quick*
overview of the advanced features. I will close with 3 or 4 application
examples, time-permitting. (For more detail on the type of information I
have to present on pyparsing, you can visit my SourceForge project web page at http://pyparsing.sourceforge.net.)
pyparsing is a 100% pure Python parsing package for creating readable parse
engines, using a library of Python classes with easy-to-understand class
names, such as
Literal, Word, Group, OneOrMore, Optional, and so on, combined with
operators such as '+' (for 'and' or 'sequence'), '^' (for 'or' or
'alternation'), '|' (for 'greedy or', selecting first matching alternate),
etc. No separate grammar source file is required, and there is no
code-generation step.
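
The operators described above can be tried out with a few lines of Python. This is a minimal sketch with invented grammar names. It shows '+' for sequence and the difference between '|' (first matching alternative) and '^' (longest matching alternative).

```python
from pyparsing import Literal, Optional, Word, alphas

# '+' builds a sequence: word, optional comma, word, exclamation mark.
greeting = Word(alphas) + Optional(Literal(",")) + Word(alphas) + Literal("!")
greeting_tokens = greeting.parseString("Hello, World!").asList()

# '|' stops at the first alternative that matches; '^' tries all
# alternatives and keeps the longest match.
first_match = (Literal("in") | Literal("int")).parseString("int").asList()
longest_match = (Literal("in") ^ Literal("int")).parseString("int").asList()

print(greeting_tokens, first_match, longest_match)
```

The grammar lives in ordinary Python code, so there is no separate grammar file to maintain.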
pyparsing, a Python module for parsing text using a context-free grammar, has been updated with the release of version 1.2 in June, 2004, and version 1.2.1 this past week.
pyparsing's approach to defining grammars differs from the conventional lex/yacc approach. No external file using BNF or regex-style syntax is required. pyparsing's parse grammars are defined right in the Python parse code itself, using a library of parsing construction classes to compose the grammar. pyparsing includes such classes as:
- Literal and CaselessLiteral
- Dict
pyparsing, a Python module for parsing text using a context-free grammar, has just been updated with the release of version 1.1.2. With this release, pyparsing continues to stabilize and converge on a production-worthy package.
Version 1.1.2 releases no new API changes, and only one minor bug fix - the starting location reported by scanString now correctly reports the start of the found tokens, not the start of any leading whitespace.
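
The scanString behavior described above can be checked with a small sketch. The sample text is invented. The point is that each reported start offset lands on the first character of the match rather than on the whitespace before it.

```python
from pyparsing import Word, nums

text = "abc   123 def 45"

# scanString yields (tokens, start, end) for each match found in the text.
matches = [
    (tokens[0], start, end)
    for tokens, start, end in Word(nums).scanString(text)
]

for token, start, end in matches:
    # Slicing the input with the reported offsets gives back the token itself.
    print(token, start, end, repr(text[start:end]))
```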
(This release corrects the bug introduced in version 1.1, and adds compatibility for Python 2.2.)
New features in 1.1:
- simplified parse actions - parse action functions now only need to return
modified parse tokens, not a tuple of location and tokens - the example code
has been updated to reflect the new style (Old style is deprecated, but
still supported for backward compatibility.)
- added validate() method to parse elements, to help identify improperly
recursive grammar definitions
- better str() output for parse elements, more similar to traditional BNF
notation
As part of the fixing of a minor bug, and adding a performance enhancement, I introduced a serious bug in version 1.1. I have withdrawn this release, and will get a corrected 1.1.1 out shortly.
Stable production release of the pyparsing Python module:
- performance increased 30-40%
- added positional tokens StringStart, StringEnd, LineStart, and LineEnd
- added convenience built-in for commaSeparatedList (more robust than simply using string.split(","))
- fixed setup.py typo
- added examples for HTTP server log parsing, and comma separated list
- minor API change: delimitedList does not enclose returned tokens in a Group, this is now the responsibility of the caller; delimitedList with 'combine=True' includes delimiters in returned string, good for scoped variables (a.b.c or a::b::c) and directory paths (a/b/c).
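
The delimitedList change in the last item can be sketched as below. The identifier grammar is an invented example, and the snippet assumes a pyparsing version that still provides the delimitedList helper under its camelCase name.

```python
from pyparsing import Group, Word, alphas, delimitedList

ident = Word(alphas)

# Tokens come back flat; delimiters are dropped.
flat = delimitedList(ident).parseString("a, b, c").asList()

# Grouping is now up to the caller.
grouped = Group(delimitedList(ident)).parseString("a, b, c").asList()

# combine=True keeps the delimiters and returns a single string.
scoped = delimitedList(ident, delim="::", combine=True).parseString("a::b::c").asList()
path = delimitedList(ident, delim="/", combine=True).parseString("a/b/c").asList()

print(flat, grouped, scoped, path)
```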
Minor changes, additional performance speed-ups.
Also includes more Python standard packaging, using distutils - thanks Dave Kuhlman!
One more change to the API, changed the module from just plain "parsing" to "pyparsing", to reflect its Python linkage. Also contains an additional example program, demonstrating how to use the Dict class.
This minor update corrects a faux pas in the parsing API, and adds a performance speedup of 20-30% when parsing complex input strings.