Python parsing module / News: Recent posts

pyparsing 2.0.2 released!

(Oops, looks like I omitted the announcement of 2.0.1 - I'll include the notes from both in this announcement.)

pyparsing 2.0.2 just got pushed to SF and PyPI for immediate release. 2.0.1 was released last July, adding much improved compatibility for Python versions 2.6 thru 3.any. Here are the notes for both releases:

Version 2.0.2 - April, 2014
---------------------------
- Extended "expr(name)" shortcut (same as "expr.setResultsName(name)")
  to accept "expr()" as a shortcut for "expr.copy()".

- Added "locatedExpr(expr)" helper, to decorate any returned tokens
  with their location within the input string. Adds the results names
  locn_start and locn_end to the output parse results.

- Added "pprint()" method to ParseResults, to simplify troubleshooting
  and prettified output. Now instead of importing the pprint module
  and then writing "pprint.pprint(result)", you can just write
  "result.pprint()".  This method also accepts additional positional and
  keyword arguments (such as indent, width, etc.), which get passed
  through directly to the pprint method
  (see http://docs.python.org/2/library/pprint.html#pprint.pprint).

- Removed deprecation warnings when using '<<' for Forward expression
  assignment. '<<=' is still preferred, but '<<' will be retained
  for cases where the '<<=' operator is not suitable (such as in defining
  lambda expressions).

- Expanded argument compatibility for classes and functions that
  take list arguments, to now accept generators as well.

- Extended list-like behavior of ParseResults, adding support for
  append and extend. NOTE: if you have existing applications using
  these names as results names, you will have to access them using
  dict-style syntax: res["append"] and res["extend"]

- ParseResults emulates the change in list vs. iterator semantics for
  methods like keys(), values(), and items(). Under Python 2.x, these
  methods will return lists; under Python 3.x, they will return
  iterators.

- ParseResults now has a method haskeys() which returns True or False
  depending on whether any results names have been defined. This simplifies
  testing for the existence of results names under Python 3.x, which 
  returns keys() as an iterator, not a list.

- ParseResults now supports both list and dict semantics for pop().
  If passed no argument or an integer argument, it will use list semantics
  and pop tokens from the list of parsed tokens. If passed a non-integer
  argument (most likely a string), it will use dict semantics and 
  pop the corresponding value from any defined results names. A
  second default return value argument is supported, just as in 
  dict.pop().

- Fixed bug in markInputline, thanks for reporting this, Matt Grant!

- Cleaned up my unit test environment, now runs with Python 2.6 and 
  3.3.
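
Several of the 2.0.2 additions listed above can be tried in one short session. This is only a sketch: the grammar, the variable names, and the sample input are made up for illustration, and only calls named in the notes above are used.

```python
from pyparsing import OneOrMore, Word, alphas, locatedExpr, nums

word = Word(alphas)
integer = Word(nums)

# expr(name) is shorthand for expr.setResultsName(name);
# expr() with no argument is shorthand for expr.copy().
named_word = word("name")
word_copy = word()

record = named_word + OneOrMore(integer)
result = record.parseString("alice 10 20")

# pprint() forwards extra arguments such as width to pprint.pprint.
result.pprint(width=40)

# haskeys() reports whether any results names are defined.
has_names = result.haskeys()
no_names = OneOrMore(integer).parseString("1 2").haskeys()

# List-like additions: append and extend.
result.append("30")
result.extend(["40", "50"])

# pop() with no argument (or an integer) uses list semantics;
# pop() with a string uses dict semantics and accepts a default.
last_token = result.pop()
name = result.pop("name")
fallback = result.pop("missing", "n/a")

# locatedExpr adds locn_start and locn_end around the matched tokens.
located = locatedExpr(word).parseString("hello world")[0]
print(located["locn_start"], located["value"], located["locn_end"])
```

Under Python 3, haskeys() avoids having to write something like "list(result.keys())" just to test whether any names exist.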


Version 2.0.1 - July, 2013
--------------------------
- Removed use of "nonlocal" that prevented using this version of 
  pyparsing with Python 2.6 and 2.7. This will make it easier to 
  install for packages that depend on pyparsing, under Python 
  versions 2.6 and later. Those using older versions of Python
  will have to manually install pyparsing 1.5.7.

- Fixed implementation of <<= operator to return self; reported by
  Luc J. Bourhis, with patch fix by Mathias Mamsch - thanks, Luc
  and Mathias!


Version 2.0.0 - November, 2012
------------------------------
- Rather than release another combined Python 2.x/3.x release
  I've decided to start a new major version that is only 
  compatible with Python 3.x (and consequently Python 2.7 as
  well due to backporting of key features). This version will
  be the main development path from now on, with little follow-on
  development on the 1.5.x path.

- Operator '<<' is now deprecated, in favor of operator '<<=' for
  attaching parsing expressions to Forward() expressions. This is
  being done to address precedence of operations problems with '<<'.
  Operator '<<' will be removed in a future version of pyparsing.
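
The precedence problem comes from Python itself: '<<' binds more tightly than '|', so "fwd << a | b" attaches only "a" to the Forward. A minimal sketch of the '<<=' form, using a made-up nested-list grammar (the names are illustrative only):

```python
from pyparsing import Forward, Group, Suppress, Word, ZeroOrMore, nums

# A nested list of integers such as "(1 (2 3) 4)".
value = Forward()
nested = Group(Suppress("(") + ZeroOrMore(value) + Suppress(")"))

# With '<<=' the whole right-hand side is evaluated first and then
# attached to the Forward. Written as "value << Word(nums) | nested",
# Python would evaluate "(value << Word(nums)) | nested" instead.
value <<= Word(nums) | nested

result = value.parseString("(1 (2 3) 4)")
print(result.asList())
```
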
Posted by Paul McGuire 2014-04-13

pyparsing 1.5.6 released!

Version 1.5.6 - June, 2011
----------------------------
- Cleanup of parse action normalizing code, to be more version-tolerant,
  and robust in the face of future Python versions - much thanks to
  Raymond Hettinger for this rewrite!

- Removal of exception caching, addressing a memory leak condition
  in Python 3. Thanks to Michael Droettboom and the Cape Town PUG for
  their analysis and work on this problem!

Posted by Paul McGuire 2011-06-29

Pyparsing 1.4.3 Released!

Another pyparsing release, this time with many enhancements to parse
actions. The major items are:
- simplified parse action interface; parse actions no longer must
  take all three arguments consisting of the original parsed string,
  the parsed location, and the parsed tokens; parse actions can now
  be defined with simplified argument interfaces:
  . no arguments
  . just the parsed tokens
  . just the parse location and parsed tokens
- REMOVED SUPPORT FOR PARSE ACTIONS THAT RETURN LOCATION AND TOKENS;
  or looking at this another way, added support for parse actions to
  return tuples; parse actions that previously returned loc,tokens
  will now be interpreted to return the tuple (loc, tokens); this
  impending change was announced over 2 years ago, with explicit
  deprecation warnings in the previous release
- new troubleshooting helper decorator, traceParseAction
- new parse action helper class OnlyOnce, for parse actions that
  should only be called one time; subsequent invocations of an
  OnlyOnce-wrapped parse action will raise a ParseException
- new setFailAction, to attach a method to an expression to be called
  when the expression is tried and fails (sort of an anti-parse
  action)
- fixed the attachment of multiple parse actions, by breaking out the
  attempt at mind-reading in setParseAction; setParseAction now
  reverts to its previous behavior, and addParseAction appends
  new functions to the expression's list of parse actions
- some new examples:
  . list string parser (reconstitutes a Python list from a string
    representation), including lists that contain elements that are
    lists, tuples, ints, reals, or quoted strings
  . line number demonstration, using the pyparsing line, lineno and
    col built-ins
  . listAllMatches example
  . line break remover, for removing hard line breaks in word-wrapped
    paragraphs with blank lines between paragraphs
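
The simplified parse action interfaces and the setParseAction/addParseAction split can be sketched as follows. The three functions are hypothetical examples, one for each of the argument forms listed above.

```python
from pyparsing import Word, nums

calls = []

def note_match():
    # Form 1: no arguments.
    calls.append("matched")

def to_int(tokens):
    # Form 2: just the parsed tokens.
    return int(tokens[0])

def double_it(loc, tokens):
    # Form 3: the parse location and the parsed tokens.
    return tokens[0] * 2

integer = Word(nums)
integer.setParseAction(to_int)      # sets the list of parse actions
integer.addParseAction(double_it)   # appends to the existing list
integer.addParseAction(note_match)  # returns None, so tokens are unchanged

result = integer.parseString("21")
print(result[0], calls)
```

The actions run in the order they were attached, so double_it receives the integer produced by to_int.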

Posted by Paul McGuire 2006-07-01

Pyparsing Gets New Home Page - Wiki

The project home page for pyparsing has been converted to a Wiki, hosted at http://pyparsing.wikispaces.com/

Please come and visit the new pyparsing project site, and add your contributions to the public Tips and Documentation pages!

Posted by Paul McGuire 2006-05-08

Pyparsing 1.4.2 Released

Thanks to everyone who has sent in e-mails and suggestions for enhancements
to pyparsing. Most notable is Chris Lesniewski-Laas' submission of a
"packrat" or memoizing performance enhancement. I've shipped this version
with packrat mode disabled by default, since it may have adverse effects on
some parsers that include parse actions. It is easily enabled by calling
ParserElement.enablePackrat() after importing pyparsing.
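
Enabling packrat mode is a single call made before the grammar is used. A minimal sketch with a made-up arithmetic grammar (the grammar and names are illustrative, and it uses the '<<=' operator from later releases):

```python
from pyparsing import (
    Forward, ParserElement, Suppress, Word, ZeroOrMore, nums, oneOf,
)

# Turn on memoizing of parse attempts for every expression.
ParserElement.enablePackrat()

expr = Forward()
atom = Word(nums) | Suppress("(") + expr + Suppress(")")
expr <<= atom + ZeroOrMore(oneOf("+ -") + atom)

result = expr.parseString("1 + (2 - 3)")
print(result.asList())
```

Because memoized results can skip repeated parse attempts, parse actions with side effects are the ones to check first if behavior changes after enabling it.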

Posted by Paul McGuire 2006-04-01

PyCon06 Pyparsing Presentations Prominently Posted

I've posted the S5 HTML files and supporting source code - you can download it at
http://www.geocities.com/ptmcg/python/index.html .

Enjoy!

Posted by Paul McGuire 2006-02-28

Pyparsing 1.4.1 released

I know it's only been about 3 weeks since 1.4 was released, but I introduced
some minor but annoying "enhancements" - most significantly, I used a Python
2.4-only generator expression, which broke Python 2.3 compatibility.

This minor release also gives me a chance to quickly turn around some
suggestions/requests from early downloaders of 1.4b1 and 1.4.

Thanks again to everyone for their suggestions and feedback.

Posted by Paul McGuire 2006-02-06

Pyparsing Article Published at O'Reilly ONLamp

A detailed introduction to Pyparsing has been published at O'Reilly's ONLamp Python Developers web page. (http://www.onlamp.com/pub/a/python/2006/01/26/pyparsing.html)

The article includes several working examples, along with step-by-step descriptions of the programs.

Posted by Paul McGuire 2006-01-27

Pyparsing 1.4 released

Thanks to all those who ran the beta version of 1.4 through its paces! During the beta we:
- caught the "warning" vs. "warnings" typo on the improved error-checking code
- beefed up the exception trapping when using regexp's
- added the new QuotedString class, for defining custom quoted string definitions
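
As a small sketch of a custom quoted string definition, the following uses "$" as the quote character. The delimiter and the sample input are arbitrary choices for illustration.

```python
from pyparsing import QuotedString

# Strings delimited by "$" instead of the usual quote characters.
dollar_quoted = QuotedString("$")

result = dollar_quoted.parseString("$hello, world$")
print(result[0])
```

By default the surrounding quote characters are stripped from the returned token.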

Thanks to all who took time to download and test the latest version of pyparsing!

Posted by Paul McGuire 2006-01-21

Pyparsing 1.4beta1 - testers wanted!

I've uploaded version 1.4 beta 1 of pyparsing to SourceForge. It represents a significant reimplementation of some of the core parsing classes. The major change is the conversion to use internally generated regular expressions, plus the addition of the Regex class for user-defined re's. My performance tests show a 30-40% improvement in parsing speed.

For those looking for additional performance enhancements, the oneOf helper method has been especially enhanced to generate a regular expression, instead of the previous list of Literals within a MatchFirst class. oneOf is especially useful for taking advantage of performance gains, since it is compatible with version 1.3.x programs. The Word class has also undergone similar enhancement, as have several built-ins for comments and quoted strings.
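
A short sketch of the two features mentioned above, the Regex class and the oneOf helper, in a made-up comparison grammar (the pattern and names are illustrative only):

```python
from pyparsing import Regex, oneOf

# oneOf takes a space-separated string of alternatives and takes care
# that "<=" is not cut short by the shorter alternative "<".
comparison_op = oneOf("< <= > >= = !=")

# Regex wraps a user-defined regular expression as a parse expression.
real = Regex(r"[+-]?\d+\.\d+")

condition = real + comparison_op + real
result = condition.parseString("3.14 <= 10.0")
print(result.asList())
```
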

Posted by Paul McGuire 2005-12-22

Pair of Pyparsing Papers Picked for Presentation at PyCon06

I submitted two abstracts to PyCon (since it's practically in my back yard this year), hoping to get one accepted - and the committee accepted them both! Here are the submission abstracts:

1. Introduction to Pyparsing: An Object-oriented Easy-to-Use Toolkit for Building Recursive Descent Parsers

Intended audience: beginning/intermediate Python programmers

pyparsing is a pure-Python module, containing a class library for easily
creating recursive-descent parsers. pyparsing's syntax provides tools for
both simple tokenization and data structuring and interpretation. I will
give an overview of the basic features of pyparsing, and a *very quick*
overview of the advanced features. I will close with 3 or 4 application
examples, time-permitting. (For more detail on the type of information I
have to present on pyparsing, you can visit my SourceForge project web page at http://pyparsing.sourceforge.net.)

Posted by Paul McGuire 2005-12-15

pyparsing 1.2.2 released!

pyparsing is a 100% pure Python parsing package for creating readable parse
engines, using a library of Python classes with easy-to-understand class
names, such as Literal, Word, Group, OneOrMore, Optional, and so on, combined
with operators such as '+' (for 'and' or 'sequence'), '^' (for 'or' or
'alternation'), '|' (for 'greedy or', selecting first matching alternate),
etc. No separate grammar source file is required, and there is no
code-generation step.
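
A minimal sketch of the operators described above. The greeting grammar and the "in"/"int" literals are made-up examples chosen to show the difference between '|' and '^'.

```python
from pyparsing import Literal, Word, alphas

# '+' builds a sequence; plain strings are converted to Literals.
greeting = Word(alphas) + "," + Word(alphas) + "!"
greeting_tokens = greeting.parseString("Hello, World!").asList()
print(greeting_tokens)

# '|' takes the first alternative that matches.
first_match = Literal("in") | Literal("int")
# '^' tries every alternative and keeps the longest match.
longest_match = Literal("in") ^ Literal("int")

print(first_match.parseString("int")[0])    # in
print(longest_match.parseString("int")[0])  # int
```
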

Posted by Paul McGuire 2004-09-27

pyparsing 1.2 and 1.2.1 released!

pyparsing, a Python module for parsing text using a context-free grammar, has been updated with the release of version 1.2 in June, 2004, and version 1.2.1 this past week.

pyparsing's approach to defining grammars differs from the conventional lex/yacc approach. No external file using BNF or regex-style syntax is required. pyparsing's parse grammars are defined right in the Python parse code itself, using a library of parsing construction classes to compose the grammar. pyparsing includes such classes as:
- Literal and CaselessLiteral
- Word
- Group
- And
- Or
- MatchFirst
- NotAny
- Optional
- SkipTo
- ZeroOrMore
- OneOrMore
- Dict
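
As a sketch of how a few of these classes compose, the following builds a small key=value grammar with Word, Group, OneOrMore, and Dict. The grammar and input are invented for illustration, and Suppress is a further pyparsing class not named in the list above.

```python
from pyparsing import Dict, Group, OneOrMore, Suppress, Word, alphas, nums

# Each entry is a key, an "=" that is dropped from the output, and a value.
entry = Group(Word(alphas) + Suppress("=") + Word(nums))

# Dict uses the first token of each group as the key for the rest.
settings = Dict(OneOrMore(entry))

result = settings.parseString("width=80 height=24")
print(result.asList())
print(result["width"], result["height"])
```
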

Posted by Paul McGuire 2004-08-26

pyparsing 1.1.2 released

pyparsing, a Python module for parsing text using a context-free grammar, has just been updated with the release of version 1.1.2. With this release, pyparsing continues to stabilize and converge on a production-worthy package.

Version 1.1.2 releases no new API changes, and only one minor bug fix - the starting location reported by scanString now correctly reports the start of the found tokens, not the start of any leading whitespace.
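
The corrected behavior can be seen with a short scanString loop. This is a sketch with made-up input, in which each match is preceded by whitespace.

```python
from pyparsing import Word, nums

text = "ab  12 cd 345"

# scanString yields (tokens, start, end) for each match found.
matches = [
    (tokens[0], start, end)
    for tokens, start, end in Word(nums).scanString(text)
]
print(matches)  # [('12', 4, 6), ('345', 10, 13)]
```

The start values point at the first digit of each number, so text[start:end] gives back exactly the matched text.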

Posted by Paul McGuire 2004-03-22

pyparsing 1.1.1 released

(This release corrects the bug introduced in version 1.1, and adds compatibility for Python 2.2.)

New features in 1.1:
- simplified parse actions - parse action functions now only need to return
modified parse tokens, not a tuple of location and tokens - the example code
has been updated to reflect the new style (Old style is deprecated, but
still supported for backward compatibility.)
- added validate() method to parse elements, to help identify improperly
recursive grammar definitions
- better str() output for parse elements, more similar to traditional BNF
notation

Posted by Paul McGuire 2004-03-07

pyparsing 1.1 withdrawn

As part of the fixing of a minor bug, and adding a performance enhancement, I introduced a serious bug in version 1.1. I have withdrawn this release, and will get a corrected 1.1.1 out shortly.

Posted by Paul McGuire 2004-03-06

pyparsing 1.0.4 released

Stable production release of the pyparsing Python module:

- performance increased 30-40%
- added positional tokens StringStart, StringEnd, LineStart, and LineEnd
- added convenience built-in for commaSeparatedList (more robust than simply using string.split(","))
- fixed setup.py typo
- added examples for HTTP server log parsing, and comma separated list

- minor API change: delimitedList does not enclose returned tokens in a Group, this is now the responsibility of the caller; delimitedList with 'combine=True' includes delimiters in returned string, good for scoped variables (a.b.c or a::b::c) and directory paths (a/b/c).
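
The delimitedList behavior described above can be sketched as follows. The inputs and variable names are made up for illustration.

```python
from pyparsing import Group, Word, alphas, delimitedList

# Plain delimitedList: delimiters are dropped and the tokens are
# returned flat, not enclosed in a Group.
names = delimitedList(Word(alphas))
flat = names.parseString("a, b, c").asList()

# Grouping is now up to the caller.
grouped = Group(delimitedList(Word(alphas))).parseString("a, b, c").asList()

# combine=True keeps the delimiters and returns a single string.
dotted = delimitedList(Word(alphas), delim=".", combine=True)
combined = dotted.parseString("a.b.c").asList()

print(flat, grouped, combined)
```
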

Posted by Paul McGuire 2004-01-08

pyparsing 1.0.3 released

Minor changes, additional performance speed-ups.

Also includes more Python standard packaging, using distutils - thanks Dave Kuhlman!

Posted by Paul McGuire 2003-12-24

pyparsing 1.0.2 Released

One more change to the API: the module has been renamed from just plain "parsing" to "pyparsing", to reflect its Python linkage. Also contains an additional example program, demonstrating how to use the Dict class.

Posted by Paul McGuire 2003-12-19

pyParsing Python library - version 1.0.1 released

This minor update corrects a faux pas in the parsing API, and adds a performance speedup of 20-30% when parsing complex input strings.

Posted by Paul McGuire 2003-12-18