Sorry about that. I've put the patch up here:
On Mon, Mar 14, 2011 at 12:50 AM, Paul McGuire <ptmcg@...> wrote:
> This sounds like some terrific work, thanks! Unfortunately I got no
> attachment on your e-mail, could you paste it to someplace publicly
> accessible, maybe pastebin.com? I've got some other changes queued up for
> a next release, but this would be great to get included.
> Please write back when you've got your code posted.
> -- Paul
> -----Original Message-----
> From: Michael Droettboom [mailto:mdboom@...]
> Sent: Friday, March 11, 2011 1:11 PM
> To: pyparsing-users@...
> Subject: [Pyparsing] Patch to fix memory leaks with Python 3.x
> We are in the process of porting matplotlib to Python 3.x. Matplotlib uses
> pyparsing to parse a TeX-like mini-language for math expressions.
> A bunch of hard-working folks at the Cape Town PUG noticed that memory was
> leaking like crazy whenever this functionality was being used. On further
> investigation, it very confusingly turns out it was leaking stack frames, so
> even objects that never touched the pyparsing-based parser were getting
> leaked.
>
> This seems to be centered around the change in Python 3.x where exception
> objects contain a member "__traceback__" containing the full traceback of
> the exception. This means that an exception object that is referenced
> outside of an except block will create a cyclical reference with the local
> stack frame in which it lives. For example, in code like:
>
>     try:
>         do_something()  # any call that may raise
>     except Exception as exc:
>         my_exc = exc  # my_exc will live beyond the except block
>     return my_exc
>
> This creates a cyclical reference from my_exc -> my_exc.__traceback__ ->
> local stack frame -> my_exc.
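
A minimal sketch of the cycle described above, for anyone who wants to see it
in isolation. The names Payload, leaky and probe are made up for illustration
and have nothing to do with pyparsing or matplotlib. The automatic collector
is switched off only so the result does not depend on when it happens to run.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large object held in a local variable."""


def leaky():
    big = Payload()
    probe = weakref.ref(big)  # lets us observe whether big is still alive
    try:
        raise ValueError("boom")
    except Exception as exc:
        my_exc = exc  # the exception now outlives the except block
    # Cycle: my_exc -> my_exc.__traceback__ -> this frame -> my_exc.
    # The frame also holds big, so big is kept alive by the cycle.
    return probe


gc.disable()  # no automatic collection during the demonstration
try:
    probe = leaky()
    alive_before_collect = probe() is not None  # frame still pins big
    gc.collect()  # the cycle collector breaks the cycle
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_before_collect, alive_after_collect)
```

Without the "my_exc = exc" assignment, the frame would be released by plain
reference counting as soon as leaky() returns, and no collector run would be
needed.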
> Having cyclical references means that any local variable *anywhere in the
> stack* of the thrown exception will not be freed until the garbage
> collector feels enough pressure to do so. When those objects include
> C-extensions that allocate memory on the heap (as is the case in
> matplotlib), the garbage collector doesn't know enough about those objects
> to start freeing soon enough, and memory usage quickly grows unmanageable.
> See this warning in the "porting to Python 3" guide:
> See also the "Open Issue" section of PEP 3134:
> This causes a lot of headaches when storing and passing around exceptions
> for later use, as pyparsing does routinely.
> I have attached a patch against SVN that seems to resolve these reference
> leaks -- at least the ones that are exercised by matplotlib's math parser.
> The changes fall into a number of categories:
> 1) Remove use of sys.exc_info(). In Python 3, the exception object (that
> is, the "exc" variable of "except Exception as exc") is automatically
> dereferenced upon leaving the except block. The same is not true of the
> result of sys.exc_info(), and if the exception object leaves the except
> block it requires special care to avoid creating a cyclical reference with
> the frame. It's not required, but it does simplify the code a lot to simply
> use "except Exception as exc" where it applies.
> 2) Storing the "myException" object in ParserElement objects created a
> cyclical reference between the exception object and the
> ParserElement object. This was not much of a problem in Python 2.x, but in
> Python 3.x since exception objects pull in all the baggage from the
> traceback, the memory wastage is considerable. I fixed this case by simply
> creating exception objects when they are raised, and not maintaining a
> myException member. I don't know why the myException member existed in the
> first place (performance considerations perhaps?), so I don't know if there
> are downsides to this change. An alternative might be to store a weak
> reference to the ParserElement inside of the exception object -- but that
> creates a user-visible API change to the exception object.
> 3) When exception objects do need to exist outside of the except block, the
> traceback should be removed from the exception object, using
> "exc.__traceback__ = None". There are a few examples of this, such as
> storing exceptions in the parser cache (in _parseCache). Deleting the
> traceback basically restores the behavior of the old Python 2.x
> code, which, by using sys.exc_info(), was storing the exception only and not
> the traceback payload.
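
A rough sketch of points 1 and 3 together, under the same assumptions as
before: Payload, parse_and_keep_exception and kept are hypothetical names and
this is not the code from the patch. It shows that the "exc" name is unbound
once the except block ends, and that clearing __traceback__ before keeping the
exception lets the frame's locals be freed without a collector run.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large object held in a local variable."""


def parse_and_keep_exception():
    big = Payload()
    probe = weakref.ref(big)
    try:
        raise ValueError("no match")
    except Exception as exc:
        exc.__traceback__ = None  # drop the reference to the frame
        kept = exc  # safe to keep: no path from kept back to this frame
    # Python 3 deletes the "exc" name at the end of the except block.
    exc_still_bound = "exc" in locals()
    return kept, probe, exc_still_bound


gc.disable()  # prove that reference counting alone is enough here
try:
    kept, probe, exc_still_bound = parse_and_keep_exception()
    payload_alive = probe() is not None
finally:
    gc.enable()

print(type(kept).__name__, exc_still_bound, payload_alive)
```

The cost of this approach is that the stored exception no longer carries a
traceback, which matches what the old sys.exc_info()-based code kept anyway.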
> Thanks again for pyparsing -- it has been invaluable on our project. I
> hope this patch will benefit others making the transition to Python 3.
> Michael Droettboom