From: Michael D. <md...@st...> - 2007-07-09 14:48:11
I am about to update the memory leak question in the FAQ, but I thought I'd run it by the list first. I removed the language about much earlier releases of mpl, and the paragraph about leaks in older versions of Numeric and numarray. It seems like we should recommend numpy when the user experiences problems with Numeric or numarray, but I wanted to confirm that before adding a paragraph to that effect.

The second question (below) can be omitted -- it really belongs in a developer's FAQ. Any recommendations on a better place to post that information?

<snip>

matplotlib appears to be leaking memory, what should I do?

First, determine whether it is a true memory leak. Python allocates memory in pools, rather than one object at a time, which can make memory leaks difficult to diagnose. Memory usage may appear to increase rapidly with each iteration, but should eventually reach a steady state if no true memory leaks exist. (For more fine-grained memory usage reporting, you can build a custom Python with the --without-pymalloc flag, which disables pool allocation.) If, after sufficient iterations, memory usage is still increasing, it is likely a bona fide memory leak that should be reported.

memleak_gui.py <http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/matplotlib/unit/memleak_gui.py?view=markup> (in the unit directory of the source tree) contains an example script useful for diagnosing and reporting memory leaks. Please use something like it when reporting leaks so we get an idea of the magnitude of the problem (i.e. bytes per figure). Also please provide your platform, matplotlib version, backend, and as much information as you can about the versions of the libraries you are using: freetype, png and zlib. It would also help if you posted code so I could look for any obvious coding errors vis-a-vis matplotlib.

There are some known memory leaks in matplotlib-0.90.1 when used in conjunction with the Tk, Gtk, Wx, Qt and Qt4 GUI backends.
Many of these leaks have been fixed in the current SVN version of matplotlib. However, the following library versions are known to have leaks that matplotlib triggers. If you have one of these versions and are experiencing memory leaks, you should upgrade your library. This is not an exhaustive list.

  * *Wx backend:* wxPython-2.8.2 or earlier in the 2.8 series
  * *Gtk backend:* pygobject-2.12.x, pygtk-2.4.0

I'd like to help diagnose a memory leak, rather than just report it. How do you recommend I do that?

I thought you'd never ask! Python memory leaks tend to fall into one of the following categories:

  * *Uncollectable garbage:* The Python garbage collector is not infallible. It cannot collect objects that have __del__ methods or weakrefs and participate in reference cycles. If curious, you can read about this problem in horrific detail <http://svn.python.org/view/python/trunk/Modules/gc_weakref.txt?view=markup>. You can obtain a list of all uncollectable objects as follows:

        import gc
        print(gc.garbage)

    To see what cycles these objects participate in, there is a useful function in matplotlib.cbook:

        from matplotlib import cbook
        cbook.print_cycles(gc.garbage)

    This will print out all of the reference cycles that are preventing the uncollectable objects from being freed. The code should then be modified to prevent these cycles, or to break the cycles during destruction.

  * *Real references:* Sometimes objects are legitimately being held onto by other Python objects. When this happens, you will see the total number of Python objects in the interpreter increase with each iteration of your test, even when you didn't intend any data to "stick around". You can print out the total number of objects in the interpreter:

        import gc
        print(len(gc.get_objects()))

    By comparing the objects before and after your test, you can determine which of them remain between iterations:

        # Record ids in a set so the membership test below is fast
        original_objects = set(id(x) for x in gc.get_objects())
        # ... do something that leaks objects
        new_objects = [x for x in gc.get_objects()
                       if id(x) not in original_objects]

    You can then determine what is referencing those objects and causing them to stick around. Note that this requires gc.get_referrers (the objects that refer to x), not gc.get_referents (the objects that x refers to):

        for x in new_objects:
            print(gc.get_referrers(x))

    The code should be modified to prevent these unwanted references.

  * *C/C++ leaks in extension objects:* This is the classic problem of objects that are "malloc'd/new'd" and never "freed/deleted" in C or C++ parlance. There are many memory debuggers available, such as Purify or Valgrind, to help find these errors. I recommend reading the documentation about using these tools in conjunction with Python <http://svn.python.org/view/python/trunk/Misc/README.valgrind?view=markup> to avoid a lot of false positives.
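
The "real references" check described above can be wrapped into a couple of small helpers. What follows is only a rough sketch and not part of the FAQ text: the function names (count_objects_per_iteration, new_objects_by_type) and the deliberately leaky example function are hypothetical, and it assumes Python 3 with nothing beyond the standard gc and collections modules.

```python
import gc
from collections import Counter


def count_objects_per_iteration(func, iterations=5):
    """Call func repeatedly; record the number of gc-tracked objects after each call."""
    counts = []
    for _ in range(iterations):
        func()
        gc.collect()  # discard collectable garbage so only live objects are counted
        counts.append(len(gc.get_objects()))
    return counts


def new_objects_by_type(func):
    """Return a Counter of the type names of objects that survive one call to func."""
    gc.collect()
    before = set(id(obj) for obj in gc.get_objects())
    func()
    gc.collect()
    survivors = [
        obj for obj in gc.get_objects()
        # skip the bookkeeping set itself, which was created after the snapshot
        if id(obj) not in before and obj is not before
    ]
    return Counter(type(obj).__name__ for obj in survivors)


# Hypothetical leaky function, used purely for demonstration:
# every call parks a new dict (holding a list) in a module-level cache.
_cache = []


def leaky():
    _cache.append({"payload": list(range(10))})


if __name__ == "__main__":
    print(count_objects_per_iteration(leaky))
    print(new_objects_by_type(leaky))
```

If the counts keep climbing after several iterations instead of levelling off, the per-type summary gives a hint about which objects to pass to gc.get_referrers. The numbers can include a few unrelated interpreter objects, so treat small differences as noise.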