From: Caleb C. <cad...@gm...> - 2010-11-18 19:11:43
|
Matplotlib Users:

It seems matplotlib plotting has a relatively small memory leak. My experiments suggest it leaks between 5K and 8K bytes of RAM for every plot redraw. For example, in one experiment, plotting the same buffer (so as not to allocate new memory) every second for a period of about 12 hours resulted in memory usage (physical RAM) increasing by approximately 223MB, which is about 5.3K per replot. The plotting code is:

class PlotPanel(wx.Panel):
    def __init__(self, parent):
        wx.Panel.__init__(self, parent, wx.ID_ANY,
                          style=wx.BORDER_THEME|wx.TAB_TRAVERSAL)
        self._figure = MplFigure(dpi=None)
        self._canvas = MplCanvas(self, -1, self._figure)
        self._axes = self._figure.add_subplot(1,1,1)

        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(self._canvas, 1, wx.EXPAND|wx.TOP, 5)
        self.SetSizer(sizer)

    def draw(self, channel, seconds):
        self._axes.clear()
        self._axes.plot(channel, seconds)
        self._canvas.draw()

`draw()` is called every second with the same `channel` and `seconds` numpy.array buffers.

In my case, this leak, though relatively small, becomes a serious issue since my software often runs for long periods of time (days) plotting data streamed from a data acquisition unit.

Any suggestions will help. Am I misunderstanding something here? Maybe I need to call some obscure function to free memory, or something?

My testing environment:

* Windows XP SP3, Intel Core 2 Duo @ 2.33GHz, 1.96 GB RAM
* Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on win32
* matplotlib version 1.0.0
* numpy 1.4.1
* wxPython version 2.8.11.0

The complete test program follows.

Thanks,

Caleb

from random import random
from datetime import datetime
import os
import time
import win32api
import win32con
import win32process

import wx
import numpy

import matplotlib as mpl
from matplotlib.figure import Figure as MplFigure
from matplotlib.backends.backend_wxagg import FigureCanvasWxAgg as MplCanvas

# Query the Win32 memory counters (e.g. WorkingSetSize) for the given process.
def get_process_memory_info(process_id):
    memory = {}
    process = None
    try:
        process = win32api.OpenProcess(
            win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
            False, process_id)
        if process is not None:
            return win32process.GetProcessMemoryInfo(process)
    finally:
        if process:
            win32api.CloseHandle(process)
    return memory

meg = 1024.0 * 1024.0

class PlotPanel(wx.Panel):
    def __init__(self, parent):
        wx.Panel.__init__(self, parent, wx.ID_ANY,
                          style=wx.BORDER_THEME|wx.TAB_TRAVERSAL)
        self._figure = MplFigure(dpi=None)
        self._canvas = MplCanvas(self, -1, self._figure)
        self._axes = self._figure.add_subplot(1,1,1)

        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(self._canvas, 1, wx.EXPAND|wx.TOP, 5)
        self.SetSizer(sizer)

    def draw(self, channel, seconds):
        self._axes.clear()
        self._axes.plot(channel, seconds)
        self._canvas.draw()

class TestFrame(wx.Frame):
    def __init__(self, parent, id, title):
        wx.Frame.__init__(
            self, parent, id, title, wx.DefaultPosition, (600, 400))

        self.testDuration = 60 * 60 * 24
        self.startTime = 0

        self.channel = numpy.sin(numpy.arange(1000) * random())
        self.seconds = numpy.arange(len(self.channel))

        self.plotPanel = PlotPanel(self)

        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(self.plotPanel, 1, wx.EXPAND)
        self.SetSizer(sizer)

        # Redraw once per second and log the working set after each redraw.
        self._timer = wx.Timer(self)
        self.Bind(wx.EVT_TIMER, self._onTimer, self._timer)
        self._timer.Start(1000)
        print "starting memory: ",\
            get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

    def _onTimer(self, evt):
        if self.startTime == 0:
            self.startTime = time.time()

        if (time.time() - self.startTime) >= self.testDuration:
            self._timer.Stop()

        self.plotPanel.draw(self.channel, self.seconds)

        t = datetime.now()
        memory = get_process_memory_info(os.getpid())
        print "time: {0}, working: {1:f}".format(
            t, memory["WorkingSetSize"]/meg)

class MyApp(wx.App):
    def OnInit(self):
        frame = TestFrame(None, wx.ID_ANY, "Memory Leak")
        self.SetTopWindow(frame)
        frame.Show(True)
        return True

if __name__ == '__main__':
    app = MyApp(0)
    app.MainLoop()
|
From: Benjamin R. <ben...@ou...> - 2010-11-18 20:21:07
|
Caleb,

Interesting analysis. One possible source of a leak would be some sort of dangling reference that still hangs around even though the plot objects have been cleared. By the time of the matplotlib 1.0.0 release, we did seem to clear out pretty much all of these, but it is possible there are still some lurking about. We should probably run your script against the latest svn to see how the results compare.

Another possibility might be related to numpy. However this is the draw statement, so I don't know how much numpy is used in there. The latest refactor work in numpy has revealed some memory leaks that have existed, so who knows?

Might be interesting to try making equivalent versions of this script using different backends, and different package versions to possibly isolate the source of the memory leak.

Thanks for your observations,
Ben Root
|
From: John H. <jd...@gm...> - 2010-11-18 23:11:00
|
On Thu, Nov 18, 2010 at 2:20 PM, Benjamin Root <ben...@ou...> wrote:
> Interesting analysis. One possible source of a leak would be some sort of
> dangling reference that still hangs around even though the plot objects have
> been cleared. By the time of the matplotlib 1.0.0 release, we did seem to
> clear out pretty much all of these, but it is possible there are still some
> lurking about. We should probably run your script against the latest svn to
> see how the results compare.

In our experience, many of the GUI backends have some leak, and these are in the GUI and not in mpl. Caleb, can you see if you can replicate the leak with your example code using the agg backend (no GUI)? If so, could you post the code that exposes the leak? If not, I'm afraid it is in wx and you might need to deal with the wx developers.

JDH
|
From: Robert K. <rob...@gm...> - 2010-11-18 23:36:00
|
On 11/18/10 5:05 PM, John Hunter wrote:
> On Thu, Nov 18, 2010 at 2:20 PM, Benjamin Root <ben...@ou...> wrote:
>
>> Interesting analysis. One possible source of a leak would be some sort of
>> dangling reference that still hangs around even though the plot objects have
>> been cleared. By the time of the matplotlib 1.0.0 release, we did seem to
>> clear out pretty much all of these, but it is possible there are still some
>> lurking about. We should probably run your script against the latest svn to
>> see how the results compare.
>
> In our experience, many of the GUI backends have some leak, and these
> are in the GUI and not in mpl. Caleb, can you see if you can
> replicate the leak with your example code using the agg backend (no
> GUI). If so, could you post the code that exposes the leak. if not,
> I'm afraid it is in wx and you might need to deal with the wx
> developers.

Heh. Good timing! I just fixed a bug in Chaco involving a leaking cycle that the garbage collector could not clean up. The lesson of my tale of woe is that even if there is no leak when you run without wxPython, that doesn't mean that wxPython is the culprit.

If any object in the connected graph containing a cycle (even if it does not directly participate in the cycle) has a __del__ method in pure Python, then the garbage collector will not clean up that cycle, for safety reasons. Read the docs for the gc module for details. We use SWIG to wrap Agg, and SWIG adds __del__ methods for all of its classes. wxPython uses SWIG and has the same problem. If there is a cycle which can reach a wxPython object, the cycle will leak. The actual cycle may be created by matplotlib, though.

You can determine if this is the case pretty easily. Call gc.collect(), then examine the list gc.garbage. This will contain all of those objects with a __del__ that prevented a cycle from being collected. I recommend using objgraph to diagram the graph of references to those objects. It's invaluable to actually see what's going on.

http://pypi.python.org/pypi/objgraph

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
|
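The check Robert describes can be sketched roughly as follows. This is a minimal sketch and not code from the thread: the helper name `report_uncollectable` is hypothetical. One caveat: on Python 3.4 and later (PEP 442), cycles whose objects define a pure-Python `__del__` are normally collected, so `gc.garbage` tends to stay empty there; the behavior described above applies to interpreters such as the Python 2.6 used in this thread.

```python
import gc
from collections import Counter


def report_uncollectable():
    """Run a full collection and summarize what is left in gc.garbage.

    Returns (unreachable_count, Counter mapping type name -> count).
    The function name is hypothetical; only gc.collect() and gc.garbage
    come from the discussion above.
    """
    # gc.collect() returns the number of unreachable objects it found.
    unreachable = gc.collect()
    # gc.garbage lists objects the collector found but could not free.
    by_type = Counter(type(obj).__name__ for obj in gc.garbage)
    return unreachable, by_type


if __name__ == "__main__":
    found, leftovers = report_uncollectable()
    print("unreachable objects found: {0}".format(found))
    print("uncollectable objects by type: {0}".format(dict(leftovers)))
```

Calling this after each redraw and watching whether the per-type counts grow would point at the objects worth diagramming with objgraph.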
From: Caleb C. <cad...@gm...> - 2010-11-19 21:14:19
Attachments:
tkagg_memory_usage.png
agg_memory_usage.png
|
Sorry for the double post; it seems the first is not displaying correctly on SourceForge.

I conducted a couple more experiments taking into consideration suggestions made in responses to my original post (thanks for the response).

First, I ran my original test (as close to it as possible anyway) using the Agg back end for 3 hours, plotting 16591 times (about 1.5Hz). Memory usage increased by 86MB. That's about 5.3K per redraw. Very similar to my original experiment. As suggested, I called gc.collect() after each iteration. It returned 67 for every iteration (no increase), although len(gc.garbage) reported 0 each iteration.

Second, I ran a test targeting TkAgg for 3 hours, plotting 21374 times. Memory usage fluctuated over time, but essentially did not increase: starting at 32.54MB and ending at 32.79MB. gc.collect() reported 0 after each iteration, as did len(gc.garbage).

Attached are images of plots showing change in memory usage over time for each experiment.

Any comments would be appreciated.

Following is the code for each experiment.

Agg
-----

from random import random
from datetime import datetime
import os
import gc
import time
import win32api
import win32con
import win32process

import numpy

import matplotlib
matplotlib.use("Agg")
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas

def get_process_memory_info(process_id):
    memory = {}
    process = None
    try:
        process = win32api.OpenProcess(
            win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
            False, process_id)
        if process is not None:
            return win32process.GetProcessMemoryInfo(process)
    finally:
        if process:
            win32api.CloseHandle(process)
    return memory

meg = 1024.0 * 1024.0

figure = Figure(dpi=None)
canvas = FigureCanvas(figure)
axes = figure.add_subplot(1,1,1)

def draw(channel, seconds):
    axes.clear()
    axes.plot(channel, seconds)
    canvas.print_figure('test.png')

channel = numpy.sin(numpy.arange(1000) * random())
seconds = numpy.arange(len(channel))
testDuration = 60 * 60 * 3
startTime = time.time()

print "starting memory: ", \
    get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

while (time.time() - startTime) < testDuration:
    draw(channel, seconds)

    t = datetime.now()
    memory = get_process_memory_info(os.getpid())
    print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
        t,
        memory["WorkingSetSize"]/meg,
        gc.collect(),
        len(gc.garbage) )

    time.sleep(0.5)

TkAgg
---------

from random import random
from datetime import datetime
import sys
import os
import gc
import time
import win32api
import win32con
import win32process

import numpy

import matplotlib
matplotlib.use("TkAgg")
from matplotlib.figure import Figure
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg \
    as FigureCanvas

import Tkinter as tk

def get_process_memory_info(process_id):
    memory = {}
    process = None
    try:
        process = win32api.OpenProcess(
            win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
            False, process_id)
        if process is not None:
            return win32process.GetProcessMemoryInfo(process)
    finally:
        if process:
            win32api.CloseHandle(process)
    return memory

meg = 1024.0 * 1024.0

rootTk = tk.Tk()
rootTk.wm_title("TKAgg Memory Leak")

figure = Figure()
canvas = FigureCanvas(figure, master=rootTk)
axes = figure.add_subplot(1,1,1)

def draw(channel, seconds):
    axes.clear()
    axes.plot(channel, seconds)

channel = numpy.sin(numpy.arange(1000) * random())
seconds = numpy.arange(len(channel))

testDuration = 60 * 60 * 3
startTime = time.time()

print "starting memory: ", \
    get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

draw(channel, seconds)
canvas.show()
canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)

rate = 500

def on_tick():
    canvas.get_tk_widget().after(rate, on_tick)

    if (time.time() - startTime) >= testDuration:
        return

    draw(channel, seconds)

    t = datetime.now()
    memory = get_process_memory_info(os.getpid())
    print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
        t,
        memory["WorkingSetSize"]/meg,
        gc.collect(),
        len(gc.garbage) )

canvas.get_tk_widget().after(rate, on_tick)
tk.mainloop()
|
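The per-redraw figures quoted in this thread can be checked with a few lines of arithmetic. This is only a sketch: the function name `leak_per_redraw_kib` is hypothetical, and the redraw count for the original wx run (12 hours at one redraw per second) is an assumption, since the exact count was not reported.

```python
def leak_per_redraw_kib(growth_mib, redraws):
    """Average memory growth per redraw, in KiB."""
    return growth_mib * 1024.0 / redraws


# Agg run: 86 MB growth over 16591 redraws (figures from the post).
agg = leak_per_redraw_kib(86, 16591)

# Original wx run: 223 MB over about 12 hours at one redraw per second.
# The exact redraw count was not reported; 12 * 3600 is an assumption.
wx_run = leak_per_redraw_kib(223, 12 * 3600)

print("Agg: {0:.1f} KiB per redraw".format(agg))
print("wx:  {0:.1f} KiB per redraw".format(wx_run))
```

Both runs work out to roughly 5.3 KiB per redraw, which is consistent with the leak being independent of the GUI toolkit.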
From: Michael D. <md...@st...> - 2010-11-22 16:13:30
Attachments:
cxx_memleak.patch
|
Caleb,

Thanks for doing all of this investigation and providing something easy to reproduce. With the help of valgrind, I believe I've tracked it down to a bug in PyCXX, the Python/C++ interface tool matplotlib uses.

I have attached a patch that seems to remove the leak for me, but as I'm not a PyCXX expert, I'm not comfortable with committing it to the repository just yet. I'm hoping you and/or some other developers could test it on their systems (a fully clean re-build is required) and report any problems back.

I also plan to raise this question on the PyCXX mailing list to get any thoughts they may have.

Cheers,
Mike
|
From: Benjamin R. <ben...@ou...> - 2010-11-22 16:15:47
|
Interesting results. I would like to try these tests on a Linux machine to see if there is a difference, but I don't know what the equivalent functions would be to some of the win32 calls. Does anybody have a reference for such things?

Ben Root
|
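One way to get a figure roughly comparable to `WorkingSetSize` on Linux is to read the resident set size from `/proc`. The following is a minimal sketch, assuming a Linux system with `/proc` mounted; the function name `get_resident_memory_mb` is hypothetical and not from the thread, and resident set size is only an approximate counterpart of the Windows working set.

```python
import os


def get_resident_memory_mb():
    """Return this process's resident set size in MB (Linux only).

    Hypothetical helper, intended as a stand-in for the
    get_process_memory_info(...)["WorkingSetSize"]/meg expression
    used in the test scripts.
    """
    # /proc/self/statm reports sizes in pages; the second field is
    # the number of resident pages.
    with open("/proc/self/statm") as statm:
        resident_pages = int(statm.read().split()[1])
    page_size = os.sysconf("SC_PAGE_SIZE")  # bytes per page
    return resident_pages * page_size / (1024.0 * 1024.0)


if __name__ == "__main__":
    print("working: {0:f}".format(get_resident_memory_mb()))
```

With a helper like this, the Agg script's loop could log the value after each redraw in place of the win32 calls, leaving the rest of the experiment unchanged.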
From: Eric F. <ef...@ha...> - 2010-11-22 17:32:53
|
On 11/22/2010 06:15 AM, Benjamin Root wrote:
> Interesting results. I would like to try these tests on a Linux machine
> to see if there is a difference, but I don't know what the equivalent
> functions would be to some of the win32 calls. Does anybody have a
> reference for such things?

Do you need win32 calls, or do you just need to read the memory usage? If the latter, see cbook.report_memory().

Eric

> Ben Root |
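For readers who only need the memory figure on Linux, the idea can be sketched without any win32 calls. This is a minimal stand-alone sketch, not the implementation of cbook.report_memory(): the function name resident_memory_mib is hypothetical, and it assumes a Linux-style /proc filesystem, returning None anywhere else.

```python
import os
import sys


def resident_memory_mib(pid=None):
    """Return the resident set size of a process in MiB, or None if unknown.

    Hypothetical helper: reads the VmRSS field of /proc/<pid>/status,
    which is available on Linux only.
    """
    if pid is None:
        pid = os.getpid()
    try:
        with open("/proc/%d/status" % pid) as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    # The line looks like "VmRSS:     12345 kB".
                    return int(line.split()[1]) / 1024.0
    except (IOError, OSError):
        # No /proc entry: not Linux, or the process does not exist.
        pass
    return None


rss = resident_memory_mib()
if rss is None:
    print("resident set size not available on %s" % sys.platform)
else:
    print("working: %f" % rss)
```

In the test scripts quoted in this thread, a value like this would stand in for get_process_memory_info(os.getpid())["WorkingSetSize"]/meg. The Windows working set and the Linux resident set size are comparable but not identical measures.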
From: Benjamin R. <ben...@ou...> - 2010-11-22 18:47:15
|
On Mon, Nov 22, 2010 at 11:32 AM, Eric Firing <ef...@ha...> wrote:
> On 11/22/2010 06:15 AM, Benjamin Root wrote:
> > Interesting results. I would like to try these tests on a Linux machine
> > to see if there is a difference, but I don't know what the equivalent
> > functions would be to some of the win32 calls. Does anybody have a
> > reference for such things?
>
> Do you need win32 calls, or do you just need to read the memory usage?
> If the latter, see cbook.report_memory().
>
> Eric

I tried out the script using cbook.report_memory() with and without the patch. The patch certainly made the leak *much* slower. I am still finding a very slow leak at approximately 0.03226 MiB per 5 minutes in the resident set size.

Ben Root |
From: Caleb C. <cad...@gm...> - 2011-04-19 14:25:18
|
This picks up from a thread of the same name between 18 Nov 2010 and 22 Nov 2010.

Release 1.0.1 of matplotlib has made significant gains in reducing the memory leak (thanks!!), but it did not eliminate the problem entirely. Recall, the TkAgg back-end does not have any leak, so we know this particular leak is in matplotlib or wxPython.

Here are the results of some tests.

Matplotlib 1.0.0

- 1 hour
  - Plotted 3595 times, about 1Hz
  - Memory usage increased by about 18.7MB (59.96 - 41.25), or about 5.3K per redraw.

Matplotlib 1.0.1

- 1 hour
  - Plotted 3601 times, about 1Hz
  - Memory usage increased by about 1.4MB (42.98 - 41.59), or about 0.40K per redraw.

- 12 hour
  - Plotted 43201 times, about 1Hz
  - Memory usage increased by about 13.3MB (54.32 - 41.01), or about 0.32K per redraw.

As stated before, for a process plotting data for long periods of time, this becomes an issue.

Caleb |
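The per-redraw figures above follow from dividing the growth in working set by the number of redraws. A small sketch of that arithmetic, using the start and end values reported in this message and assuming 1 MB means 1024 KB here (the function name kib_per_redraw is made up for illustration):

```python
def kib_per_redraw(start_mb, end_mb, redraws):
    """Average memory growth per redraw in KiB, treating 1 MB as 1024 KiB."""
    return (end_mb - start_mb) * 1024.0 / redraws


# (label, starting MB, ending MB, number of redraws), as reported above.
runs = [
    ("1.0.0, 1 hour", 41.25, 59.96, 3595),
    ("1.0.1, 1 hour", 41.59, 42.98, 3601),
    ("1.0.1, 12 hours", 41.01, 54.32, 43201),
]

for label, start, end, redraws in runs:
    rate = kib_per_redraw(start, end, redraws)
    print("%-16s %.2f KiB per redraw" % (label, rate))
```

Under that assumption the three runs come out at roughly 5.3, 0.40 and 0.32 KiB per redraw, matching the figures quoted in the message.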
From: Michael D. <md...@st...> - 2011-04-19 15:32:57
|
There's a lot of moving parts here. Running your script again is showing some leaks in valgrind that weren't there before, but a number of the underlying libraries have changed on my system since then (memory leaks tend to be Whac-a-mole sometimes...)

Which versions of the following are you running, and on what platform -- some variant of MS-Windows if I recall correctly?

Python
Numpy
wxPython
Tkinter

Mike

On 04/19/2011 10:25 AM, Caleb Constantine wrote:
> This picks up from a thread of the same name between 18 Nov 2010 and
> 22 Nov 2010.
>
> Release 1.0.1 of matplotlib has made significant gains in reducing the
> memory leak (thanks!!), but it did not
> eliminate the problem entirely. Recall, the TkAgg back-end does not
> have any leak, so we know this particular
> leak is in matplotlib or wxPython.
>
> Here are the results of some tests.
>
> Matplotlib 1.0.0
>
> - 1 hour
>   - Plotted 3595 times, about 1Hz
>   - Memory usage increased by about 18.7MB (59.96 - 41.25), or about
>     5.3K per redraw.
>
> Matplotlib 1.0.1
>
> - 1 hour
>   - Plotted 3601 times, about 1Hz
>   - Memory usage increased by about 1.4MB (42.98 - 41.59), or about
>     0.40K per redraw.
>
> - 12 hour
>   - Plotted 43201 times, about 1Hz
>   - Memory usage increased by about 13.3MB (54.32 - 41.01), or about
>     0.32K per redraw.
>
> As stated before, for a process plotting data for long periods of
> time, this becomes an issue.
>
> Caleb

-- 
Michael Droettboom
Science Software Branch
Space Telescope Science Institute
Baltimore, Maryland, USA |
From: Caleb C. <cad...@gm...> - 2011-04-19 16:34:28
|
On Tue, Apr 19, 2011 at 1:01 PM, Michael Droettboom <md...@st...> wrote:
> There's a lot of moving parts here. Running your script again is
> showing some leaks in valgrind that weren't there before, but a number
> of the underlying libraries have changed on my system since then (memory
> leaks tend to be Whac-a-mole sometimes...)
>
> Which versions of the following are you running, and on what platform --
> some variant of MS-Windows if I recall correctly?
>
> Python
> Numpy
> wxPython
> Tkinter

Windows XP SP 3
Python - 2.6.6
Numpy - 1.4.1
wxPython - 2.8.11.0
Tkinter - $Revision: 73770 $

I'll install new versions of Numpy and wxPython (and maybe Python) and try again. |
From: Michael D. <md...@st...> - 2011-04-19 16:57:45
|
Ok. I have a RHEL5 Linux box with Python 2.7.1.

With Numpy 1.4.1 and 1.5.1 I don't see any leaks. With Numpy git HEAD, I did see a leak -- I submitted a pull request to Numpy here:

https://github.com/numpy/numpy/pull/76

I get the same results (no leaks) running your wx, tk and agg scripts (with the Windows-specific stuff removed).

FWIW, I have wxPython 2.8.11.0 and Tkinter rev 81008.

So the variables are the platform and the version of Python. Perhaps it's one of those two things?

Mike

On 04/19/2011 12:34 PM, Caleb Constantine wrote:
> Windows XP SP 3
> Python - 2.6.6
> Numpy - 1.4.1
> wxPython - 2.8.11.0
> Tkinter - $Revision: 73770 $
>
> I'll install new versions of Numpy and wxPython (and maybe Python) and
> try again.

-- 
Michael Droettboom
Science Software Branch
Space Telescope Science Institute
Baltimore, Maryland, USA |
From: Caleb C. <cad...@gm...> - 2011-04-20 11:48:19
|
On Tue, Apr 19, 2011 at 2:25 PM, Michael Droettboom <md...@st...> wrote:
> Ok. I have a RHEL5 Linux box with Python 2.7.1.
>
> With Numpy 1.4.1 and 1.5.1 I don't see any leaks. With Numpy git HEAD,
> I did see a leak -- I submitted a pull request to Numpy here:
>
> https://github.com/numpy/numpy/pull/76
>
> I get the same results (no leaks) running your wx, tk and agg scripts
> (with the Windows-specific stuff removed).
>
> FWIW, I have wxPython 2.8.11.0 and Tkinter rev 81008.
>
> So the variables are the platform and the version of Python. Perhaps
> it's one of those two things?
>
> Mike

Consider the following:

matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3

- 1 hour
  - Plotted 3601 times, about 1Hz
  - Memory usage increased by about 1.16MB (41.39 - 40.23), or about 0.33K per redraw

It seems the same memory leak exists. Given you don't have this issue on Linux with the same Python configuration, I can only assume it is related to some Windows specific code somewhere. I'll run for a longer period of time just in case, but I don't expect the results to be different. |
From: Michael D. <md...@st...> - 2011-04-20 12:00:40
|
On 04/20/2011 07:48 AM, Caleb Constantine wrote:
> Consider the following:
>
> matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0,
> Windows XP SP3
>
> - 1 hour
>   - Plotted 3601 times, about 1Hz
>   - Memory usage increased by about 1.16MB (41.39 - 40.23), or
>     about 0.33K per redraw
>
> It seems the same memory leak exists. Given you don't have this issue
> on Linux with the same Python configuration, I can only assume it is
> related to some Windows specific code somewhere. I'll run for a longer
> period of time just in case, but I don't expect the results to be
> different.

One way to rule out Windows-specific code may be to run with the Agg backend only (without wx). Have you plotted the memory growth? This amount of memory growth is well within the pool allocation sizes that Python routinely uses. Does the value of len(gc.get_objects()) grow over time?

Mike |
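The len(gc.get_objects()) check suggested above can be sketched as a small harness. The helper names (tracked_objects, object_growth) and the two toy step functions are assumptions made for illustration; in the scripts from this thread the step would be the draw() call.

```python
import gc


def tracked_objects():
    """Run a full collection, then count objects tracked by the collector."""
    gc.collect()
    return len(gc.get_objects())


def object_growth(step, iterations=50, warmup=5):
    """Return the change in tracked object count over repeated calls to step."""
    # Warm up first so one-time caches do not look like a leak.
    for _ in range(warmup):
        step()
    before = tracked_objects()
    for _ in range(iterations):
        step()
    return tracked_objects() - before


_kept = []


def leaky_step():
    # Toy leak: every call keeps one more list alive.
    _kept.append([0])


def clean_step():
    # Allocates temporary lists that are released before returning.
    scratch = [[0] for _ in range(10)]
    del scratch


print("leaky growth:", object_growth(leaky_step))
print("clean growth:", object_growth(clean_step))
```

A count that climbs with the number of iterations points at Python objects being kept alive. A flat count while process memory still grows points below the Python object layer, since gc.get_objects() only sees container objects tracked by the collector.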
From: Caleb C. <cad...@gm...> - 2011-04-20 15:28:06
|
On Wed, Apr 20, 2011 at 9:29 AM, Michael Droettboom <md...@st...> wrote:
> One way to rule out Windows-specific code may be to run with the Agg
> backend only (without wx). Have you plotted the memory growth? This
> amount of memory growth is well within the pool allocation sizes that
> Python routinely uses. Does the value of len(gc.get_objects()) grow
> over time?

New results follow.

matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3

agg
- 3601 redraws (1 hour), about 1Hz
- Memory usage: 28.79 - 27.57 = 1.22 MB
- len(gc.get_objects()): 23424 at beginning and end
- Plot of memory growth: roughly linear, increasing with slope of 0.26KB

tkagg
- 3601 redraws (1 hour), about 1Hz
- Memory usage: 33.22 - 33.32 = -0.1 MB
- len(gc.get_objects()): 24182 at beginning and end
- Plot of memory growth: very irregular (up and down), but a line fit has a slope of about 0.025KB (I could run longer and see if slope approaches 0)

wxagg
- 3601 redraws (1 hour), about 1Hz
- Memory usage: 43.28 - 41.80 = 1.5 MB
- len(gc.get_objects()): 41473 at beginning and end
- Plot of memory growth: roughly linear, increasing with slope of 0.32KB |
From: Michael D. <md...@st...> - 2011-04-20 15:35:57
|
On 04/20/2011 11:27 AM, Caleb Constantine wrote:
> New results follow.
>
> matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3
>
> agg
> - 3601 redraws (1 hour), about 1Hz
> - Memory usage: 28.79 - 27.57 = 1.22 MB
> - len(gc.get_objects()): 23424 at beginning and end
> - Plot of memory growth: roughly linear, increasing with slope of 0.26KB
>
> tkagg
> - 3601 redraws (1 hour), about 1Hz
> - Memory usage: 33.22 - 33.32 = -0.1 MB
> - len(gc.get_objects()): 24182 at beginning and end
> - Plot of memory growth: very irregular (up and down), but a line fit
>   has a slope of about 0.025KB (I could run longer and see if slope
>   approaches 0)
>
> wxagg
> - 3601 redraws (1 hour), about 1Hz
> - Memory usage: 43.28 - 41.80 = 1.5 MB
> - len(gc.get_objects()): 41473 at beginning and end
> - Plot of memory growth: roughly linear, increasing with slope of 0.32KB

Thanks. These are very useful results.

The fact that len(gc.get_objects()) remains constant suggests to me that this is not a simple case of holding on to a Python reference longer than we intend to. Instead, this is either a C-side reference counting bug, or a genuine C malloc-and-never-free bug. Puzzlingly, valgrind usually does a very good job of finding such bugs, but is turning up nothing for me. Will have to scratch my head a little bit longer and see if I can come up with a proper experiment that will help me get to the bottom of this.

Cheers,
Mike |
From: Caleb C. <cad...@gm...> - 2011-04-21 12:35:38
|
On Wed, Apr 20, 2011 at 1:04 PM, Michael Droettboom <md...@st...> wrote:
> Thanks. These are very useful results.
>
> The fact that gc.get_objects() remains constant suggests to me that this
> is not a simple case of holding on to a Python reference longer than we
> intend to. Instead, this is either a C-side reference counting bug, or
> a genuine C malloc-and-never-free bug. Puzzlingly, valgrind usually
> does a very good job of finding such bugs, but is turning up nothing for
> me. Will have to scratch my head a little bit longer and see if I can
> come up with a proper experiment that will help me get to the bottom of
> this.

For completeness, I ran more tests over a 10 hour period at an increased redraw rate. Details follow. Note tkagg memory usage is flat, agg and wxagg are not.

matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3

agg
- 52214 redraws
- Memory usage: 27.55 - 43.46 = 15.22 MB
- len(gc.get_objects()): 23424 at beginning and end
- Plot of memory growth: linear, increasing with slope of 0.31KB

tkagg
- 71379 redraws
- Memory usage: 30.47 - 30.25 = 0.22 MB
- len(gc.get_objects()): 24171 at beginning, 24182 at end, but mostly constant at 24182
- Plot of memory growth: very irregular (up and down), but a line fit has a slope of about 0.0002KB.

wxagg
- 72001 redraws
- Memory usage: 62.08 - 40.10 = 21.98 MB
- len(gc.get_objects()): 41473 at beginning and end
- Plot of memory growth: linear, increasing with slope of 0.31KB |
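The "line fit" slopes reported above can be reproduced from a log of (redraw number, memory) samples with an ordinary least-squares fit. The sketch below is one plausible way to do it, not necessarily how the numbers in this thread were computed; the function name fit_slope is hypothetical and the sample data is synthetic, not taken from the experiments.

```python
def fit_slope(xs, ys):
    """Least-squares slope of ys against xs (change in y per unit of x)."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic example: memory in KB growing by 0.31 KB per redraw,
# with a +/- 5 KB wobble standing in for allocator noise.
redraws = list(range(1000))
memory_kb = [40000.0 + 0.31 * i + (5.0 if i % 2 else -5.0) for i in redraws]

print("slope: %.3f KB per redraw" % fit_slope(redraws, memory_kb))
```

Fitting against the redraw count gives a slope in KB per redraw, which makes runs at different redraw rates directly comparable. A slope near zero with an irregular trace, as reported for tkagg, is what a non-leaking process tends to look like.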
From: Michael D. <md...@st...> - 2011-04-21 16:37:25
|
Ok. I think I've found a leak in the way the spines' paths were being updated.

https://github.com/matplotlib/matplotlib/pull/89

Can you apply the patch there and let me know how it improves things for you?

Cheers,
Mike

On 04/21/2011 08:35 AM, Caleb Constantine wrote:
> On Wed, Apr 20, 2011 at 1:04 PM, Michael Droettboom<md...@st...> wrote:
>
>> On 04/20/2011 11:27 AM, Caleb Constantine wrote:
>>
>>> On Wed, Apr 20, 2011 at 9:29 AM, Michael Droettboom<md...@st...> wrote:
>>>
>>>> On 04/20/2011 07:48 AM, Caleb Constantine wrote:
>>>>
>>>>> On Tue, Apr 19, 2011 at 2:25 PM, Michael Droettboom<md...@st...> wrote:
>>>>>
>>>>>> Ok. I have a RHEL5 Linux box with Python 2.7.1.
>>>>>>
>>>>>> With Numpy 1.4.1 and 1.5.1 I don't see any leaks. With Numpy git HEAD,
>>>>>> I did see a leak -- I submitted a pull request to Numpy here:
>>>>>>
>>>>>> https://github.com/numpy/numpy/pull/76
>>>>>>
>>>>>> I get the same results (no leaks) running your wx, tk and agg scripts
>>>>>> (with the Windows-specific stuff removed).
>>>>>>
>>>>>> FWIW, I have wxPython 2.8.11.0 and Tkinter rev 81008.
>>>>>>
>>>>>> So the variables are the platform and the version of Python. Perhaps
>>>>>> it's one of those two things?
>>>>>>
>>>>>> Mike
>>>>>>
>>>>> Consider the following:
>>>>>
>>>>> matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0,
>>>>> Windows XP SP3
>>>>>
>>>>> - 1 hour
>>>>> - Plotted 3601 times, about 1Hz
>>>>> - Memory usage increased by about 1.16MB (41.39 - 40.23), or
>>>>> about 0.33K per redraw
>>>>>
>>>>> It seems the same memory leak exists. Given you don't have this issue
>>>>> on Linux with the same Python configuration, I can only assume it is
>>>>> related to some Windows specific code somewhere. I'll run for a longer
>>>>> period of time just in case, but I don't expect the results to be
>>>>> different.
>>>>>
>>>> One way to rule out Windows-specific code may be to run with the Agg
>>>> backend only (without wx). Have you plotted the memory growth? This
>>>> amount of memory growth is well within the pool allocation sizes that
>>>> Python routinely uses. Does the value of len(gc.get_objects()) grow
>>>> over time?
>>>>
>>>>
>>> New results follow.
>>>
>>> matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3
>>>
>>> agg
>>> - 3601 redraws (1 hour), about 1Hz
>>> - Memory usage: 28.79 - 27.57 = 1.22 MB
>>> - len(gc.get_objects()): 23424 at beginning and end
>>> - Plot of memory growth: roughly linear, increasing with slope of 0.26KB
>>>
>>> tkagg
>>> - 3601 redraws (1 hour), about 1Hz
>>> - Memory usage: 33.22 - 33.32 = -0.1 MB
>>> - len(gc.get_objects()): 24182 at beginning and end
>>> - Plot of memory growth: very irregular (up and down), but a line fit
>>> has a slope of about 0.025KB (I could run longer and see if slope
>>> approaches 0)
>>>
>>> wxagg
>>> - 3601 redraws (1 hour), about 1Hz
>>> - Memory usage: 43.28 - 41.80 = 1.5 MB
>>> - len(gc.get_objects()): 41473 at beginning and end
>>> - Plot of memory growth: roughly linear, increasing with slope of 0.32KB
>>>
>> Thanks. These are very useful results.
>>
>> The fact that gc.get_objects() remains constant suggests to me that this
>> is not a simple case of holding on to a Python reference longer than we
>> intend to. Instead, this is either a C-side reference counting bug, or
>> a genuine C malloc-and-never-free bug. Puzzlingly, valgrind usually
>> does a very good job of finding such bugs, but is turning up nothing for
>> me. Will have to scratch my head a little bit longer and see if I can
>> come up with a proper experiment that will help me get to the bottom of
>> this.
>>
>>
> For completeness, I ran more tests over a 10 hour period at an
> increased redraw rate. Details follow. Note tkagg memory usage is
> flat, agg and wxagg are not.
>
> matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3
>
> agg
> - 52214 redraws
> - Memory usage: 27.55 - 43.46 = 15.22 MB
> - len(gc.get_objects()): 23424 at beginning and end
> - Plot of memory growth: linear, increasing with slope of 0.31KB
>
> tkagg
> - 71379 redraws
> - Memory usage: 30.47 - 30.25 = 0.22 MB
> - len(gc.get_objects()): 24171 at beginning, 24182 at end, but mostly
> constant at 24182
> - Plot of memory growth: very irregular (up and down), but a line fit
> has a slope of about 0.0002KB.
>
> wxagg
> - 72001 redraws
> - Memory usage: 62.08 - 40.10 = 21.98 MB
> - len(gc.get_objects()): 41473 at beginning and end
> - Plot of memory growth: linear, increasing with slope of 0.31KB
>
> _______________________________________________
> Matplotlib-users mailing list
> Mat...@li...
> https://lists.sourceforge.net/lists/listinfo/matplotlib-users
>

--
Michael Droettboom
Science Software Branch
Space Telescope Science Institute
Baltimore, Maryland, USA
|
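The len(gc.get_objects()) check discussed in this message can be scripted so that it runs against any redraw routine. What follows is a minimal sketch, not code from the thread: `count_tracked_objects` is a hypothetical helper, and `redraw` stands for whatever zero-argument callable performs one clear/plot/draw cycle. A count that stays flat while process memory keeps growing would point away from lingering Python references and toward the C side, which is the reasoning given in the message.

```python
import gc


def count_tracked_objects(redraw, n_redraws, warmup=3):
    """Return (before, after) counts of gc-tracked objects around n_redraws calls.

    `redraw` is a hypothetical zero-argument callable that performs one
    redraw; it is an assumption of this sketch, not part of matplotlib.
    """
    # Warm up first so one-time caches are not mistaken for a leak.
    for _ in range(warmup):
        redraw()
    gc.collect()
    before = len(gc.get_objects())

    for _ in range(n_redraws):
        redraw()

    # Collect unreachable cycles so only objects that are still
    # referenced contribute to the final count.
    gc.collect()
    after = len(gc.get_objects())
    return before, after
```

With the PlotPanel class from the test program, this might be called as `count_tracked_objects(lambda: panel.draw(channel, seconds), 1000)`, where `panel`, `channel` and `seconds` are assumed to exist already. Note that gc.get_objects() only lists container objects tracked by the collector, so memory allocated directly by C extensions never shows up in the count.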
From: Caleb C. <cad...@gm...> - 2011-04-26 13:32:55
|
On Thu, Apr 21, 2011 at 2:07 PM, Michael Droettboom <md...@st...> wrote:
> Ok. I think I've found a leak in the way the spines' paths were being
> updated.
>
> https://github.com/matplotlib/matplotlib/pull/89
>
> Can you apply the patch there and let me know how it improves things for
> you?
>
> Cheers,
> Mike
>

I applied the patch and (at least for a short run), it seems the leak
for wxagg has been fixed. The new results for wxagg are pretty much the
same as the old results for tkagg (which does not have the leak): a plot
of memory usage is very irregular (up and down), but a line fit has a
slope of about 0.009KB. I suspect if I ran this for a few hours the
slope would approach 0 KB.

Thanks so much!! This is a life saver for my project!

Caleb
|
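The "line fit" slopes quoted in these results are presumably ordinary least-squares fits of sampled memory usage against redraw count; the thread does not say which tool produced them. A minimal standard-library sketch of that calculation follows. `fit_slope` is a hypothetical name, and the caller is assumed to have collected the samples already (for example, working-set size in KB after each redraw).

```python
def fit_slope(xs, ys):
    """Least-squares slope of ys against xs (e.g. KB per redraw)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired samples")

    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)

    # Sum of squared deviations in x; zero means all x values are equal
    # and no slope can be defined.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be identical")

    # Covariance term between x and y.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A slope near zero over a long run is what a leak-free backend should show. As a cross-check on scale, the 10-hour wxagg figures reported earlier in the thread (21.98 MB over 72001 redraws) work out to 21.98 * 1024 / 72001, about 0.31 KB per redraw, in line with the fitted slope reported for that run.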