From: Benjamin R. <ben...@ou...> - 2010-11-22 18:47:15
On Mon, Nov 22, 2010 at 11:32 AM, Eric Firing <ef...@ha...> wrote:

> On 11/22/2010 06:15 AM, Benjamin Root wrote:
> > On Fri, Nov 19, 2010 at 3:14 PM, Caleb Constantine
> > <cad...@gm... <mailto:cad...@gm...>> wrote:
> >
> >     On Thu, Nov 18, 2010 at 4:50 PM, Benjamin Root <ben...@ou...
> >     <mailto:ben...@ou...>> wrote:
> >     >
> >     > Caleb,
> >     >
> >     > Interesting analysis. One possible source of a leak would be
> >     > some sort of dangling reference that still hangs around even
> >     > though the plot objects have been cleared. By the time of the
> >     > matplotlib 1.0.0 release, we did seem to clear out pretty much
> >     > all of these, but it is possible there are still some lurking
> >     > about. We should probably run your script against the latest
> >     > svn to see how the results compare.
> >     >
> >     > Another possibility might be related to numpy. However this is
> >     > the draw statement, so I don't know how much numpy is used in
> >     > there. The latest refactor work in numpy has revealed some
> >     > memory leaks that have existed, so who knows?
> >     >
> >     > Might be interesting to try making equivalent versions of this
> >     > script using different backends, and different package versions
> >     > to possibly isolate the source of the memory leak.
> >     >
> >     > Thanks for your observations,
> >     > Ben Root
> >     >
> >
> >     Sorry for the double post; it seems the first is not displaying
> >     correctly on SourceForge.
> >
> >     I conducted a couple more experiments taking into consideration
> >     suggestions made in responses to my original post (thanks for the
> >     response).
> >
> >     First, I ran my original test (as close to it as possible anyway)
> >     using the Agg back end for 3 hours, plotting 16591 times (about
> >     1.5Hz). Memory usage increased by 86MB. That's about 5.3K per
> >     redraw. Very similar to my original experiment. As suggested, I
> >     called gc.collect() after each iteration. It returned 67 for every
> >     iteration (no increase), although len(gc.garbage) reported 0 each
> >     iteration.
> >
> >     Second, I ran a test targeting TkAgg for 3 hours, plotting 21374
> >     times. Memory usage fluctuated over time, but essentially did not
> >     increase: starting at 32.54MB and ending at 32.79MB. gc.collect()
> >     reported 0 after each iteration as did len(gc.garbage).
> >
> >     Attached are images of plots showing change in memory usage over
> >     time for each experiment.
> >
> >     Any comments would be appreciated.
> >
> >     Following is the code for each experiment.
> >
> >     Agg
> >     -----
> >
> >     from random import random
> >     from datetime import datetime
> >     import os
> >     import gc
> >     import time
> >     import win32api
> >     import win32con
> >     import win32process
> >
> >     import numpy
> >
> >     import matplotlib
> >     matplotlib.use("Agg")
> >     from matplotlib.figure import Figure
> >     from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
> >
> >     def get_process_memory_info(process_id):
> >         memory = {}
> >         process = None
> >         try:
> >             process = win32api.OpenProcess(
> >                 win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
> >                 False, process_id);
> >             if process is not None:
> >                 return win32process.GetProcessMemoryInfo(process)
> >         finally:
> >             if process:
> >                 win32api.CloseHandle(process)
> >         return memory
> >
> >     meg = 1024.0 * 1024.0
> >
> >     figure = Figure(dpi=None)
> >     canvas = FigureCanvas(figure)
> >     axes = figure.add_subplot(1,1,1)
> >
> >     def draw(channel, seconds):
> >         axes.clear()
> >         axes.plot(channel, seconds)
> >         canvas.print_figure('test.png')
> >
> >     channel = numpy.sin(numpy.arange(1000) * random())
> >     seconds = numpy.arange(len(channel))
> >     testDuration = 60 * 60 * 3
> >     startTime = time.time()
> >
> >     print "starting memory: ", \
> >         get_process_memory_info(os.getpid())["WorkingSetSize"]/meg
> >
> >     while (time.time() - startTime) < testDuration:
> >         draw(channel, seconds)
> >
> >         t = datetime.now()
> >         memory = get_process_memory_info(os.getpid())
> >         print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
> >             t,
> >             memory["WorkingSetSize"]/meg,
> >             gc.collect(),
> >             len(gc.garbage) )
> >
> >         time.sleep(0.5)
> >
> >
> >     TkAgg
> >     ---------
> >     from random import random
> >     from datetime import datetime
> >     import sys
> >     import os
> >     import gc
> >     import time
> >     import win32api
> >     import win32con
> >     import win32process
> >
> >     import numpy
> >
> >     import matplotlib
> >     matplotlib.use("TkAgg")
> >     from matplotlib.figure import Figure
> >     from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg \
> >         as FigureCanvas
> >
> >     import Tkinter as tk
> >
> >     def get_process_memory_info(process_id):
> >         memory = {}
> >         process = None
> >         try:
> >             process = win32api.OpenProcess(
> >                 win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
> >                 False, process_id);
> >             if process is not None:
> >                 return win32process.GetProcessMemoryInfo(process)
> >         finally:
> >             if process:
> >                 win32api.CloseHandle(process)
> >         return memory
> >
> >     meg = 1024.0 * 1024.0
> >
> >     rootTk = tk.Tk()
> >     rootTk.wm_title("TKAgg Memory Leak")
> >
> >     figure = Figure()
> >     canvas = FigureCanvas(figure, master=rootTk)
> >     axes = figure.add_subplot(1,1,1)
> >
> >     def draw(channel, seconds):
> >         axes.clear()
> >         axes.plot(channel, seconds)
> >
> >     channel = numpy.sin(numpy.arange(1000) * random())
> >     seconds = numpy.arange(len(channel))
> >
> >     testDuration = 60 * 60 * 3
> >     startTime = time.time()
> >
> >     print "starting memory: ", \
> >         get_process_memory_info(os.getpid())["WorkingSetSize"]/meg
> >
> >     draw(channel, seconds)
> >     canvas.show()
> >     canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)
> >
> >     rate = 500
> >
> >     def on_tick():
> >         canvas.get_tk_widget().after(rate, on_tick)
> >
> >         if (time.time() - startTime) >= testDuration:
> >             return
> >
> >         draw(channel, seconds)
> >
> >         t = datetime.now()
> >         memory = get_process_memory_info(os.getpid())
> >         print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
> >             t,
> >             memory["WorkingSetSize"]/meg,
> >             gc.collect(),
> >             len(gc.garbage) )
> >
> >     canvas.get_tk_widget().after(rate, on_tick)
> >     tk.mainloop()
> >
> >
> > Interesting results. I would like to try these tests on a Linux machine
> > to see if there is a difference, but I don't know what the equivalent
> > functions would be to some of the win32 calls. Does anybody have a
> > reference for such things?
>
> Do you need win32 calls, or do you just need to read the memory usage?
> If the latter, see cbook.report_memory().
>
> Eric
>
> >
> > Ben Root
> >

I tried out the script using cbook.report_memory() with and without the
patch. The patch certainly made the leak *much* slower. I am still
finding a very slow leak at approximately 0.03226 MiB per 5 minutes in
the resident set size.

Ben Root
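
For anyone repeating the measurement on Linux without the win32 calls, here is a minimal sketch of one way to sample the resident set size and turn the samples into a leak rate. It is only an illustration under stated assumptions: read_rss_mib and leak_rate are hypothetical helper names (not part of matplotlib or cbook), the reader assumes a Linux /proc filesystem, and the rate is a plain least-squares slope rather than whatever method produced the figure quoted above.

```python
import os


def read_rss_mib(pid=None):
    """Return the resident set size in MiB, or None if it cannot be read.

    Assumption: a Linux-style /proc/<pid>/status file with a "VmRSS:" line.
    """
    path = "/proc/{0}/status".format(os.getpid() if pid is None else pid)
    try:
        with open(path) as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    # The kernel reports this field in kB, e.g. "VmRSS:  12345 kB".
                    return int(line.split()[1]) / 1024.0
    except (IOError, OSError):
        # No /proc here (e.g. Windows or macOS): report "unknown".
        pass
    return None


def leak_rate(samples):
    """Least-squares slope of (seconds, MiB) samples, in MiB per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / float(n)
    mean_m = sum(m for _, m in samples) / float(n)
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples share the same timestamp")
    return num / den
```

If each tick appends (time.time() - startTime, read_rss_mib()) to a list, then leak_rate(samples) * 300 gives the growth in MiB per 5 minutes, the same unit as the figure above. Fitting a slope over the whole run should be less sensitive to short-term fluctuation than comparing only the first and last readings.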