From: Jon Olav Vik <jonovik@gm...> - 2010-03-16 11:35:20

I want to overlay many line plots using alpha transparency. However, plotting them in Matplotlib takes roughly O(n**2) time, and I think I may be running into memory limitations as well.

As a simple benchmark, I used IPython to run alco.ipy (below), which runs alco.py for an increasing number of data series. Extrapolating from this, plotting 60000 series would take something like 200 minutes. This is similar to my actual use case, which takes about 3 hours to finish a plot. Zooming in and saving again is much faster, taking only about 30 seconds.

I would appreciate suggestions on how to speed this up. For instance:

Is there a memoryless "canvas" object that I could draw on, just accumulating the alpha in each pixel: new_alpha = old_alpha + (1 - old_alpha) * this_alpha?

Failing that, I could do it manually by keeping a Numpy array of the pixels in the image. For each series, find the x values corresponding to each column index, then interpolate to find the row index corresponding to each y value. Finally, use imshow() or something to add axes and annotation.

Thank you in advance for any help.

Best regards,
Jon Olav

== Output of alco.ipy ==

The columns are "number of series" and "seconds".

In [8]: run alco.ipy
1000 9.07
2000 24.8
3000 44.73
4000 67.85
5000 95.67
6000 135.1
7000 177.82
8000 226.03
9000 278.32
10000 340.81

== alco.ipy ==

n, t = [], []
for i in range(1000, 10001, 1000):
    n.append(i)
    ti = !python alco.py $i
    t.append(float(ti.s))
    print("%d %s" % (n[-1], t[-1]))

plot(n, t, '.')

== alco.py ==

"""Alpha compositing of line plots. Usage: python alco.py NSERIES ALPHA"""
from sys import argv
import numpy as np
import matplotlib as mpl
mpl.use("agg")  # non-interactive plotting
from pylab import *

n = int(argv[1])
try:
    alpha = float(argv[2])
except IndexError:
    alpha = 0.02

# generate some data
x = np.arange(200)
for i in range(n):
    # note: i == 0 divides by zero, so the first series is all NaN
    y = np.sin(x / (2 * np.pi * x[1] * i))
    plot(x, y, 'k', alpha=alpha)

savefig("test.png")
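The 200-minute figure can be checked against the benchmark table with a few lines of arithmetic. The sketch below is not part of the original post: it assumes the timings follow a power law t = c * n**k, fits k by least squares on the logarithms, and compares that with a purely quadratic extrapolation from the last measured point. The helper name `fit_power_law` is made up for this illustration.

```python
import math

# Timings copied from the "Output of alco.ipy" table in the post.
n_series = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000]
seconds = [9.07, 24.8, 44.73, 67.85, 95.67, 135.1, 177.82, 226.03, 278.32, 340.81]


def fit_power_law(ns, ts):
    """Least-squares fit of log(t) = log(c) + k * log(n); returns (c, k)."""
    lx = [math.log(v) for v in ns]
    ly = [math.log(v) for v in ts]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    k = sxy / sxx
    c = math.exp(my - k * mx)
    return c, k


c, k = fit_power_law(n_series, seconds)
print("fitted exponent k = %.2f" % k)
print("power-law estimate for 60000 series: %.0f min" % (c * 60000 ** k / 60))

# Purely quadratic scaling from the last measured point (10000 series).
quad = seconds[-1] * (60000 / n_series[-1]) ** 2
print("quadratic estimate for 60000 series: %.0f min" % (quad / 60))
```

The quadratic extrapolation gives about 204 minutes, which matches the "something like 200 minutes" quoted in the post. The fitted exponent comes out somewhat below 2 on these ten points, so the quadratic figure is probably best read as an upper estimate for this benchmark.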
From: Jae-Joon Lee <lee.joon@gm...> - 2010-03-16 12:26:13

If you're plotting lots of lines, do not use plot; use LineCollection instead.

http://matplotlib.sourceforge.net/examples/api/collections_demo.html
http://matplotlib.sourceforge.net/api/collections_api.html#matplotlib.collections.LineCollection

Here is a slightly modified version of your code that uses LineCollection (but I haven't checked whether the code is correct). On my not-so-good MacBook, it took 3 sec for 6000 lines, and it seems like O(n) to me.

Regards,
JJ

    from matplotlib.collections import LineCollection

    ax = subplot(111)
    x = np.arange(200)
    # one (2, 200) array holding the x and y values of each series
    yy = [np.array((x, np.sin(x / (2 * np.pi * x[1] * i)))) for i in range(n)]
    # LineCollection expects each line as an (npoints, 2) array
    yyt = [np.transpose(y1) for y1 in yy]
    lc = LineCollection(yyt, colors=[(0, 0, 0, alpha)])
    ax.add_collection(lc)
    ax.autoscale_view()
From: Jon Olav Vik <jonovik@gm...> - 2010-03-16 13:47:18

Jae-Joon Lee <lee.j.joon@...> writes:

> If you're plotting lots of lines, do not use plot but use
> LineCollection instead.
>
> http://matplotlib.sourceforge.net/examples/api/collections_demo.html
>
> http://matplotlib.sourceforge.net/api/collections_api.html#matplotlib.collections.LineCollection
>
> Here is slightly modified version of your code that uses
> LineCollection (but I haven't check if the code is correct).
> With my not so good macbook, it took me 3 sec for 6000 lines and it
> seems like O(n) to me.

Thank you, thank you, thank you.

This is just as convenient, 50% faster even for 1000 series, and runtime does indeed scale as O(n) up to 10000 series. The projected speedup for 60000 series was 40x. However, in my actual use case it was at least 400x: finishing in 2 min 17 sec rather than not getting past halfway in 16 hours.

(The extra difference is probably due to better memory usage. Still, LineCollection requires O(n) memory, whereas manually updating a bitmap would only use O(1) memory, where 1 = size of bitmap. However, I hope I never have to do that...)

May the hours and hours you have saved me be added to your life! 8)

Jon Olav
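For reference, the manual bitmap approach mentioned in the parenthesis might look roughly like the sketch below. It is an illustration only, not something posted in the thread: the function name `accumulate_series`, the 100 x 200 image size, and the one-pixel-per-column rule are all assumptions, and there is no antialiasing or line width, so steep segments would leave gaps that a real rasterizer would fill. Memory stays at the size of the bitmap however many series are composited.

```python
import numpy as np


def accumulate_series(alpha_img, x, y, xlim, ylim, this_alpha):
    """Composite one series onto alpha_img in place, one pixel per column.

    alpha_img is a 2-D float array of accumulated alpha (row 0 is the top).
    x must be increasing, as required by np.interp.
    """
    height, width = alpha_img.shape
    cols = np.arange(width)
    # x value at the centre of each pixel column
    x_cols = xlim[0] + (cols + 0.5) * (xlim[1] - xlim[0]) / width
    # interpolate the series at those x values
    y_cols = np.interp(x_cols, x, y)
    # map y to a row index: ylim[1] -> row 0, ylim[0] -> row height - 1
    frac = (ylim[1] - y_cols) / (ylim[1] - ylim[0])
    rows = np.floor(frac * (height - 1) + 0.5).astype(int)
    inside = (rows >= 0) & (rows < height)
    r, c = rows[inside], cols[inside]
    # new_alpha = old_alpha + (1 - old_alpha) * this_alpha
    old = alpha_img[r, c]
    alpha_img[r, c] = old + (1.0 - old) * this_alpha


# Same test data as alco.py, starting at i = 1 to avoid the division by zero.
x = np.arange(200)
img = np.zeros((100, 200))
for i in range(1, 51):
    y = np.sin(x / (2 * np.pi * x[1] * i))
    accumulate_series(img, x, y, (0, 200), (-1, 1), 0.02)

# Small deterministic check: a flat line composited twice with alpha 0.5.
check = np.zeros((5, 4))
for _ in range(2):
    accumulate_series(check, [0, 4], [0.0, 0.0], (0, 4), (-1, 1), 0.5)
```

The resulting array could then be handed to imshow() with a suitable extent to get axes and annotation, as suggested in the first message.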
From: John Hunter <jdh2358@gm...> - 2010-03-16 15:01:54

On Tue, Mar 16, 2010 at 8:46 AM, Jon Olav Vik <jonovik@...> wrote:
> Thank you, thank you, thank you.
>
> This is just as convenient, 50% faster even for 1000 series, and runtime does
> indeed scale as O(n) up to 10000 series. The projected speedup for 60000 series
> was 40x. However, in my actual use case it was at least 400x: Finishing in 2
> min 17 sec rather than not getting past halfway in 16 hours.
>
> (The extra difference is probably due to better memory usage. Still,
> LineCollection requires O(n) memory, whereas manually updating a bitmap would
> only use O(1) memory, where 1 = size of bitmap. However, I hope I never have to
> do that...)
>
> May the hours and hours you have saved me be added to your life! 8)

Since you are granting extra life blessings, I thought I should add something to the mix. You should be able to achieve something close to this using the animation blit API. There is a little hackery at the end to use the renderer to directly dump a PNG and thereby circumvent the normal figure.canvas.draw pipeline, but the advantage is that you render directly to the canvas and save no intermediaries.

See the examples and tutorial at

http://matplotlib.sourceforge.net/examples/animation/index.html
http://www.scipy.org/Cookbook/Matplotlib/Animations

Here's some example code::

    import matplotlib._png as _png
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    import numpy as np

    fig = plt.figure()
    ax = fig.add_subplot(111)

    n = 10000
    line, = ax.plot([], [], alpha=1)
    x = np.arange(200)

    # draw the empty axes once so that the renderer and its buffer exist
    fig.canvas.draw()
    ax.axis([0, 200, -1, 1])

    for i in range(n):
        if (i % 100) == 0:
            print(i)
        yy = np.sin(x / (2 * np.pi * x[1] * i))
        # reuse a single Line2D and draw it straight onto the canvas
        line.set_data(x, yy)
        ax.draw_artist(line)

    fig.canvas.blit(ax.bbox)

    # dump the renderer's pixel buffer directly, bypassing savefig
    filename = 'test.png'
    renderer = fig.canvas.get_renderer()
    _png.write_png(renderer._renderer.buffer_rgba(0, 0),
                   renderer.width, renderer.height,
                   filename, fig.dpi)

JDH
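The example above relies on the private `matplotlib._png` module and on `buffer_rgba(0, 0)`, which, as far as I know, are not available in that form in recent Matplotlib releases. A rough equivalent for a Matplotlib 3.x installation is sketched below. It is not from the thread: it assumes the Agg canvas exposes `buffer_rgba()` and that `matplotlib.image.imsave` accepts the resulting RGBA array. The series count, the start at i = 1, and the output file name are arbitrary choices for a quick run.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # off-screen rendering
import matplotlib.pyplot as plt
from matplotlib.image import imsave
import numpy as np

n = 200       # assumed small count so the sketch runs quickly
alpha = 0.02  # same default as alco.py

fig = plt.figure()
ax = fig.add_subplot(111)
ax.axis([0, 200, -1, 1])
line, = ax.plot([], [], "k", alpha=alpha)

# Render the empty axes once; this creates the Agg pixel buffer.
fig.canvas.draw()
background = np.asarray(fig.canvas.buffer_rgba()).copy()

x = np.arange(200)
for i in range(1, n + 1):
    # Reuse one Line2D and composite it onto the existing buffer.
    line.set_data(x, np.sin(x / (2 * np.pi * x[1] * i)))
    ax.draw_artist(line)

# Copy the accumulated pixels out and write them without calling savefig.
pixels = np.asarray(fig.canvas.buffer_rgba()).copy()
filename = os.path.join(tempfile.gettempdir(), "test_blit.png")
imsave(filename, pixels)
plt.close(fig)
```

Calling `fig.savefig` here instead would redraw the figure from its artists and so would show only the last series, which is why the pixels are taken from the buffer directly.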