From: Jimmy P. <jim...@gm...> - 2009-06-17 14:56:13
2009/6/17 Michael Droettboom <md...@st...>

> vehemental wrote:
>> Hello,
>>
>> I'm using matplotlib for various tasks beautifully... but on some
>> occasions I have to visualize large datasets (in the range of 10M data
>> points), using imshow or regular plots... the system starts to choke a
>> bit at that point...
>
> The first thing I would check is whether your system becomes starved for
> memory at this point and virtual memory swapping kicks in.

The python process is sitting at around 300 MB of memory consumption...
there should be plenty of memory left... but I will look more closely at
what's happening... I would assume the memory bandwidth is not very high,
given the cheapness of the computer I'm using :D

> A common technique for faster plotting of image data is to downsample it
> before passing it to matplotlib. Same with line plots -- they can be
> decimated. There is newer/faster path simplification code in SVN trunk
> that may help with complex line plots (when the path.simplify rcParam is
> True). I would suggest starting with that as a baseline to see how much
> performance it already gives over the released version.

Yes, that totally makes sense... no need to visualize 3 million points if
you can only display 200 000... I'm already doing that to some extent, but
it takes time of its own... at least I have solutions to reduce this time
if needed... I'll try the SVN version and see if I can extract some
improvements...
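For what it's worth, here is roughly the kind of decimation I mean, as a
minimal sketch (my own illustration, not anything from matplotlib itself --
the array names and sizes are invented; path.simplify is the rcParam Mike
mentioned):

import numpy as np
import matplotlib
matplotlib.rcParams['path.simplify'] = True   # enable the path simplification
import matplotlib.pyplot as plt

n = 3000000                          # ~3M samples, made up for the example
x = np.linspace(0.0, 100.0, n)
y = np.sin(x) + 0.01 * np.random.randn(n)

target = 200000                      # roughly what a screen can display
step = max(1, n // target)           # integer stride: keep every step-th point
plt.plot(x[::step], y[::step], linewidth=0.5)
plt.show()

Plain striding like this can hide narrow spikes, though; taking a min/max
per block would be safer if the extremes matter.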
>> I would like to be consistent somehow and not use different tools for
>> basically similar tasks... so I'd like some pointers regarding rendering
>> performance... as I would be interested in being involved in dev if there
>> is something to be done...
>>
>> To active developers, what's the general feel -- does matplotlib have
>> room to spare in its rendering performance?...
>
> I've spent a lot of time optimizing the Agg backend (which is already one
> of the fastest software-only approaches out there), and I'm out of obvious
> ideas. But a fresh set of eyes may find new things. An advantage of Agg
> that shouldn't be overlooked is that it works identically everywhere.
>
>> or is it pretty tied down to the speed of Agg right now?
>> Is there something to gain from using the multiprocessing module now
>> included by default in 2.6?
>
> Probably not. If the work of rendering were to be divided among cores,
> that would probably be done at the C++ level anyway to see any gains. As
> it is, the problem with plotting many points generally tends to be limited
> by memory bandwidth anyway, not processor speed.
>
>> or even go as far as using something like pyGPU for fast vectorized
>> computations...?
>
> Perhaps. But again, the computation isn't the bottleneck -- it's usually a
> memory bandwidth starvation issue in my experience. Using a GPU may only
> make matters worse. Note that I consider that approach distinct from just
> using OpenGL to colormap and render the image as a texture. That approach
> may bear some fruit -- but only for image plots. Vector graphics
> acceleration with GPUs is still difficult to do in high quality across
> platforms and chipsets and beat software for speed.

So if I hear you correctly, the matplotlib/Agg combination is not terribly
slower than a C plotting lib that also uses Agg to render... and we are
talking more about hardware limitations, right?

>> I've seen around previous discussions about OpenGL being a backend in
>> some future... would it really stand up compared to the current backends?
>> Are there clues about that right now?

Thanks Nicolas, I'll take a closer look at GLnumpy... I can probably gather
some info by making a comparison of an imshow to the equivalent in OGL...
(something like the timing sketch at the end of this mail for the Agg side)

>> thanks for any inputs! :D
>> bye
>
> Hope this helps,

it did! thanks

jimmy

> Mike
>
> --
> Michael Droettboom
> Science Software Branch
> Operations and Engineering Division
> Space Telescope Science Institute
> Operated by AURA for NASA
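P.S. -- the kind of timing I have in mind for the Agg side of that
comparison, as a rough sketch (the image size and everything else here is
invented, just to get a ballpark draw time to put against an OpenGL texture
later):

import time
import numpy as np
import matplotlib
matplotlib.use('Agg')                      # off-screen rendering, no GUI overhead
import matplotlib.pyplot as plt

data = np.random.rand(3000, 3000)          # reasonably large image, made-up size

fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(data, interpolation='nearest')

t0 = time.time()
fig.canvas.draw()                          # force an actual Agg render
print('Agg draw time: %.3f s' % (time.time() - t0))

Then I'd upload the same array as a GL texture and compare.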