From: Raffaele Q. <raf...@li...> - 2014-09-22 10:43:47
Hi all,

can somebody show me, with an example, how to use NumPy's broadcasting feature? I'm currently using 'meshgrid' in the script, but I have learned that it makes the plot take a long time.

Thank you.

Raf

-----Original Message-----
From: Raffaele Quarta [mailto:raf...@li...]
Sent: Tue 9/9/2014 3:55 PM
To: Benjamin Root; Ryan Nelson
Cc: Matplotlib Users
Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)

Hi Ben and Ryan,

I will try to figure out how it works. Thank you.

Regards,
Raf

-----Original Message-----
From: ben...@gm... on behalf of Benjamin Root
Sent: Tue 9/9/2014 3:25 PM
To: Ryan Nelson
Cc: Raffaele Quarta; Matplotlib Users
Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)

Most of the time, you will not need to use meshgrid. Take advantage of numpy's broadcasting feature: http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
It saves *significantly* on memory and processing time. Most of Matplotlib's plotting functions work well with broadcastable inputs, so that is a great way to save on memory. NumPy's ogrid is also a neat tool for generating broadcastable grids.

When I get a chance, I'll look through the script for any other obvious savers.

Cheers!
Ben Root

On Tue, Sep 9, 2014 at 9:02 AM, Ryan Nelson <rne...@gm...> wrote:

> Raffaele,
>
> As Ben pointed out, you might be creating a lot of in-memory NumPy arrays that you probably don't need/want.
>
> For example, I think (?) slicing all of the variable below:
>     lons = fh.variables['lon'][:]
> is making a copy of all that (mmap'ed) data as a NumPy array in memory. Get rid of the slice ([:]). Of course, these variables are not NumPy arrays, so you'll have to change some of your code. For example:
>     lon_0 = lons.mean()
> will have to become:
>     lon_0 = np.mean(lons)
>
> If lats and lons are very large sets of data, then meshgrid will make two very, very large arrays in memory.
> For example, try this:
>     np.meshgrid(np.arange(5), np.arange(5))
> The output is two much larger arrays:
>     [array([[0, 1, 2, 3, 4],
>             [0, 1, 2, 3, 4],
>             [0, 1, 2, 3, 4],
>             [0, 1, 2, 3, 4],
>             [0, 1, 2, 3, 4]]),
>      array([[0, 0, 0, 0, 0],
>             [1, 1, 1, 1, 1],
>             [2, 2, 2, 2, 2],
>             [3, 3, 3, 3, 3],
>             [4, 4, 4, 4, 4]])]
> I don't know Basemap at all, so I don't know if this is necessary. You might be able to force the meshgrid output into a memmap file, but I don't know how to do that right now. Perhaps someone else has some suggestions.
>
> Hope that helps.
>
> Ryan
>
> On Tue, Sep 9, 2014 at 4:07 AM, Raffaele Quarta <raf...@li...> wrote:
>
>> Hi Jody and Ben,
>>
>> thanks for your answers.
>> I tried to use pcolormesh instead of pcolor and the result is very good! As for the memory problem, I wasn't able to solve it. When I tried to use the bigger file, I got the same problem. Attached you will find the script that I'm using to make the plot. Maybe I didn't understand very well how to use the mmap function.
>>
>> Regards,
>>
>> Raffaele.
>>
>> -----Original Message-----
>> From: Jody Klymak [mailto:jk...@uv...]
>> Sent: Mon 9/8/2014 5:46 PM
>> To: Benjamin Root
>> Cc: Raffaele Quarta; Matplotlib Users
>> Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)
>>
>> It looks like you are calling `pcolor`. Can I suggest you try `pcolormesh`?
>>
>> 75 MB is not a big file!
>>
>> Cheers, Jody
>>
>> On Sep 8, 2014, at 7:38 AM, Benjamin Root <ben...@ou...> wrote:
>>
>> > (Keeping this on the mailing list so that others can benefit)
>> >
>> > What might be happening is that you are keeping around more numpy arrays in memory than you actually need. Take advantage of memmapping, which most netcdf tools provide by default. This keeps the data on disk rather than in RAM.
>> > Second, for very large images, I would suggest either pcolormesh() or just simply imshow() instead of pcolor(), as they are way more efficient than pcolor(). In addition, it sounds like you are dealing with re-sampled data ("at different zoom levels"). Does this mean that you are re-running contour on re-sampled data? I am not sure what the benefit of doing that is if one could just simply do the contour once at the highest resolution.
>> >
>> > Without seeing any code, though, I can only provide generic suggestions.
>> >
>> > Cheers!
>> > Ben Root
>> >
>> > On Mon, Sep 8, 2014 at 10:12 AM, Raffaele Quarta <raf...@li...> wrote:
>> > Hi Ben,
>> >
>> > sorry for the few details that I gave you. I'm trying to make a contour plot of a variable at different zoom levels by using high resolution data. The aim is to obtain .PNG output images. I'm working with big data (NetCDF file, size is about 75 MB). The current Matplotlib version on my Ubuntu 14.04 machine is 1.3.1. My system has 8 GB of RAM.
>> > I'm running into memory problems when I try to make a plot. I got the following error message:
>> >
>> > --------------------------------------------
>> >     cs = m.pcolor(xi,yi,np.squeeze(t))
>> >   File "/usr/lib/pymodules/python2.7/mpl_toolkits/basemap/__init__.py", line 521, in with_transform
>> >     return plotfunc(self,x,y,data,*args,**kwargs)
>> >   File "/usr/lib/pymodules/python2.7/mpl_toolkits/basemap/__init__.py", line 3375, in pcolor
>> >     x = ma.masked_values(np.where(x > 1.e20,1.e20,x), 1.e20)
>> >   File "/usr/lib/python2.7/dist-packages/numpy/ma/core.py", line 2195, in masked_values
>> >     condition = umath.less_equal(mabs(xnew - value), atol + rtol * mabs(value))
>> > MemoryError
>> > --------------------------------------------
>> >
>> > On the other hand, when I try to plot a smaller file (such as 5 MB), it works very well.
>> > I believe that there is nothing wrong in the script. It might be a memory problem.
>> > I hope that my message is clearer now.
>> >
>> > Thanks for the help.
>> >
>> > Regards,
>> >
>> > Raffaele
>> >
>> > -----------------------------------------
>> >
>> > Sent: Mon 9/8/2014 3:19 PM
>> > To: Raffaele Quarta
>> > Cc: Matplotlib Users
>> > Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)
>> >
>> > You will need to be more specific... much more specific. What kind of plot are you making? How big is your data? What version of matplotlib are you using? How much RAM do you have available compared to the amount of data (most slowdowns are actually due to swap-thrashing issues)? Matplotlib can be used for large data, but there exist some specialty tools for the truly large datasets. The solution depends on the situation.
>> >
>> > Ben Root
>> >
>> > On Mon, Sep 8, 2014 at 7:45 AM, Raffaele Quarta <raf...@li...> wrote:
>> >
>> > > Hi,
>> > >
>> > > I'm working with the NetCDF format. When I try to plot a very large file, I have to wait a long time for the plot. How can I solve this? Isn't there a solution for this problem?
>> > >
>> > > Raffaele
>> > >
>> > > --
>> > > This email was Virus checked by Astaro Security Gateway. http://www.sophos.com
>> > >
>> > > ------------------------------------------------------------------------------
>> > > Want excitement?
>> > > Manually upgrade your production database.
>> > > When you want reliability, choose Perforce
>> > > Perforce version control. Predictably reliable.
>> > >
>> > > http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk
>> > > _______________________________________________
>> > > Matplotlib-users mailing list
>> > > Mat...@li...
>> > > https://lists.sourceforge.net/lists/listinfo/matplotlib-users
>>
>> --
>> Jody Klymak
>> http://web.uvic.ca/~jklymak/
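
On the broadcasting question at the top of this message, here is a minimal sketch of what Ben's suggestion could look like. It is not taken from Raffaele's script, which is not included in the archive. The lon/lat axes and the field computed from them are made-up stand-ins, and the variable names are only illustrative. The idea is to keep the coordinates as a (1, N) row and an (M, 1) column and let NumPy expand them during arithmetic, instead of building two full (M, N) arrays with meshgrid first.

```python
import numpy as np

# Hypothetical 1-D coordinate axes, standing in for the 'lon' and 'lat'
# variables of a NetCDF file (the values are invented for illustration).
lon = np.linspace(-10.0, 30.0, 2000)
lat = np.linspace(30.0, 50.0, 1000)

# meshgrid approach: two full (1000, 2000) coordinate arrays are allocated.
lon2d, lat2d = np.meshgrid(lon, lat)
field_full = np.sin(np.radians(lat2d)) * np.cos(np.radians(lon2d))

# Broadcasting approach: a (1, 2000) row and a (1000, 1) column.
# These are views on the 1-D arrays, so no coordinate data is copied.
lon_row = lon[np.newaxis, :]
lat_col = lat[:, np.newaxis]
field_bcast = np.sin(np.radians(lat_col)) * np.cos(np.radians(lon_row))

# Both routes give the same (1000, 2000) result.
same = np.allclose(field_full, field_bcast)

# Memory held by the coordinate arrays in each case.
grid_bytes = lon2d.nbytes + lat2d.nbytes
bcast_bytes = lon_row.nbytes + lat_col.nbytes
print(same, grid_bytes, bcast_bytes)

# meshgrid can also return broadcastable arrays directly.
lon_s, lat_s = np.meshgrid(lon, lat, sparse=True)

# ogrid, which Ben mentions, builds broadcastable index grids.
yy, xx = np.ogrid[0:5, 0:5]
print(yy.shape, xx.shape)
```

The result array (the field itself) is still full size either way; the saving is in the coordinate arrays and in any intermediate arrays derived from them. Whether a particular plotting call, in particular the Basemap methods in the traceback, accepts row/column or 1-D coordinates is not checked here and would need to be tried against the actual script.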