From: Robert Cimrman <cimrman3@nt...>  20090713 18:39:18
Attachments:
times.png

Hi all, I would like to use griddata() to interpolate a function given at specified points of a bunch of other points. While the method works well, it slows down considerably as the number of points to interpolate to increases. The dependence of time/(number of points) is nonlinear (see the attachment)  it seems that while the Delaunay trinagulation itself is fast, I wonder how to speedup the interpolation. The docstring says, that it is based on "natural neighbor interpolation"  how are the neighbors searched? Does it use the kdtrees like scipy.spatial? I have a very good experience with scipy.spatial performance. Also, is there a way of reusing the triangulation when interpolating several times using the same grid? cheers, r. ps: no natgrid In [9]: mpl.__version__ Out[9]: '0.98.5.3' 
From: Robert Cimrman <cimrman3@nt...>  20090713 18:39:18
Attachments:
times.png

Hi all, I would like to use griddata() to interpolate a function given at specified points of a bunch of other points. While the method works well, it slows down considerably as the number of points to interpolate to increases. The dependence of time/(number of points) is nonlinear (see the attachment)  it seems that while the Delaunay trinagulation itself is fast, I wonder how to speedup the interpolation. The docstring says, that it is based on "natural neighbor interpolation"  how are the neighbors searched? Does it use the kdtrees like scipy.spatial? I have a very good experience with scipy.spatial performance. Also, is there a way of reusing the triangulation when interpolating several times using the same grid? cheers, r. ps: no natgrid In [9]: mpl.__version__ Out[9]: '0.98.5.3' 
From: Jeff Whitaker <jswhit@fa...>  20090713 20:11:52

Robert Cimrman wrote: > Hi all, > > I would like to use griddata() to interpolate a function given at > specified points of a bunch of other points. While the method works > well, it slows down considerably as the number of points to > interpolate to increases. > > The dependence of time/(number of points) is nonlinear (see the > attachment)  it seems that while the Delaunay trinagulation itself is > fast, I wonder how to speedup the interpolation. The docstring says, > that it is based on "natural neighbor interpolation"  how are the > neighbors searched? Does it use the kdtrees like scipy.spatial? I > have a very good experience with scipy.spatial performance. > > Also, is there a way of reusing the triangulation when interpolating > several times using the same grid? > > cheers, > r. Robert: The griddata function uses the delaunay module, which is a straight copy of Robert Kern's delaunay scikit. No one here is that familiar with the internals of delaunay, so I'd suggest you either ask Robert, or dig into the source code yourself (which is here: http://scipy.org/scipy/scikits/browser/trunk/delaunay). Jeff  Jeffrey S. Whitaker Phone : (303)4976313 Meteorologist FAX : (303)4976449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker@... 325 Broadway Office : Skaggs Research Cntr 1D113 Boulder, CO, USA 803033328 Web : http://tinyurl.com/5telg 
From: Robert Kern <robert.kern@gm...>  20090714 17:18:26

On 20090713 13:20, Robert Cimrman wrote: > Hi all, > > I would like to use griddata() to interpolate a function given at > specified points of a bunch of other points. While the method works > well, it slows down considerably as the number of points to interpolate > to increases. > > The dependence of time/(number of points) is nonlinear (see the > attachment)  it seems that while the Delaunay trinagulation itself is > fast, I wonder how to speedup the interpolation. The docstring says, > that it is based on "natural neighbor interpolation"  how are the > neighbors searched? Using the Delaunay triangulation. The "natural neighbors" of an interpolation point are those points participating in triangles in the Delaunay triangulation whose circumcircles include the interpolation point. The triangle that encloses the interpolation point is found by a standard walking procedure, then the neighboring triangles (natural or otherwise) are explored in a breadthfirst search around the starting triangle to find the natural neighbors. Unfortunately, griddata() uses the unstructuredinterpolationpoints API rather than the more efficient gridinterpolationpoints API. In the former, each interpolation point uses the lastfound enclosing triangle as the start of the walking search. This works well where adjacent interpolation points are close to each other. This is not the case at the ends of the grid rows. The latter API is smarter and starts a new row of the grid with the triangle from the triangle from the *start* of the previous row rather than the end. I suspect this is largely the cause of the poor performance. > Does it use the kdtrees like scipy.spatial? I have > a very good experience with scipy.spatial performance. > > Also, is there a way of reusing the triangulation when interpolating > several times using the same grid? One would construct a Triangulation() object with the (x,y) data points, get a new NNInterpolator() object using the .nn_interpolator(z) method for each new z data set, and then interpolate your grid on the NNInterpolator.  Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."  Umberto Eco 
From: Robert Cimrman <cimrman3@nt...>  20090714 17:54:38

Robert Kern wrote: > On 20090713 13:20, Robert Cimrman wrote: >> Hi all, >> >> I would like to use griddata() to interpolate a function given at >> specified points of a bunch of other points. While the method works >> well, it slows down considerably as the number of points to interpolate >> to increases. >> >> The dependence of time/(number of points) is nonlinear (see the >> attachment)  it seems that while the Delaunay trinagulation itself is >> fast, I wonder how to speedup the interpolation. The docstring says, >> that it is based on "natural neighbor interpolation"  how are the >> neighbors searched? > > Using the Delaunay triangulation. The "natural neighbors" of an interpolation > point are those points participating in triangles in the Delaunay triangulation > whose circumcircles include the interpolation point. The triangle that encloses > the interpolation point is found by a standard walking procedure, then the > neighboring triangles (natural or otherwise) are explored in a breadthfirst > search around the starting triangle to find the natural neighbors. I see, thanks for the explanation. The walking procedure is what is described e.g. in [1], right? (summary; starting from a random triangle, a line is made connecting that triangle with the interpolation point, and triangles along that line are probed.) [1] http://www.geom.uiuc.edu/software/cglist/GeomDir/ptloc96.ps.gz > Unfortunately, griddata() uses the unstructuredinterpolationpoints API rather > than the more efficient gridinterpolationpoints API. In the former, each > interpolation point uses the lastfound enclosing triangle as the start of the > walking search. This works well where adjacent interpolation points are close to > each other. This is not the case at the ends of the grid rows. The latter API is > smarter and starts a new row of the grid with the triangle from the triangle > from the *start* of the previous row rather than the end. I suspect this is > largely the cause of the poor performance. Good to know, I will try to pass the points in groups of close points. >> Does it use the kdtrees like scipy.spatial? I have >> a very good experience with scipy.spatial performance. >> >> Also, is there a way of reusing the triangulation when interpolating >> several times using the same grid? > > One would construct a Triangulation() object with the (x,y) data points, get a > new NNInterpolator() object using the .nn_interpolator(z) method for each new z > data set, and then interpolate your grid on the NNInterpolator. So if the above fails, I can bypass griddata() by using the delaunay module directly, good. thank you, r. 
From: Robert Kern <robert.kern@gm...>  20090714 18:48:42

On 20090714 12:52, Robert Cimrman wrote: > Robert Kern wrote: >> On 20090713 13:20, Robert Cimrman wrote: >>> Hi all, >>> >>> I would like to use griddata() to interpolate a function given at >>> specified points of a bunch of other points. While the method works >>> well, it slows down considerably as the number of points to interpolate >>> to increases. >>> >>> The dependence of time/(number of points) is nonlinear (see the >>> attachment)  it seems that while the Delaunay trinagulation itself is >>> fast, I wonder how to speedup the interpolation. The docstring says, >>> that it is based on "natural neighbor interpolation"  how are the >>> neighbors searched? >> Using the Delaunay triangulation. The "natural neighbors" of an interpolation >> point are those points participating in triangles in the Delaunay triangulation >> whose circumcircles include the interpolation point. The triangle that encloses >> the interpolation point is found by a standard walking procedure, then the >> neighboring triangles (natural or otherwise) are explored in a breadthfirst >> search around the starting triangle to find the natural neighbors. > > I see, thanks for the explanation. The walking procedure is what is > described e.g. in [1], right? (summary; starting from a random triangle, > a line is made connecting that triangle with the interpolation point, > and triangles along that line are probed.) > > [1] http://www.geom.uiuc.edu/software/cglist/GeomDir/ptloc96.ps.gz Yes. >>> Does it use the kdtrees like scipy.spatial? I have >>> a very good experience with scipy.spatial performance. >>> >>> Also, is there a way of reusing the triangulation when interpolating >>> several times using the same grid? >> One would construct a Triangulation() object with the (x,y) data points, get a >> new NNInterpolator() object using the .nn_interpolator(z) method for each new z >> data set, and then interpolate your grid on the NNInterpolator. > > So if the above fails, I can bypass griddata() by using the delaunay > module directly, good. Yes. griddata is a fairly light wrapper that exists mainly to sanitize inputs and allow use of the natgrid implementation easily.  Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."  Umberto Eco 
From: DenisB <denisbzpy@t...>  20090721 16:32:20

Robert C, Robert K, folks, messing with the nice delaunay/testfuncs.py to time linear_interpolate_grid nn_interpolate_grid and nn_interpolate_unstructured in _delaunay, I see linear ~ 100 times faster than the nn_ s: # from: trigrid Ntri=1000 Ngrid=100 run: 21 Jul 2009 17:33 mac 10.4.11 ppc time: 0.027 sec trigrid: build Triangulation 1000 time: 0.0059 sec trigrid 100 "linear" corners: 0 1 2 1 time: 0.5 sec trigrid 100 "nn_grid" corners: 0 1 2 1 time: 0.49 sec trigrid 100 "nn_unstruct" corners: 0 1 2 1 Correct me: if all 3 methods do gridpointtotriangle in the same way, then the huge diff is in findneighboringtriangles (6 on average ?), not in gridpointtotriangle ? This is with the _delaunay.so that comes with the mac 98.5.3 egg, however that was compiled (O3 ?) What to do ? 1) does it matter, how many people care ? (all who believe in telekinesis, raise my right hand) 2) natgrid ? don't see it in matplotlib.sf.net 3) stick with fast linear, smooth the triangle planes a la 3t^2  2t^3 or fancier smoothing In any case, add griddata( ... method = "linear" / "nn" ... ) so users have a choice. Can a real user or two tell us about the flow, with some rough numbers for Ntri Ngrid Npix  Ntri = nr original sample points, say 1000 Ngrid 100 x 100 Npix 800 x 600 ? (Ntri > Ngrid slowly and accurately, then Ngrid > Npix w fast inaccurate image interpolation ? hmm.) cheers  denis  View this message in context: http://www.nabble.com/speedingupgriddata%28%29tp24467055p24591133.html Sent from the matplotlib  users mailing list archive at Nabble.com. 
From: Jeff Whitaker <jswhit@fa...>  20090723 12:15:41

DenisB wrote: > Robert C, Robert K, folks, > > messing with the nice delaunay/testfuncs.py to time > linear_interpolate_grid nn_interpolate_grid and nn_interpolate_unstructured > in _delaunay, I see linear ~ 100 times faster than the nn_ s: > > # from: trigrid Ntri=1000 Ngrid=100 run: 21 Jul 2009 17:33 mac 10.4.11 ppc > > time: 0.027 sec trigrid: build Triangulation 1000 > time: 0.0059 sec trigrid 100 "linear" corners: 0 1 2 1 > time: 0.5 sec trigrid 100 "nn_grid" corners: 0 1 2 1 > time: 0.49 sec trigrid 100 "nn_unstruct" corners: 0 1 2 1 > > Correct me: if all 3 methods do gridpointtotriangle in the same way, > then the huge diff is in findneighboringtriangles (6 on average ?), not in > gridpointtotriangle ? > > This is with the _delaunay.so that comes with the mac 98.5.3 egg, > however that was compiled (O3 ?) > > > What to do ? > > 1) does it matter, how many people care ? (all who believe in telekinesis, > raise my right hand) > > 2) natgrid ? don't see it in matplotlib.sf.net > > 3) stick with fast linear, smooth the triangle planes a la 3t^2  2t^3 or > fancier smoothing > > In any case, add griddata( ... method = "linear" / "nn" ... ) so users have > a choice. > Denis: I have added an 'interp' keyword to griddata (svn revision 7287) so you can choose the faster linear interpolation with interp='linear'. However, for linear interpolation the output grid is assumed to be regular with constant dx and dy (and this is *not* checked for by griddata). The natgrid toolkit is on the sf site under 'Files', then 'matplotlibtoolkits'. It is in fact slower than the Delaunay package included in matplotlib, but a little bit more robust for pathological cases. Jeff > Can a real user or two tell us about the flow, > with some rough numbers for Ntri Ngrid Npix  > Ntri = nr original sample points, say 1000 > Ngrid 100 x 100 > Npix 800 x 600 ? > (Ntri > Ngrid slowly and accurately, > then Ngrid > Npix w fast inaccurate image interpolation ? hmm.) > > cheers >  denis > > 
From: DenisB <denisbzpy@t...>  20090724 15:29:41

Jeff Whitaker wrote: > > > Denis: I have added an 'interp' keyword to griddata (svn revision 7287) > so you can choose the faster linear interpolation with interp='linear'. > > Thanks Jeff, that was quick. Do you also see linear waaaay faster than NN, factor 100 ?! (Fwiw, a quick run of the Mac Shark profiler shows lots of time in NaturalNeighbors::interpolate_one which uses stdlib stacks heavily  overkill for tiny stacks.) Did my last question on Ntri > Ngrid > Npix make any sense at all ? It would be nice if one could go straight from a triangulation to pixels  will ask AGG. cheers  denis  View this message in context: http://www.nabble.com/speedingupgriddata%28%29tp24467055p24646383.html Sent from the matplotlib  users mailing list archive at Nabble.com. 
From: Jeff Whitaker <jswhit@fa...>  20090724 16:04:21

DenisB wrote: > > Jeff Whitaker wrote: > >> Denis: I have added an 'interp' keyword to griddata (svn revision 7287) >> so you can choose the faster linear interpolation with interp='linear'. >> >> >> > > Thanks Jeff, > that was quick. Do you also see linear waaaay faster than NN, factor 100 > ?! > (Fwiw, a quick run of the Mac Shark profiler shows lots of time in > NaturalNeighbors::interpolate_one > which uses stdlib stacks heavily  overkill for tiny stacks.) > Denis: I see more like a factor of 3 or 4, not 100. > Did my last question on Ntri > Ngrid > Npix make any sense at all ? > Not really, but I haven't had the time to think about it very hard. Jeff > It would be nice if one could go straight from a triangulation to pixels  > will ask AGG. > > cheers >  denis > > >  Jeffrey S. Whitaker Phone : (303)4976313 Meteorologist FAX : (303)4976449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker@... 325 Broadway Office : Skaggs Research Cntr 1D113 Boulder, CO, USA 803033328 Web : http://tinyurl.com/5telg 
From: DenisB <denisbzpy@t...>  20090727 11:33:44

<stack> in interpolate_one might be the time hog; I see claims that plain <vector> is faster. StackVector<int, 128> in http://stackoverflow.com/questions/354442/lookingforcstllikevectorclassbutusingstackstorage looks nice, but I haven't timed it myself. cheers  denis  View this message in context: http://www.nabble.com/speedingupgriddata%28%29tp24467055p24679147.html Sent from the matplotlib  users mailing list archive at Nabble.com. 