Learn how easy it is to sync an existing GitHub or Google Code repo to a SourceForge project!

## matplotlib-users

 [Matplotlib-users] contribution offer: griddata with gaussian average From: Václav Šmilauer - 2009-10-04 15:11:39 ```Hi, about a year ago I developed for my own purposes a routine for averaging irregularly-sampled data using gaussian average. I would like to contribute it to matplotlib, after a clean-up, if there is interest. First it stores all data points in a (regular) grid. One can ask for value at arbitrary point within the grid range: based on stDev parameter, all data points within some distance (user-settable) are weighted using the normal distribution function and then the average is returned. Storing data in grid avoids searching all data points every time, only those within certain grid-range will be used. It would be relatively easy to extend to another distribution types (lines 120-130), such as elliptic-normal distribution, with different stDev in each direction. The code is here: http://bazaar.launchpad.net/~yade-dev/yade/trunk/annotate/head% 3A/lib/smoothing/WeightedAverage2d.hpp I should be able to convert it to accept numpy arrays, wrap using Python API instead of boost::python and write some documentation. Let me know if matplotlib-devel list would be more appropriate for this discussion. Cheers, Vaclav ```
 Re: [Matplotlib-users] contribution offer: griddata with gaussian average From: Christopher Barker - 2009-10-04 20:27:42 ```Václav Šmilauer wrote: > about a year ago I developed for my own purposes a routine for averaging > irregularly-sampled data using gaussian average. is this similar to Kernel Density estimation? http://www.scipy.org/doc/api_docs/SciPy.stats.kde.gaussian_kde.html In any case, it sounds more like an appropriate contribution to scipy than MPL. > First it stores all data points in a (regular) grid. What if your data points are not on a regular grid? Wouldn't this be a way to interpolate onto such a grid? > Storing data in grid avoids searching all data points every > time, only those within certain grid-range will be used. another option would be to store the points in some sort of spatial index, like an r-tree. > I should be able to convert it to accept numpy arrays, key. > wrap using Python > API instead of boost::python I'd consider Cython for that. sounds like a useful module, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@... ```
 Re: [Matplotlib-users] contribution offer: griddata with gaussian average From: Robert Kern - 2009-10-04 21:25:20 ```On 2009-10-04 15:27 PM, Christopher Barker wrote: > Václav Šmilauer wrote: > >> about a year ago I developed for my own purposes a routine for averaging >> irregularly-sampled data using gaussian average. > > is this similar to Kernel Density estimation? > > http://www.scipy.org/doc/api_docs/SciPy.stats.kde.gaussian_kde.html No. It is probably closer to radial basis function interpolation (in fact, it almost certainly is a form of RBFs): http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html#id1 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco ```
 Re: [Matplotlib-users] contribution offer: griddata with gaussian average From: Ryan May - 2009-10-05 02:18:35 ```On Sun, Oct 4, 2009 at 4:21 PM, Robert Kern wrote: > On 2009-10-04 15:27 PM, Christopher Barker wrote: >> Václav Šmilauer wrote: >> >>> about a year ago I developed for my own purposes a routine for averaging >>> irregularly-sampled data using gaussian average. >> >> is this similar to Kernel Density estimation? >> >> http://www.scipy.org/doc/api_docs/SciPy.stats.kde.gaussian_kde.html > > No. It is probably closer to radial basis function interpolation (in fact, it > almost certainly is a form of RBFs): > > http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html#id1 Except in radial basis function interpolation, you solve for the weights that give the original values at the original data points. Here, it's just a inverse-distance weighted average, where the weights are chosen using an exp(-x^2/A) relation. There's a huge difference between the two when you're dealing with data with noise. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma ```
 Re: [Matplotlib-users] contribution offer: griddata with gaussian average From: Robert Kern - 2009-10-05 02:20:53 ```Ryan May wrote: > On Sun, Oct 4, 2009 at 4:21 PM, Robert Kern wrote: >> On 2009-10-04 15:27 PM, Christopher Barker wrote: >>> Václav Šmilauer wrote: >>> >>>> about a year ago I developed for my own purposes a routine for averaging >>>> irregularly-sampled data using gaussian average. >>> is this similar to Kernel Density estimation? >>> >>> http://www.scipy.org/doc/api_docs/SciPy.stats.kde.gaussian_kde.html >> No. It is probably closer to radial basis function interpolation (in fact, it >> almost certainly is a form of RBFs): >> >> http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html#id1 > > Except in radial basis function interpolation, you solve for the > weights that give the original values at the original data points. > Here, it's just a inverse-distance weighted average, where the weights > are chosen using an exp(-x^2/A) relation. There's a huge difference > between the two when you're dealing with data with noise. Fair point. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco ```
 Re: [Matplotlib-users] contribution offer: griddata with gaussian average From: Christopher Barker - 2009-10-05 21:26:22 ```VáclavŠmilauer wrote: > I checked the official terminology, it is a kernel average smoother > (in the sense of [1]) with special weight function exp(-(x-x0)^2/const), > operating on irregularly-spaced data in 2d. > > I am not sure if that is the same as what scipy.stats.kde.gaussian_kde does, > the documentation is terse. Can I be enlightened here? It looks like they are both part of a similar class of methods: http://en.wikipedia.org/wiki/Kernel_%28statistics%29 As such, it makes sense to me to have them all in SciPy, sharing code and API. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@... ```
 Re: [Matplotlib-users] contribution offer: griddata with gaussian average From: Eric Firing - 2009-10-04 23:51:18 ```Václav Šmilauer wrote: > Hi, > > about a year ago I developed for my own purposes a routine for averaging > irregularly-sampled data using gaussian average. I would like to > contribute it to matplotlib, after a clean-up, if there is interest. This sounds like a nice thing to have, but I am wondering whether it should go into scipy instead of mpl. On the other hand, I do see the rationale for simply putting it in mpl so it can be used for contouring etc. without adding an external dependency. > > First it stores all data points in a (regular) grid. One can ask for > value at arbitrary point within the grid range: based on stDev > parameter, all data points within some distance (user-settable) are > weighted using the normal distribution function and then the average is > returned. Storing data in grid avoids searching all data points every > time, only those within certain grid-range will be used. > Good idea. > It would be relatively easy to extend to another distribution types > (lines 120-130), such as elliptic-normal distribution, with different > stDev in each direction. This sounds a lot like "optimal interpolation". Comments? Certainly, in many fields it is essential to be able to specify different std for x and y. > > The code is here: > http://bazaar.launchpad.net/~yade-dev/yade/trunk/annotate/head% > 3A/lib/smoothing/WeightedAverage2d.hpp > > I should be able to convert it to accept numpy arrays, wrap using Python > API instead of boost::python and write some documentation. Let me know > if matplotlib-devel list would be more appropriate for this discussion. matplotlib-users is fine for checking for interest; it may make sense to move more technical discussion to -devel. But I will ignore that for now. Is wrapping with cython a reasonable option? I don't think we have any cython in mpl at present, but it seems to me like the way to go for many things, especially new things. It is readable, flexible, and promises to make the eventual transition to Python 3.1 painless for the components that use it. I realize it is C-oriented, not C++ oriented, but my understanding is that the wrapping of C++ is still reasonable in many cases. (I have not tried it.) Eric > > Cheers, Vaclav > > > ------------------------------------------------------------------------------ > Come build with us! The BlackBerry® Developer Conference in SF, CA > is the only developer event you need to attend this year. Jumpstart your > developing skills, take BlackBerry mobile applications to market and stay > ahead of the curve. Join us from November 9-12, 2009. Register now! > http://p.sf.net/sfu/devconf > _______________________________________________ > Matplotlib-users mailing list > Matplotlib-users@... > https://lists.sourceforge.net/lists/listinfo/matplotlib-users ```
 Re: [Matplotlib-users] contribution offer: griddata with gaussian average From: VáclavŠmilauer - 2009-10-05 21:00:25 ```> >> about a year ago I developed for my own purposes a routine for averaging > >> irregularly-sampled data using gaussian average. > > > > is this similar to Kernel Density estimation? > > > > http://www.scipy.org/doc/api_docs/SciPy.stats.kde.gaussian_kde.html > > No. It is probably closer to radial basis function interpolation (in fact, it > almost certainly is a form of RBFs): > > http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html#id1 I checked the official terminology, it is a kernel average smoother (in the sense of [1]) with special weight function exp(-(x-x0)^2/const), operating on irregularly-spaced data in 2d. I am not sure if that is the same as what scipy.stats.kde.gaussian_kde does, the documentation is terse. Can I be enlightened here? Cheers, Vaclav [1] http://en.wikipedia.org/wiki/Kernel_smoothing ```
 Re: [Matplotlib-users] contribution offer: griddata with gaussian average From: Robert Kern - 2009-10-05 21:25:17 ```On 2009-10-05 15:54 PM, VáclavŠmilauer wrote: > I am not sure if that is the same as what scipy.stats.kde.gaussian_kde does, > the documentation is terse. Can I be enlightened here? gaussian_kde does kernel density estimation. While many of the intermediate computations are similar, they have entirely different purposes. http://en.wikipedia.org/wiki/Kernel_density_estimation -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco ```