From: Anthony S. <sc...@gm...> - 2013-05-10 16:29:51
[dropping scipy-user]

Hello Andreas,

PyTables is a great option, and using compression (zlib, blosc, etc.) will probably help. Additionally, I would note that since your values lie in [0, 100], you can probably get away with using 32-bit floats rather than 64-bit floats. This size reduction will speed things up, but you probably don't want to go down to 16-bit floats.

I would recommend that you store your dataset on disk and then use PyTables Expressions [1,2] with the "out" argument to keep your results on disk as well. If this strategy fails because you need to look at multiple indexes in the same array simultaneously, then I would use partially offset iterators as described in this thread [3].

In both cases, since iterators are automatically chunked, you never read in the whole dataset at one time, and what you are interpolating can be as large as you want :).

Let us know if you have further specific questions.

Be Well
Anthony

1. http://pytables.github.io/usersguide/libref.html#the-expr-class-a-general-purpose-expression-evaluator
2. https://github.com/scopatz/hdf5-is-for-lovers/blob/master/hdf5-is-for-lovers.pdf?raw=true
3. "Nested Iteration of HDF5 using PyTables" http://blog.gmane.org/gmane.comp.python.pytables.user/month=20130101

On Fri, May 10, 2013 at 4:58 AM, Andreas Hilboll <li...@hi...> wrote:
> Hi,
>
> I'll have to code multilinear interpolation in n dimensions, n~7. My
> data space is quite large, ~10**9 points. The values are given on a
> rectangular (but not square) grid. The values are numbers in a range of
> approx. [0.0, 100.0].
>
> The challenge is to do this efficiently, and it would be great if the
> whole thing would be able to run fast on a machine with only 8G (or
> better 4G) RAM.
>
> A common task will be to interpolate 10**6 points, which shouldn't take
> too long.
>
> Any ideas on how to do this efficiently are welcome:
>
> * which dtype to use?
> * is using pytables/blosc an option? How can this be integrated in the
>   interpolation?
> * you name it ... ;)
>
> Cheers, Andreas.
>
> _______________________________________________
> Pytables-users mailing list
> Pyt...@li...
> https://lists.sourceforge.net/lists/listinfo/pytables-users
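
A rough back-of-the-envelope check of the dtype advice in the reply above can be sketched with only the Python standard library. It assumes the ~10**9 grid points from the original question and looks at raw, uncompressed sizes only. The sample value 99.97 and the helper names `grid_size_gib` and `roundtrip` are hypothetical and chosen purely for illustration; this is not PyTables code.

```python
import struct

N_POINTS = 10**9  # grid size quoted in the original question


def grid_size_gib(bytes_per_value, n_points=N_POINTS):
    """Raw (uncompressed) size of the grid in GiB for a given item size."""
    return n_points * bytes_per_value / 2**30


def roundtrip(value, fmt):
    """Store a Python float in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# struct format codes: 'e' = 16-bit, 'f' = 32-bit, 'd' = 64-bit float
FORMATS = {"float16": "<e", "float32": "<f", "float64": "<d"}

sample = 99.97  # hypothetical value near the top of the [0, 100] range

# Raw storage needed for the whole grid, per float width.
sizes = {name: grid_size_gib(struct.calcsize(fmt))
         for name, fmt in FORMATS.items()}

# Absolute error introduced by storing the sample at each float width.
errors = {name: abs(roundtrip(sample, fmt) - sample)
          for name, fmt in FORMATS.items()}

for name in FORMATS:
    print(f"{name}: {sizes[name]:6.2f} GiB raw, "
          f"rounding error at {sample}: {errors[name]:.2e}")
```

Under these assumptions, 64-bit floats need roughly 7.5 GiB for the raw grid and 32-bit floats roughly 3.7 GiB, which is why the narrower dtype matters on a 4G or 8G machine. 16-bit floats can only represent values between 64 and 128 in steps of 0.0625, which is probably too coarse for interpolation inputs in this range.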