From: Bradford C. <bra...@gm...> - 2007-11-26 19:26:56
I started off trying to find a way to use numpy with pytables and the timeseries package in the scipy sandbox. I have been collaborating with the guys who wrote the timeseries package to figure out how we can do it. Here is my latest email to them ...

After last night I have cleared away a little more of my pytables ignorance ...

Datetimes in pytables can be stored in the hdf5 time format, which means they must be converted to floats (double-precision floats in our case). Alternatively, we can store datetime objects as strings.

So we have a couple of options.

1) We can try to read/write/append from numpy arrays to pytables EArrays. Since they are homogeneous, all values of all elements in the time series will need to be double-precision floats (which is required for most financial data anyway).

2) We can write/append the data as Tables, and read as numpy arrays, which gives more flexibility in the data model for time series elements. This model would also allow storing datetimes as strings. My prototype that uses this model is a bit more complex, because it requires declaring pytables IsDescription objects to define tables, as well as mapper functions between the table fields and the in-memory objects that are elements of the timeseries.

A couple of points with respect to the timeseries package, please give me your thoughts:

- I'm not sure that either of these fits well with the model in the timeseries package, which maintains the datetimes as a separate array.
- Dates need to be converted to floats for storage in pytables.

/brad

On Nov 26, 2007 3:33 AM, David Worrall <so...@av...> wrote:
> Hi Brad,
>
> On 26/11/2007, at 12:41 PM, Bradford Cross wrote:
>
> > Greetings,
> >
> > I have been working on a prototype for storing large amounts of
> > timeseries data in pytables.
> > .
> > Those experiments did not work out that great
>
> why not? Please describe the issues.
>
> > and led me to try storing timeseries data as Tables, with each row
> > representing an observation in the series; the first column is a
> > Time64Col and the rest represent the data model for each observation.
>
> I do something similar. But I do break it into separate tables where
> possible, i.e. where I don't have to do a matrix multiply across
> different tables.
>
> > I am curious what others' experiences are and whether I am headed
> > down a reasonable path.
> >
> > /brad
>
> David
>
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by: Microsoft
> > Defy all challenges. Microsoft(R) Visual Studio 2005.
> > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > _______________________________________________
> > Pytables-users mailing list
> > Pyt...@li...
> > https://lists.sourceforge.net/lists/listinfo/pytables-users
>
> _________________________________________________
> experimental polymedia: www.avatar.com.au
> Sonic Communications Research Group,
> University of Canberra: creative.canberra.edu.au/scrg/
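
For reference, the datetime-to-float conversion discussed in this thread could look roughly like the sketch below. It uses only the Python standard library and assumes that naive datetimes are in UTC. The helper names (datetime_to_float, float_to_datetime) and the sample row are hypothetical and are not part of pytables or the timeseries package.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the conversion: the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_to_float(dt):
    """Return seconds since the Unix epoch as a double-precision float.

    Assumption of this sketch: a naive datetime is treated as UTC.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return (dt - EPOCH).total_seconds()


def float_to_datetime(seconds):
    """Inverse of datetime_to_float; returns a timezone-aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


# Hypothetical observation (timestamp, price, volume). Every field is a
# float, so the row would fit a homogeneous float64 array.
stamp = datetime(2007, 11, 26, 19, 26, 56, tzinfo=timezone.utc)
row = (datetime_to_float(stamp), 101.25, 5000.0)
restored = float_to_datetime(row[0])
```

Floats produced this way are the kind of value one would expect to place in a Time64Col or a float64 EArray. A double holds present-day epoch seconds to better than a microsecond, so whole-second and microsecond timestamps should survive the round trip, but anything finer is lost.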