From: Jason M. <jk...@uc...> - 2012-10-21 20:55:48
This is a PyTables-generated file with data collected from vehicle
(bicycle) dynamics measurements. Metadata are in tables and time series
are stored in array objects.

http://mae.ucdavis.edu/~biosport/InstrumentedBicycleData/InstrumentedBicycleData.h5.bz2

It is about 308 MB compressed and 610 MB uncompressed.

Jason

On Sun, Oct 21, 2012 at 1:01 PM, Andy Wilson <wil...@gm...> wrote:

> On Sun, Oct 21, 2012 at 10:41 AM, Francesc Alted <fa...@py...> wrote:
>
> > Hi,
> >
> > I'm going to give a tutorial on PyTables next Thursday during the PyData
> > conference in New York (http://nyc2012.pydata.org/) and I'd like to use
> > some real life data files. So, if you have some public repository with
> > data generated with PyTables, please tell me. I'm looking for files
> > that are not very large (< 1GB), and that use the Table object
> > significantly. A small description of the data included will be more
> > than welcome too!
> >
> > Thanks!
> >
> > --
> > Francesc Alted
> >
>
> Hi Francesc.
>
> I've been working on a library for accessing climatology data that
> uses pytables to cache data from the USGS. It could easily be used to
> create a sample dataset for some area of interest. File size is
> determined by how much data gets queried.
>
> The general layout is:
>
> /usgs/sites
>   - the sites table contains information and metadata about a site
>
> /usgs/values/<AGENCY>/<SITE_CODE>/<PARAMETER_CODE>
>   - a table containing all the timeseries data for each site and
>     parameter is created as data are queried
>   - parameter codes are a bit obscure, but a dict with descriptive
>     metadata is stashed at table.attrs.variable
>   - the datetime column has a CSIndex on it and is stored as a string
>     because some sites have data prior to the year 1901
>   - pretty inefficient in terms of disk space (lots of large-ish string
>     columns) because it handles a very general class of data types
>
> Here's what the code would look like to download and create the hdf5
> file for 10 random sites in New York:
>
>     import ulmo
>
>     # the default location for the hdf5 file is OS dependent, so provide
>     # the path you want to use
>     hdf5_file_path = './usgs_data.h5'
>
>     # get list of sites in NY
>     ulmo.usgs.pytables.update_site_list(state_code='NY', path=hdf5_file_path)
>     sites = ulmo.usgs.pytables.get_sites(path=hdf5_file_path)
>
>     # download data for a few random sites
>     # (list() so the slice also works where keys() returns a view)
>     for site in list(sites.keys())[:10]:
>         ulmo.usgs.pytables.update_site_data(site, path=hdf5_file_path)
>
> The project is on github: https://github.com/swtools/ulmo
> and the code that does all the pytables stuff (including the table
> descriptions) is here:
> https://github.com/swtools/ulmo/blob/master/ulmo/usgs/pytables.py
>
> -andy

--
Jason K. Moore, Ph.D.
Personal Website <http://biosport.ucdavis.edu/lab-members/jason-moore>
Sports Biomechanics Lab <http://biosport.ucdavis.edu>, UC Davis
Davis Open Science <http://daviswiki.org/Davis_Open_Science>
Google Voice: +01 530-601-9791
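
The layout Andy describes in the quoted message can be illustrated with a small standard-library sketch. It is not taken from ulmo: the helper name values_table_path, the agency, site and parameter codes, and the ISO 8601 string format are all assumptions made for illustration (the message only says the datetime column is stored as a string). It shows how a node path under /usgs/values could be assembled, and why a sorted index on a string column can still give chronological order for data from before 1901.

```python
from datetime import datetime


def values_table_path(agency, site_code, parameter_code):
    """Build the node path for one site/parameter table (hypothetical helper)."""
    return "/usgs/values/{}/{}/{}".format(agency, site_code, parameter_code)


# The codes below are made up for illustration only.
path = values_table_path("USGS", "01234567", "00060")

# Timestamps on either side of 1901, written as ISO 8601 strings
# (assumed format; the original message does not name one).
stamps = [datetime(1950, 6, 1), datetime(1895, 3, 1, 12), datetime(1901, 12, 31)]
as_strings = [s.isoformat() for s in stamps]

# Sorting the strings lexicographically matches sorting the datetimes,
# because the fields are fixed-width and ordered from year down to second.
sorted_strings = sorted(as_strings)
chronological = [s.isoformat() for s in sorted(stamps)]

print(path)
print(sorted_strings == chronological)
```

If the real column uses a different string format (for example day-first dates), lexicographic order would no longer match chronological order, so the sketch only holds under the stated assumption.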