From: Sreeroop <inv...@ya...> - 2012-06-20 22:37:39

Thanks, Anthony. And Francesc, you (and obviously the team as well!) have made my life easier, and I am sure hundreds and thousands of users would agree. A big thank you! :)

best,
Sreeroop
From: Anthony S. <sc...@gm...> - 2012-06-20 20:41:23

Hello Sreeroop,

Yes, the merged version with all of the former Pro features is now in v2.3.1. Just use this version from now on - and remember to thank Francesc ;)

Be Well
Anthony
From: Sreeroop <inv...@ya...> - 2012-06-20 20:38:41

Hi there,

I have been using the PyTables 2.2.1 BSD-licensed version, which has immensely boosted performance in handling huge data sets. I am now planning to use all the good stuff that the Pro version provides (indexing, improved caching, etc.). At the top of the Pro page, I see that since July 2011 the products are merged, there is no more Pro, and everything is now under the BSD license.

Could you please advise whether I can use the latest 2.3.1 release and find all the features I am looking for?

Thanks,
Sreeroop
From: Aquil H. A. <aqu...@gm...> - 2012-06-19 17:42:35

Thanks. I think that will do it!
From: Anthony S. <sc...@gm...> - 2012-06-19 17:12:05

On Tue, Jun 19, 2012 at 11:07 AM, Aquil H. Abdullah <aqu...@gm...> wrote:
> Anyway, I am a little bit confused, because I don't understand the
> difference between a Time32Col() and an Int32Col() in the example that
> I've written. Do you have a better example?

So in terms of how they are stored on disk, Time32Col and Int32Col are basically the same thing. The difference is that Time32Col values are flagged as being times, so that other processes will know to interpret them as timestamps rather than plain old ints. The same thing goes for Time64Col and Float64Col.

Now as an example, to capture the full information you want, you will effectively have to store two columns: one for the timestamp and one for the time zone. So your description will look like:

    desc = {'timestamp': Time64Col(pos=1), 'tz': StringCol(3, pos=2)}

Then the entries will look like tuples which match the desc, like the following:

    In [35]: n = datetime.utcnow().replace(tzinfo=pytz.timezone('US/Central'))

    In [36]: time.mktime(n.timetuple()) + n.microsecond * 1e-6, str(n.tzname())
    Out[36]: (1340147092.873432, 'CST')

Basically, you have to store the tz info next to the timestamp. I hope this helps!

Be Well
Anthony
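[Editorial example] The two-column scheme above can be exercised without touching an HDF5 file: the conversion to and from the (timestamp, tz name) tuple is pure stdlib. A minimal sketch, assuming Python 3's `timezone.utc` in place of `pytz.UTC`; the helper names `to_row`/`from_row` are hypothetical, not part of the PyTables API:

```python
import calendar
from datetime import datetime, timezone

def to_row(dt):
    """Convert an aware datetime to the (timestamp, tz_name) tuple
    that would fill the Time64Col/StringCol description above."""
    ts = calendar.timegm(dt.utctimetuple()) + dt.microsecond * 1e-6
    return ts, dt.tzname() or ""

def from_row(ts, tz_name):
    """Rebuild a UTC datetime from a stored row; the tz name survives
    only as a label, exactly as in the two-column scheme."""
    return datetime.fromtimestamp(ts, tz=timezone.utc), tz_name

# Same instant as in the thread above.
dt = datetime(2012, 6, 19, 15, 22, 19, 892159, tzinfo=timezone.utc)
row = to_row(dt)                 # (1340119339.892159, 'UTC')
restored, label = from_row(*row)
```

The float timestamp is what a Time64Col would hold; the 3-character label is what the StringCol would hold.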
From: Aquil H. A. <aqu...@gm...> - 2012-06-19 16:08:07

Hello Anthony,

Thanks for your reply. I did a little bit more searching and found the discussion "datetime objects on tables" (http://comments.gmane.org/gmane.comp.python.pytables.user/2469), which is a bit more recent. In it you list your order of preference for storing timestamps:

1. Create a numpy dtype or PyTables description that matches the structure of datetime to your desired precision and save them in a Table. If you want to save timezone information as well, I might add an extra length-3 string column and save the str representation of the tzinfo field ('UTC', 'KST', 'EDT').

[Could you give an example?]

The best I could come up with so far is to create a column of type tables.Time32Col() and then transform my datetime object into a timestamp and store it. For example:

    In [1]: import tables, pytz
    In [2]: from datetime import datetime, timedelta
    In [3]: from time import mktime
    In [4]: import calendar
    In [5]: dt = datetime.utcnow().replace(tzinfo=pytz.UTC)
    In [6]: dt
    Out[6]: datetime.datetime(2012, 6, 19, 15, 22, 19, 892159, tzinfo=<UTC>)
    In [7]: ts = calendar.timegm(dt.timetuple())
    In [8]: ts
    Out[8]: 1340119339

I can then store ts in my table and retrieve it with

    In [35]: datetime.utcfromtimestamp(tbl.cols.timestamp[0])
    Out[35]: datetime.datetime(2012, 6, 19, 15, 22, 19)

although at this point I've lost the timezone.

Anyway, I am a little bit confused, because I don't understand the difference between a Time32Col() and an Int32Col() in the example that I've written. Do you have a better example?

Thanks!
From: Anthony S. <sc...@gm...> - 2012-06-19 06:17:01

Hey Aquil,

Yes, the string method certainly works. The other thing you could do that isn't mentioned in that post is to have a table or 2D array whose first column is the float timestamp [1] and whose second column is a string repr of just the timezone name (or an int or float of the timezone's offset in seconds). This will likely be faster.

All of the strategies mentioned will work, but will have varying speeds. I personally prefer anything with a timestamp, since it relies on the canonical form.

Be Well
Anthony

PS I am sorry that you have to deal with timezones. They are a real pain!

1. http://pytables.github.com/usersguide/libref.html#tables.Time64Col
From: Anthony S. <sc...@gm...> - 2012-06-19 03:16:16

Hmmm. That all looks correct. Have you tried linking PyTables against the HDF5 from MacPorts? That has worked for me in the past...
From: Aquil H. A. <aqu...@gm...> - 2012-06-19 03:05:49

I need to store dates and timezone-aware datetimes in a PyTables table. Currently, I am storing values of those types as ISO 8601 strings. I ran across the thread "pytables for timeseries data" (http://osdir.com/ml/python.pytables.user/2007-11/msg00036.html), which leads me to believe that what I am doing is the best way to store these types of values, but I just wanted to check in case I am missing something.

Regards,

--
Aquil H. Abdullah
aqu...@gm...
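[Editorial example] The ISO 8601 approach described above keeps the UTC offset inside the string itself, so nothing is lost in the round trip. A minimal stdlib sketch, assuming Python 3.7+ for datetime.fromisoformat; the 25-character width one would give a PyTables StringCol is an assumption, not from the thread:

```python
from datetime import datetime, timezone, timedelta

# An aware datetime with a -04:00 offset (e.g. US Eastern daylight time).
dt = datetime(2012, 6, 18, 22, 5, 0, tzinfo=timezone(timedelta(hours=-4)))

# Serialize: this string is what would be written to a fixed-width
# string column (25 characters covers 'YYYY-MM-DDTHH:MM:SS+HH:MM'
# when there are no fractional seconds).
s = dt.isoformat()

# Deserialize: fromisoformat restores both the wall time and the offset.
back = datetime.fromisoformat(s)
```

Unlike the bare-timestamp scheme, the offset survives in the value itself, at the cost of string storage and slower comparisons.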
From: David D. <don...@gm...> - 2012-06-19 02:37:25
|
Hi Anthony, Thanks for the response. Installed HDF5 1.8.9 using the following flags for configure. `./configure --prefix=/usr/local --with-szlib=/Library/Frameworks/Python.framework/Versions/Current CPPFLAGS=-I/Library/Frameworks/Python.framework/Versions/Current/include LDFLAGS=-L/Library/Frameworks/Python.framework/Versions/Current/lib` Also, I had to modify the optimization flag for gcc-4 in order to pass the make check part (as noted on the HDF5 page). `Conversion Tests fail on Mac OS X 10.7 (Lion) Users have reported that when building HDF5, the conversion tests failed (make check) in dt_arith.chk. A workaround is to edit ./< HDF5source >/config/gnu-flags, search for PROD_CFLAGS under "gcc-4.*", and change the value for PROD_CFLAGS to "-O0".` Then: 'make' 'make check' 'sudo make install' Is there a better way? Is tables somehow having a hard time finding the HDF5 library do you think? Thanks! Best Regards, David On Sat, Jun 16, 2012 at 12:57 AM, Anthony Scopatz <sc...@gm...> wrote: > Hi David, > > How did you build / install HDF5? > > Be Well > Anthony > > On Fri, Jun 15, 2012 at 7:14 PM, David Donovan <don...@gm...> wrote: >> >> Hi Everyone, >> >> I am having problems running the tests for PyTables on Mac OS X Lion. >> I have tried HDF5 version 1.8.5 as well, but I still get the same issue. >> >> Any thoughts would be helpful... >> >> Thanks for any help you can provide. >> >> Best Regards, >> David Donovan >> >> >> impPython 2.7.1 (r271:86832, Jul 31 2011, 19:30:53) >> [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on >> darwin >> Type "help", "copyright", "credits" or "license" for more information. 
>> >>> import tables >> table>>> tables.test() >> >> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= >> PyTables version: 2.3.1 >> HDF5 version: 1.8.9 >> NumPy version: 1.7.0.dev-0c5f480 >> Numexpr version: 2.0.1 (not using Intel's VML/MKL) >> Zlib version: 1.2.5 (in Python interpreter) >> LZO version: 2.06 (Aug 12 2011) >> BZIP2 version: 1.0.6 (6-Sept-2010) >> Blosc version: 1.1.2 (2010-11-04) >> Cython version: 0.16 >> Python version: 2.7.1 (r271:86832, Jul 31 2011, 19:30:53) >> [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] >> Platform: darwin-i386 >> Byte-ordering: little >> Detected cores: 2 >> >> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= >> Performing only a light (yet comprehensive) subset of the test suite. >> If you want a more complete test, try passing the --heavy flag to this >> script >> (or set the 'heavy' parameter in case you are using tables.test() call). >> The whole suite will take more than 4 hours to complete on a relatively >> modern CPU and around 512 MB of main memory. >> >> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= >> >> .....................................................................................................................................FSegmentation >> fault: 11 >> >> >> ------------------------------------------------------------------------------ >> Live Security Virtual Conference >> Exclusive live event will cover all the ways today's security and >> threat landscape has changed and how IT managers can respond. Discussions >> will include endpoint security, mobile security and the latest in malware >> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >> _______________________________________________ >> Pytables-users mailing list >> Pyt...@li... 
>> https://lists.sourceforge.net/lists/listinfo/pytables-users > > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > |
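The gnu-flags edit quoted in the HDF5 note above (change PROD_CFLAGS to "-O0" under gcc-4.*) can be scripted instead of done by hand. The sketch below is illustrative only: the helper name `force_O0` and the sample file are invented here, and the regex assumes PROD_CFLAGS is written as a single-line quoted assignment, as in the quoted workaround; verify against the actual `config/gnu-flags` in your HDF5 source tree before building.

```python
# Sketch of the workaround quoted above: rewrite PROD_CFLAGS to "-O0"
# in HDF5's config/gnu-flags before running `make check`. Demonstrated
# on a stand-in file, since the real file's layout may differ between
# HDF5 releases.
import re
from pathlib import Path

def force_O0(gnu_flags: Path) -> None:
    # Replace any quoted PROD_CFLAGS value with "-O0".
    text = gnu_flags.read_text()
    gnu_flags.write_text(re.sub(r'(PROD_CFLAGS=)"[^"]*"', r'\1"-O0"', text))

sample = Path("gnu-flags.sample")
sample.write_text('    PROD_CFLAGS="-O3 -fomit-frame-pointer"\n')
force_O0(sample)
print(sample.read_text())  # the line now reads: PROD_CFLAGS="-O0"
```

In a real build this would be pointed at `<HDF5 source>/config/gnu-flags` before `./configure` and `make check`, after which the Mac OS X Lion conversion-test failure reportedly goes away.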
From: Anthony S. <sc...@gm...> - 2012-06-17 02:00:19
|
On Sat, Jun 16, 2012 at 8:50 PM, Andre' Walker-Loud <wal...@gm...>wrote: > Hi Anthony, > > I forgot to say Thanks! > > I tried using the _v_depth attr but that didn't give me an answer I > understood. for example, the _v_depth of f.root was 0, and the _v_depth of > my final data path was 2. But I have something that works now. > > also - help like this > > > Just one quick comment. You probably shouldn't test the string of the > type of the data. > > Use the builtin isinstance() instead: > > > > found_array = isinstance(data, tables.Array) > > is very helpful to me. I have not been properly trained in any > programming, I have just hacked as needed for work/research, so things like > this are not yet common for me to realize. > No worries, that is what we are here for ;) > > > Cheers, > > Andre > > > > > > On Jun 14, 2012, at 3:28 PM, Anthony Scopatz wrote: > > > On Thu, Jun 14, 2012 at 4:30 PM, Andre' Walker-Loud <wal...@gm...> > wrote: > > Hi Anthony, > > > > On Jun 14, 2012, at 11:30 AM, Anthony Scopatz wrote: > > > > > On Wed, Jun 13, 2012 at 8:23 PM, Andre' Walker-Loud < > wal...@gm...> wrote: > > > Hi All, > > > > > > Still trying to sort out a recursive walk through an hdf5 file using > pytables. > > > > > > I have an hdf5 file with an unknown depth of groups/nodes. > > > > > > I am trying to write a little function to walk down the tree (with > user input help) until a data file is found. > > > > > > I am hoping there is some function one can use to query whether you > have found simply a group/node or an actual numpy array of data. So I can > do something like > > > > > > if f.getNode('/',some_path) == "data_array": > > > return f.getNode('/',some_path), True > > > else: > > > return f.getNode('/',some_path), False > > > > > > where I have some function that if the second returned variable is > True, will recognize the file as data, where as if it is False, it will > query the user for a further path down the tree. 
> > > > > > > > > I suppose I could set this up with a try: except: but was hoping there > is some built in functionality to handle this. > > > > > > Yup, I think that you are looking for the File.walkNodes() method. > http://pytables.github.com/usersguide/libref.html#tables.File.walkNodes > > > > I wasn't sure how to use walkNodes in an interactive search. Here is > what I came up with so far (it works on test cases I have given it). > Comments are welcome. > > > > One feature I would like to add to the while loop in the second function > is an iterator counting the depth of the search. I want to compare this to > the maximum tree/node/group depth in the file, so if the search goes over > (maybe my collaborators used createTable instead of createArray) the while > loop won't run forever. > > > > Is there a function to ask the deepest recursion into the hdf5 file? > > > > Hello Andre, > > > > Every Node object has a _v_depth attr that you can access ( > http://pytables.github.com/usersguide/libref.html#tables.Node._v_depth). > In your walk function, therefore, you could test to see if you are over or > under the maximal value that you set. > > > > > > Cheers, > > > > Andre > > > > > > def is_array(file,path): > > data = file.getNode(path) > > if str(type(data)) == "<class 'tables.array.Array'>": > > found_array = True > > > > Just one quick comment. You probably shouldn't test the string of the > type of the data. 
> > Use the builtin isinstance() instead: > > > > found_array = isinstance(data, tables.Array) > > > > Be Well > > Anthony > > > > else: > > found_array = False > > for g in file.getNode(path): > > print g > > return data, found_array > > > > def pytable_walk(file): > > found_data = False > > path = '' > > while found_data == False: > > for g in file.getNode('/',path): > > print g > > path_new = raw_input('which node would you like?\n ') > > path = path+'/'+path_new > > data,found_data = is_array(file,path) > > return path,data > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------------ > > Live Security Virtual Conference > > Exclusive live event will cover all the ways today's security and > > threat landscape has changed and how IT managers can respond. Discussions > > will include endpoint security, mobile security and the latest in malware > > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > > _______________________________________________ > > Pytables-users mailing list > > Pyt...@li... > > https://lists.sourceforge.net/lists/listinfo/pytables-users > > > > > ------------------------------------------------------------------------------ > > Live Security Virtual Conference > > Exclusive live event will cover all the ways today's security and > > threat landscape has changed and how IT managers can respond. Discussions > > will include endpoint security, mobile security and the latest in malware > > threats. > http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/_______________________________________________ > > Pytables-users mailing list > > Pyt...@li... > > https://lists.sourceforge.net/lists/listinfo/pytables-users > > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. 
Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > |
From: Andre' Walker-L. <wal...@gm...> - 2012-06-17 01:50:44
|
Hi Anthony, I forgot to say Thanks! I tried using the _v_depth attr but that didn't give me an answer I understood. for example, the _v_depth of f.root was 0, and the _v_depth of my final data path was 2. But I have something that works now. also - help like this > Just one quick comment. You probably shouldn't test the string of the type of the data. > Use the builtin isinstance() instead: > > found_array = isinstance(data, tables.Array) is very helpful to me. I have not been properly trained in any programming, I have just hacked as needed for work/research, so things like this are not yet common for me to realize. Cheers, Andre On Jun 14, 2012, at 3:28 PM, Anthony Scopatz wrote: > On Thu, Jun 14, 2012 at 4:30 PM, Andre' Walker-Loud <wal...@gm...> wrote: > Hi Anthony, > > On Jun 14, 2012, at 11:30 AM, Anthony Scopatz wrote: > > > On Wed, Jun 13, 2012 at 8:23 PM, Andre' Walker-Loud <wal...@gm...> wrote: > > Hi All, > > > > Still trying to sort out a recursive walk through an hdf5 file using pytables. > > > > I have an hdf5 file with an unknown depth of groups/nodes. > > > > I am trying to write a little function to walk down the tree (with user input help) until a data file is found. > > > > I am hoping there is some function one can use to query whether you have found simply a group/node or an actual numpy array of data. So I can do something like > > > > if f.getNode('/',some_path) == "data_array": > > return f.getNode('/',some_path), True > > else: > > return f.getNode('/',some_path), False > > > > where I have some function that if the second returned variable is True, will recognize the file as data, where as if it is False, it will query the user for a further path down the tree. > > > > > > I suppose I could set this up with a try: except: but was hoping there is some built in functionality to handle this. > > > > Yup, I think that you are looking for the File.walkNodes() method. 
http://pytables.github.com/usersguide/libref.html#tables.File.walkNodes > > I wasn't sure how to use walkNodes in an interactive search. Here is what I came up with so far (it works on test cases I have given it). Comments are welcome. > > One feature I would like to add to the while loop in the second function is an iterator counting the depth of the search. I want to compare this to the maximum tree/node/group depth in the file, so if the search goes over (maybe my collaborators used createTable instead of createArray) the while loop won't run forever. > > Is there a function to ask the deepest recursion into the hdf5 file? > > Hello Andre, > > Every Node object has a _v_depth attr that you can access (http://pytables.github.com/usersguide/libref.html#tables.Node._v_depth). In your walk function, therefore, you could test to see if you are over or under the maximal value that you set. > > > Cheers, > > Andre > > > def is_array(file,path): > data = file.getNode(path) > if str(type(data)) == "<class 'tables.array.Array'>": > found_array = True > > Just one quick comment. You probably shouldn't test the string of the type of the data. > Use the builtin isinstance() instead: > > found_array = isinstance(data, tables.Array) > > Be Well > Anthony > > else: > found_array = False > for g in file.getNode(path): > print g > return data, found_array > > def pytable_walk(file): > found_data = False > path = '' > while found_data == False: > for g in file.getNode('/',path): > print g > path_new = raw_input('which node would you like?\n ') > path = path+'/'+path_new > data,found_data = is_array(file,path) > return path,data > > > > > > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. 
Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/_______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users |
From: Anthony S. <sc...@gm...> - 2012-06-16 07:58:03
|
Hi David, How did you build / install HDF5? Be Well Anthony On Fri, Jun 15, 2012 at 7:14 PM, David Donovan <don...@gm...> wrote: > Hi Everyone, > > I am having problems running the tests for PyTables on Mac OS X Lion. > I have tried HDF5 version 1.8.5 as well, but I still get the same issue. > > Any thoughts would be helpful... > > Thanks for any help you can provide. > > Best Regards, > David Donovan > > > impPython 2.7.1 (r271:86832, Jul 31 2011, 19:30:53) > [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on > darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import tables > table>>> tables.test() > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > PyTables version: 2.3.1 > HDF5 version: 1.8.9 > NumPy version: 1.7.0.dev-0c5f480 > Numexpr version: 2.0.1 (not using Intel's VML/MKL) > Zlib version: 1.2.5 (in Python interpreter) > LZO version: 2.06 (Aug 12 2011) > BZIP2 version: 1.0.6 (6-Sept-2010) > Blosc version: 1.1.2 (2010-11-04) > Cython version: 0.16 > Python version: 2.7.1 (r271:86832, Jul 31 2011, 19:30:53) > [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] > Platform: darwin-i386 > Byte-ordering: little > Detected cores: 2 > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Performing only a light (yet comprehensive) subset of the test suite. > If you want a more complete test, try passing the --heavy flag to this > script > (or set the 'heavy' parameter in case you are using tables.test() call). > The whole suite will take more than 4 hours to complete on a relatively > modern CPU and around 512 MB of main memory. 
> > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > > .....................................................................................................................................FSegmentation > fault: 11 > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > |
From: David D. <don...@gm...> - 2012-06-16 00:14:39
|
Hi Everyone, I am having problems running the tests for PyTables on Mac OS X Lion. I have tried HDF5 version 1.8.5 as well, but I still get the same issue. Any thoughts would be helpful... Thanks for any help you can provide. Best Regards, David Donovan Python 2.7.1 (r271:86832, Jul 31 2011, 19:30:53) [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import tables >>> tables.test() -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= PyTables version: 2.3.1 HDF5 version: 1.8.9 NumPy version: 1.7.0.dev-0c5f480 Numexpr version: 2.0.1 (not using Intel's VML/MKL) Zlib version: 1.2.5 (in Python interpreter) LZO version: 2.06 (Aug 12 2011) BZIP2 version: 1.0.6 (6-Sept-2010) Blosc version: 1.1.2 (2010-11-04) Cython version: 0.16 Python version: 2.7.1 (r271:86832, Jul 31 2011, 19:30:53) [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] Platform: darwin-i386 Byte-ordering: little Detected cores: 2 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Performing only a light (yet comprehensive) subset of the test suite. If you want a more complete test, try passing the --heavy flag to this script (or set the 'heavy' parameter in case you are using tables.test() call). The whole suite will take more than 4 hours to complete on a relatively modern CPU and around 512 MB of main memory. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= .....................................................................................................................................FSegmentation fault: 11 |
From: Anthony S. <sc...@gm...> - 2012-06-14 22:28:41
|
On Thu, Jun 14, 2012 at 4:30 PM, Andre' Walker-Loud <wal...@gm...>wrote: > Hi Anthony, > > On Jun 14, 2012, at 11:30 AM, Anthony Scopatz wrote: > > > On Wed, Jun 13, 2012 at 8:23 PM, Andre' Walker-Loud <wal...@gm...> > wrote: > > Hi All, > > > > Still trying to sort out a recursive walk through an hdf5 file using > pytables. > > > > I have an hdf5 file with an unknown depth of groups/nodes. > > > > I am trying to write a little function to walk down the tree (with user > input help) until a data file is found. > > > > I am hoping there is some function one can use to query whether you have > found simply a group/node or an actual numpy array of data. So I can do > something like > > > > if f.getNode('/',some_path) == "data_array": > > return f.getNode('/',some_path), True > > else: > > return f.getNode('/',some_path), False > > > > where I have some function that if the second returned variable is True, > will recognize the file as data, where as if it is False, it will query the > user for a further path down the tree. > > > > > > I suppose I could set this up with a try: except: but was hoping there > is some built in functionality to handle this. > > > > Yup, I think that you are looking for the File.walkNodes() method. > http://pytables.github.com/usersguide/libref.html#tables.File.walkNodes > > I wasn't sure how to use walkNodes in an interactive search. Here is what > I came up with so far (it works on test cases I have given it). Comments > are welcome. > > One feature I would like to add to the while loop in the second function > is an iterator counting the depth of the search. I want to compare this to > the maximum tree/node/group depth in the file, so if the search goes over > (maybe my collaborators used createTable instead of createArray) the while > loop won't run forever. > > Is there a function to ask the deepest recursion into the hdf5 file? 
> Hello Andre, Every Node object has a _v_depth attr that you can access ( http://pytables.github.com/usersguide/libref.html#tables.Node._v_depth). In your walk function, therefore, you could test to see if you are over or under the maximal value that you set. > > Cheers, > > Andre > > > def is_array(file,path): > data = file.getNode(path) > if str(type(data)) == "<class 'tables.array.Array'>": > found_array = True > Just one quick comment. You probably shouldn't test the string of the type of the data. Use the builtin isinstance() instead: found_array = isinstance(data, tables.Array) Be Well Anthony > else: > found_array = False > for g in file.getNode(path): > print g > return data, found_array > > def pytable_walk(file): > found_data = False > path = '' > while found_data == False: > for g in file.getNode('/',path): > print g > path_new = raw_input('which node would you like?\n ') > path = path+'/'+path_new > data,found_data = is_array(file,path) > return path,data > > > > > > > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > |
From: Andre' Walker-L. <wal...@gm...> - 2012-06-14 21:30:13
|
Hi Anthony, On Jun 14, 2012, at 11:30 AM, Anthony Scopatz wrote: > On Wed, Jun 13, 2012 at 8:23 PM, Andre' Walker-Loud <wal...@gm...> wrote: > Hi All, > > Still trying to sort out a recursive walk through an hdf5 file using pytables. > > I have an hdf5 file with an unknown depth of groups/nodes. > > I am trying to write a little function to walk down the tree (with user input help) until a data file is found. > > I am hoping there is some function one can use to query whether you have found simply a group/node or an actual numpy array of data. So I can do something like > > if f.getNode('/',some_path) == "data_array": > return f.getNode('/',some_path), True > else: > return f.getNode('/',some_path), False > > where I have some function that if the second returned variable is True, will recognize the file as data, where as if it is False, it will query the user for a further path down the tree. > > > I suppose I could set this up with a try: except: but was hoping there is some built in functionality to handle this. > > Yup, I think that you are looking for the File.walkNodes() method. http://pytables.github.com/usersguide/libref.html#tables.File.walkNodes I wasn't sure how to use walkNodes in an interactive search. Here is what I came up with so far (it works on test cases I have given it). Comments are welcome. One feature I would like to add to the while loop in the second function is an iterator counting the depth of the search. I want to compare this to the maximum tree/node/group depth in the file, so if the search goes over (maybe my collaborators used createTable instead of createArray) the while loop won't run forever. Is there a function to ask the deepest recursion into the hdf5 file? 
Cheers, Andre def is_array(file,path): data = file.getNode(path) if str(type(data)) == "<class 'tables.array.Array'>": found_array = True else: found_array = False for g in file.getNode(path): print g return data, found_array def pytable_walk(file): found_data = False path = '' while found_data == False: for g in file.getNode('/',path): print g path_new = raw_input('which node would you like?\n ') path = path+'/'+path_new data,found_data = is_array(file,path) return path,data |
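A compact variant of the walk above that runs without PyTables or an HDF5 file: nested dicts stand in for Groups, lists stand in for Arrays, and an explicit depth guard plays the role Andre wants a maximum `_v_depth` for. Everything here (function name, MAX_DEPTH value, sample tree) is illustrative; in real code the leaf test would be `isinstance(node, tables.Array)` as suggested in the thread.

```python
# Illustrative sketch of the interactive-walk pattern discussed above,
# with nested dicts in place of PyTables Groups and lists in place of
# Arrays so it runs standalone.

MAX_DEPTH = 32  # guard so a walk over unexpected node types cannot loop forever

def find_leaf(tree, path):
    """Follow `path` (e.g. '/run1/corr') down `tree`; return (node, is_leaf)."""
    node = tree
    for depth, part in enumerate(path.strip('/').split('/'), start=1):
        if not part:
            continue  # path was just '/'
        if depth > MAX_DEPTH:
            raise RuntimeError("exceeded maximum tree depth")
        node = node[part]
    # In PyTables this would be: isinstance(node, tables.Array)
    return node, not isinstance(node, dict)

tree = {'run1': {'corr': [1.0, 2.0, 3.0]}}
node, is_leaf = find_leaf(tree, '/run1')       # a "group": is_leaf is False
data, is_leaf = find_leaf(tree, '/run1/corr')  # a "leaf": is_leaf is True
```

The same loop shape drops into `pytable_walk()` above: test each user-selected node with `isinstance`, and bail out once the accumulated depth exceeds the cap instead of looping forever.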
From: Anthony S. <sc...@gm...> - 2012-06-14 18:30:35
|
On Wed, Jun 13, 2012 at 8:23 PM, Andre' Walker-Loud <wal...@gm...>wrote: > Hi All, > > Still trying to sort out a recursive walk through an hdf5 file using > pytables. > > I have an hdf5 file with an unknown depth of groups/nodes. > > I am trying to write a little function to walk down the tree (with user > input help) until a data file is found. > > I am hoping there is some function one can use to query whether you have > found simply a group/node or an actual numpy array of data. So I can do > something like > > if f.getNode('/',some_path) == "data_array": > return f.getNode('/',some_path), True > else: > return f.getNode('/',some_path), False > > where I have some function that if the second returned variable is True, > will recognize the file as data, where as if it is False, it will query the > user for a further path down the tree. > > > I suppose I could set this up with a try: except: but was hoping there is > some built in functionality to handle this. > Yup, I think that you are looking for the File.walkNodes() method. http://pytables.github.com/usersguide/libref.html#tables.File.walkNodes Be Well Anthony > > > Thanks, > > Andre > > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > |
From: Andre' Walker-L. <wal...@gm...> - 2012-06-14 01:24:07
|
Hi All, Still trying to sort out a recursive walk through an hdf5 file using pytables. I have an hdf5 file with an unknown depth of groups/nodes. I am trying to write a little function to walk down the tree (with user input help) until a data file is found. I am hoping there is some function one can use to query whether you have found simply a group/node or an actual numpy array of data. So I can do something like if f.getNode('/',some_path) == "data_array": return f.getNode('/',some_path), True else: return f.getNode('/',some_path), False where I have some function that if the second returned variable is True, will recognize the file as data, whereas if it is False, it will query the user for a further path down the tree. I suppose I could set this up with a try: except: but was hoping there is some built-in functionality to handle this. Thanks, Andre |
From: Aquil H. A. <aqu...@gm...> - 2012-06-12 14:46:57
|
Hello Anthony, Thanks for the reply! I was confused about the differences between in-memory objects and file attributes. I was able to add an EntitiyInfo class as an HDF5 file attribute and the information persisted. class EntityInfo(object): last_update = '2012-06-11T00:00:00Z' e_info = EntityInfo() h5file.root.data.dataset1.e_info = e_info ... h5file.close() And when I opened the file again I was able to access the e_info object on the table. On Mon, Jun 11, 2012 at 3:47 PM, Anthony Scopatz <sc...@gm...> wrote: > On Mon, Jun 11, 2012 at 2:00 PM, Aquil H. Abdullah < > aqu...@gm...> wrote: > >> Hello All, >> >> I've recently started using PyTables and I am very excited about it's >> speed and ease of use for large datasets, however, I have a problem that I >> have not been able to solve with regards to user defined table attributes. >> >> I have a table that contains observations about of entities that can be >> classified as different types. The timestamp for the last observation of >> these entities may be different. For processing, this table I would like to >> be able to determine the timestamp of the last observation for each of >> these entities. The problem is easy as long as I know the entity types. >> For example: >> >> import tables >> h5file = tables.openFile('data.h5',mode='r+') >> tbl = h5file.getNode('/series','data1') >> last_obs = max(x['timestamp'] for x in tbl.where("""entity_type=='e1'""")) >> >> However, my problems is that as I read from my source I may not always >> know the entity type before hand. I was going to add a last_observation >> attribute to my table, however, I found the link >> https://github.com/PyTables/PyTables/issues/145, which says that >> attributes aren't persistent. >> > > Hello Aquil, > > This issue only applies to instance attrs on the in-memory object. > > >> So I have two questions: >> >> 1. Are there any user-defined attributes that are persistent? >> > > Yes, these are the HDF5 attributes of a node. 
You have to access them > through the "attrs" namespace. To use your example above: > > tbl.attrs.last_obs = 42.0 > > See > http://pytables.github.com/usersguide/libref.html?highlight=attrs#the-attributeset-class for > more info. > > >> 2. Does anyone have any other suggestions? Besides separating the >> entities into separate tables where I could then just do a max on the >> timestamp field/col? >> > > You could also use numpy.unique() to figure out the entity values and > then and itertools.groupby() to separate the data out. (groupby might not > be the fastest thing to do here.) Or just use the where() method from > above for each entity. The point is that you want the unique of the entity > type column only: > > entity_types = np.unique(tbl.cols.entity_type) > > Another thing is that if the times are roughly chronological, and entities > are evenly dispersed, you could probably get away with only reading in the > end of the table and make things faster: > > entity_types = np.unique(tbl.cols.entity_type[-100:]) > > I hope this helps! > Be Well > Anthony > > >> -- >> Aquil H. Abdullah >> aqu...@gm... >> >> >> ------------------------------------------------------------------------------ >> Live Security Virtual Conference >> Exclusive live event will cover all the ways today's security and >> threat landscape has changed and how IT managers can respond. Discussions >> will include endpoint security, mobile security and the latest in malware >> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >> _______________________________________________ >> Pytables-users mailing list >> Pyt...@li... >> https://lists.sourceforge.net/lists/listinfo/pytables-users >> >> > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. 
Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > > -- Aquil H. Abdullah aqu...@gm... |
From: Anthony S. <sc...@gm...> - 2012-06-11 19:48:03
|
On Mon, Jun 11, 2012 at 2:00 PM, Aquil H. Abdullah <aqu...@gm... > wrote: > Hello All, > > I've recently started using PyTables and I am very excited about it's > speed and ease of use for large datasets, however, I have a problem that I > have not been able to solve with regards to user defined table attributes. > > I have a table that contains observations about of entities that can be > classified as different types. The timestamp for the last observation of > these entities may be different. For processing, this table I would like to > be able to determine the timestamp of the last observation for each of > these entities. The problem is easy as long as I know the entity types. > For example: > > import tables > h5file = tables.openFile('data.h5',mode='r+') > tbl = h5file.getNode('/series','data1') > last_obs = max(x['timestamp'] for x in tbl.where("""entity_type=='e1'""")) > > However, my problems is that as I read from my source I may not always > know the entity type before hand. I was going to add a last_observation > attribute to my table, however, I found the link > https://github.com/PyTables/PyTables/issues/145, which says that > attributes aren't persistent. > Hello Aquil, This issue only applies to instance attrs on the in-memory object. > So I have two questions: > > 1. Are there any user-defined attributes that are persistent? > Yes, these are the HDF5 attributes of a node. You have to access them through the "attrs" namespace. To use your example above: tbl.attrs.last_obs = 42.0 See http://pytables.github.com/usersguide/libref.html?highlight=attrs#the-attributeset-class for more info. > 2. Does anyone have any other suggestions? Besides separating the entities > into separate tables where I could then just do a max on the timestamp > field/col? > You could also use numpy.unique() to figure out the entity values and then and itertools.groupby() to separate the data out. (groupby might not be the fastest thing to do here.) 
Or just use the where() method from above for each entity. The point is that you want the unique of the entity type column only: entity_types = np.unique(tbl.cols.entity_type) Another thing is that if the times are roughly chronological, and entities are evenly dispersed, you could probably get away with only reading in the end of the table and make things faster: entity_types = np.unique(tbl.cols.entity_type[-100:]) I hope this helps! Be Well Anthony > -- > Aquil H. Abdullah > aqu...@gm... > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > > |
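The "unknown entity types" case in this thread can also be handled in one pass: keep a running maximum timestamp per entity type, so no list of types is needed up front. The sketch below uses plain tuples in place of table rows; with PyTables the loop could be fed from `table.iterrows()` instead. The function name and sample data are invented for illustration.

```python
# One pass over (entity_type, timestamp) rows, keeping the latest
# timestamp seen for each entity type. Equivalent in effect to
# numpy.unique() on the entity column followed by a per-type max.

def last_observation(rows):
    latest = {}
    for entity_type, timestamp in rows:
        if entity_type not in latest or timestamp > latest[entity_type]:
            latest[entity_type] = timestamp
    return latest

rows = [('e1', 10), ('e2', 7), ('e1', 15), ('e3', 3), ('e2', 12)]
print(last_observation(rows))  # {'e1': 15, 'e2': 12, 'e3': 3}
```

The result could then be stored persistently via the node's `attrs` namespace as Anthony suggests, e.g. one attribute per entity type.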
From: Aquil H. A. <aqu...@gm...> - 2012-06-11 19:00:25
|
Hello All, I've recently started using PyTables and I am very excited about its speed and ease of use for large datasets; however, I have a problem that I have not been able to solve with regards to user-defined table attributes. I have a table that contains observations about entities that can be classified as different types. The timestamp for the last observation of these entities may be different. For processing this table, I would like to be able to determine the timestamp of the last observation for each of these entities. The problem is easy as long as I know the entity types. For example: import tables h5file = tables.openFile('data.h5',mode='r+') tbl = h5file.getNode('/series','data1') last_obs = max(x['timestamp'] for x in tbl.where("""entity_type=='e1'""")) However, my problem is that as I read from my source I may not always know the entity type beforehand. I was going to add a last_observation attribute to my table; however, I found the link https://github.com/PyTables/PyTables/issues/145, which says that attributes aren't persistent. So I have two questions: 1. Are there any user-defined attributes that are persistent? 2. Does anyone have any other suggestions? Besides separating the entities into separate tables, where I could then just do a max on the timestamp field/col? -- Aquil H. Abdullah aqu...@gm... |
From: Andre' Walker-L. <wal...@gm...> - 2012-06-08 06:49:53
|
Hi All,

I have a question about reading through a file in a smart way. I am trying to
write a generic executable that reads a data file and, with user input, grabs
a certain array from an HDF5 file:

    ./my_data_analyzer -i my_data.h5 -other_args

The structure of the files is

    f = tables.openFile('my_data.h5')
    f.root.childNode.childNode....dataArray

where I do not know ahead of time how deep the childNodes go. My general
strategy was to query the user (me and a colleague) for the particular array
we would like to analyze. So I was thinking of using raw_input to ask which
node the user wants, while providing a list of possibilities:

    for node in f.walkNodes():
        print node

If the user provides the full path to the data array, read it as a numpy
array; but if the user provides only a partial path (maybe just the top-level
group), query further until the user finds the full path, and finally read it
as a numpy array.

Also, given the data files, some of the groups (childNodes) likely won't obey
the NaturalName conventions - in case that matters.

1) Is my goal clear?
2) Does anyone have a clever way to do this? Some sort of recursive query with
   raw_input to get the desired data.

Thanks,
Andre
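[Editor's note: a runnable sketch of the narrowing step described above. The
list of paths is hand-made and stands in for
[node._v_pathname for node in f.walkNodes()]; the narrow() helper is a
hypothetical name for illustration, not a PyTables API. In an interactive
script, raw_input would supply the partial path and the query would repeat
until exactly one match remains.]

```python
def narrow(paths, partial):
    """Return the node paths that begin with the user's (possibly
    partial) answer; querying repeats until one match remains."""
    return [p for p in paths if p.startswith(partial)]

# Hand-made stand-in for [node._v_pathname for node in f.walkNodes()]:
paths = ['/', '/run1', '/run1/raw', '/run1/raw/dataArray', '/run2']

print(narrow(paths, '/run1'))       # still three candidates, ask again
print(narrow(paths, '/run1/raw/'))  # exactly one: the array to read
```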
From: Anthony S. <sc...@gm...> - 2012-06-06 18:53:30
|
On Wed, Jun 6, 2012 at 11:18 AM, Jon Wilson <js...@fn...> wrote:

> Hi Anthony,
>
>> Well I think the issue at hand is that you are trying to support two
>> disparate cases with one expression: sparse and dense selection. We have
>> tools for dealing with these cases individually and performing
>> out-of-core calculations. And if you know a priori which case you are
>> going to fall into, you can do the right thing. So without doing
>> anything special, I think medium-fast is probably the best and easiest
>> thing that you can expect right now. (Though I would be delighted to be
>> proved wrong on this point.)
>
> True enough. Sometimes I can't know anything a priori about the density of
> the selection, of course. And I'm happy to worry about the internals, but
> some of my colleagues, not so much ;)

I fully understand your position!

>>> but I think the ideal would be to have a .where type query operator
>>> that returns Column objects or a Table object, with a "view" imposed in
>>> either case.
>>
>> We are very open to pull requests if you come up with an implementation
>> of this that you like more ;).
>
> Very fair. We'll see if I can get to it. Is there any sort of
> guide-to-the-source to help me get started when and if that happens? I
> guess just the reference guide?

Not really as such. There is this list, and then there is
pyt...@go... for development-specific questions and issues.

> I'll have a lot to learn before I can contribute usefully, I'm sure. I
> don't know enough about the implementation details yet to know: would a
> selection make the out-of-core performance gains from chunking and other
> things moot because you'd have to skip around too much?

Basically, I would say that if you were to write a function that does what
you described, we could take care of / help with the PyTables integration
issues. You could start by looking at the code for Expr [1] and seeing if
you could track down the one-field issue.
Or maybe there is a way to add this functionality easily... Don't hesitate to
ask if you run into problems.

Be Well
Anthony

> Regards,
> Jon
From: Jon W. <js...@fn...> - 2012-06-06 16:18:33
|
Hi Anthony,

> Well I think the issue at hand is that you are trying to support two
> disparate cases with one expression: sparse and dense selection. We have
> tools for dealing with these cases individually and performing out-of-core
> calculations. And if you know a priori which case you are going to fall
> into, you can do the right thing. So without doing anything special, I
> think medium-fast is probably the best and easiest thing that you can
> expect right now. (Though I would be delighted to be proved wrong on this
> point.)

True enough. Sometimes I can't know anything a priori about the density of
the selection, of course. And I'm happy to worry about the internals, but
some of my colleagues, not so much ;)

>> but I think the ideal would be to have a .where type query operator that
>> returns Column objects or a Table object, with a "view" imposed in
>> either case.
>
> We are very open to pull requests if you come up with an implementation of
> this that you like more ;).

Very fair. We'll see if I can get to it. Is there any sort of
guide-to-the-source to help me get started when and if that happens? I guess
just the reference guide?

I'll have a lot to learn before I can contribute usefully, I'm sure. I don't
know enough about the implementation details yet to know: would a selection
make the out-of-core performance gains from chunking and other things moot
because you'd have to skip around too much?

Regards,
Jon
From: Anthony S. <sc...@gm...> - 2012-06-06 16:07:49
|
On Wed, Jun 6, 2012 at 10:24 AM, Jon Wilson <js...@fn...> wrote:

> Hi Anthony,
>
> On 06/06/2012 12:45 AM, Anthony Scopatz wrote:
>
>>> I think something like
>>>
>>>     histogram(tables.Expr('col0 + col1**2',
>>>               mytable.where('col2 > 15 & abs(col3) < 5')).eval())
>>>
>>> would be ideal, but since where() returns a row iterator, and not
>>> something that I can extract Column objects from, I don't see any way
>>> to make it work.
>>
>> You are probably looking for the readWhere() method
>> <http://pytables.github.com/usersguide/libref.html#tables.Table.readWhere>,
>> which normally returns a numpy structured array. The line you are
>> looking for is thus:
>>
>>     histogram(tables.Expr('col0 + col1**2',
>>               mytable.readWhere('col2 > 15 & abs(col3) < 5')).eval())
>>
>> This will likely be fast in both cases. I hope this helps.
>
> Oddly, it doesn't work with tables.Expr, but does work with
> numexpr.evaluate. In the case I talked about before with 7M rows, when
> selecting very few rows it does just fine (between the other two
> solutions), but when selecting all rows it is still about 2.75x slower
> than the technique of using tables.Expr for both the histogram var and
> the condition.
>
> I think that this is because .readWhere() pulls all the table rows
> satisfying the where condition into memory first, and it furthermore does
> so for all columns of all selected rows, so, for a table with many
> columns, it has to read many times as much data into memory.

Yes, that is correct; it does have to read the data into memory.

> I can use the field parameter, but it only accepts one single field, so I
> would have to perform the query once per variable used in the histogram
> variable expression to do that.
>
> Using .readWhere() gives medium-fast performance in both cases, but I
> still feel like it is not quite the right thing because it reads the data
> completely into memory instead of allowing the computation to be
> performed out-of-core. Perhaps it is not really feasible,

Well I think the issue at hand is that you are trying to support two
disparate cases with one expression: sparse and dense selection. We have
tools for dealing with these cases individually and performing out-of-core
calculations. And if you know a priori which case you are going to fall
into, you can do the right thing. So without doing anything special, I
think medium-fast is probably the best and easiest thing that you can expect
right now. (Though I would be delighted to be proved wrong on this point.)

> but I think the ideal would be to have a .where type query operator that
> returns Column objects or a Table object, with a "view" imposed in either
> case.

We are very open to pull requests if you come up with an implementation of
this that you like more ;).

Be Well
Anthony

> Regards,
> Jon
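[Editor's note: a runnable sketch of the readWhere() pattern discussed in this
thread: read the rows matching the condition into memory as a structured
array, then evaluate the histogram variable on just those rows. Plain NumPy
boolean indexing on a hand-made array stands in for Table.readWhere(); the
column names and expressions follow the thread's example, the data values are
made up.]

```python
import numpy as np

# Hand-made structured array standing in for the table's rows:
rows = np.zeros(6, dtype=[('col0', 'f8'), ('col1', 'f8'),
                          ('col2', 'f8'), ('col3', 'f8')])
rows['col0'] = [1, 2, 3, 4, 5, 6]
rows['col1'] = [0, 1, 2, 3, 4, 5]
rows['col2'] = [10, 20, 5, 30, 16, 2]
rows['col3'] = [0, 1, 9, -2, 3, 0]

# In-memory equivalent of tbl.readWhere('(col2 > 15) & (abs(col3) < 5)'):
sel = rows[(rows['col2'] > 15) & (np.abs(rows['col3']) < 5)]

# The histogram variable, col0 + col1**2, computed on the selected rows only:
values = sel['col0'] + sel['col1'] ** 2
print(values)  # rows 1, 3 and 4 pass the cut
```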