From: Cyril G. <cyr...@fr...> - 2005-09-13 09:06:56
Francesc Altet wrote:

> Yes. I've done some research and found this:
>
> http://hdf.ncsa.uiuc.edu/RFC/H5DimScales/
>
> So, it seems that what you are asking for are Dimension Scales. Well,
> the good news is that the HDF group is working on implementing this.
> Once their job is done, pytables can use them in a much more
> portable way (i.e. compatible with all HDF5 apps).

Yes, it seems to be the solution. Apparently there is also an API, and
perhaps a stable one:

http://hdf.ncsa.uiuc.edu/HDF5/hdf5_hl/doc/

You know, I agree with you: "tables" could be (is) the right way to work
with linked arrays, even with structured grids.
But people in the structured-grid world are not used to working
like this. Will they all agree to change their minds?

I can't wait for this feature in pytables.

Cyril.
From: Francesc A. <fa...@ca...> - 2005-09-12 19:18:51
==========================
 Announcing PyTables 1.1.1
==========================

This is a maintenance release of PyTables. In it, several
optimizations and bug fixes have been made. As some of the fixed bugs
were quite important, users are strongly encouraged to upgrade.

Go to the PyTables web site for downloading the beast:

http://pytables.sourceforge.net/

or keep reading for more info about the improvements and bugs fixed.

Changes more in depth
=====================

Improvements:

- Optimized the opening of files with a large number of objects. Now,
  files with table objects open 50% faster, and files with arrays
  open more than twice as fast (up to 2000 objects/s on a Pentium
  4@2GHz). Hence, a file with a combination of both kinds of objects
  opens between 50% and 100% faster than with 1.1.

- Optimized the creation of NestedRecArray objects using NumArray
  objects as columns, so that filling a table with the Table.append()
  method achieves performance similar to PyTables pre-1.1 releases.

Bug fixes:

- ``Table.readCoordinates()`` now converts the coords parameter into
  ``Int64`` indices automatically.

- Fixed a bug that prevented appending to tables (through Table.append)
  using a list of numarray objects.

- Solved a small bug when creating indexes for the first time in
  tables, so that the filter properties are retained for later use.

- Int32 attributes are now handled correctly on 64-bit platforms.

- Correction for accepting lists of numarrays as input for
  NestedRecArrays.

- Fixed a problem creating rank 1 multi-dimensional string columns in
  ``Table`` objects. Closes SF bug #1269023.

- Avoid errors when unpickling objects stored in attributes. See the
  section ``AttributeSet`` in the reference chapter of the User's
  Manual for more information. Closes SF bug #1254636.

- Assignment for *Array slices has been improved in order to solve
  some issues with shapes. Closes SF bug #1288792.

Known bugs:

- Classes inheriting from IsDescription subclasses do not inherit
  columns defined in the super-class. See SF bug #1207732 for more
  info.

- Time datatypes are non-portable between big-endian and little-endian
  architectures. This is ultimately a consequence of an HDF5
  limitation. See SF bug #1234709 for more info.

Backward-incompatible changes:

- None.

Important note for MacOSX users
===============================

The UCL compressor seems to work badly on MacOSX platforms. Until the
problem is isolated and eventually solved, UCL will not be compiled by
default on MacOSX platforms, even if the installer finds it in the
system. However, if you still want to get UCL support on MacOSX, you
can use the --force-ucl flag in setup.py.

Important note for Python 2.4 and Windows users
===============================================

If you want to use PyTables with Python 2.4 on Windows platforms,
you will need to get the HDF5 library compiled for MSVC 7.1, aka .NET
2003. It can be found at:

ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-164-win-net.ZIP

Users of Python 2.3 on Windows will have to download the version of
HDF5 compiled with MSVC 6.0, available at:

ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-164-win.ZIP

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may
have.

----

  **Enjoy data!**

  -- The PyTables Team

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: <phi...@ho...> - 2005-09-12 10:17:11
Hi Francesc,

I have a silly question about creating HDF5 elements with é, è, à
characters.
For the moment, the file is created successfully, but the element with
accented characters doesn't appear in the file:

group = h5file.createGroup("/", 'détéctor', 'Détector information')

I tried with:

# -*- coding: "iso-8859-1" -*-

and with creating a sitecustomize.py file containing:

import sys
sys.setdefaultencoding('iso-8859-1')

But the problem isn't solved.

Are these kinds of characters allowed in the pytables API?

Regards,

Philippe Collet

P.S.: I have some work to do before trying to implement what we planned in
our posts "Suggestions for storing mutlidimensionnal data in several
pytables arrays", but as soon as I'm ready, I'll do it.
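This thread does not record whether accented node names are supported. As a conservative workaround sketch, assuming the accents are what keeps the group from appearing, the node name can be reduced to plain ASCII before it is passed on, while the accented text is kept separately for display. The helper `ascii_node_name` is hypothetical and is not part of the PyTables API.

```python
import unicodedata


def ascii_node_name(label):
    """Return an ASCII-only variant of label (hypothetical helper, not PyTables API)."""
    # NFKD decomposition splits 'é' into 'e' plus a combining accent mark.
    decomposed = unicodedata.normalize("NFKD", label)
    # Encoding with errors="ignore" then drops the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


name = ascii_node_name("détéctor")
print(name)
```

The resulting name could then be used where the original message passes 'détéctor'; whether the title argument accepts accented text is likewise not confirmed here.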
From: Francesc A. <fa...@ca...> - 2005-09-09 16:01:19
Hi List,

The next public release of PyTables, namely 1.1.1, has been made
available at:

http://www.carabos.com/downloads/pytables/preliminary/

(no binary versions for Windows available yet)

Look at the announcement notes below. If I don't hear of any problems
with it for a while, I'll announce it more widely.

Also, a new beta version of the 1.2 release (beta 2) is available
in case anybody wants to give it a try. It seems that the Windows
issues have been resolved. It is available at the URL stated above.

Enjoy!

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

==========================
 Announcing PyTables 1.1.1
==========================

This is a maintenance release of PyTables. In it, several
optimizations and bug fixes have been made. As some of the fixed bugs
are quite important, users are strongly encouraged to upgrade.

Go to the PyTables web site for downloading the beast:

http://pytables.sourceforge.net/

or keep reading for more info about the improvements and bugs fixed.

Changes more in depth
=====================

Improvements:

- Optimized the opening of files with a large number of objects. Now,
  files with table objects open 50% faster, and files with arrays
  open more than twice as fast (up to 2000 objects/s on a Pentium
  4@2GHz). Hence, a file with a combination of both kinds of objects
  opens between 50% and 100% faster than with 1.1.

- Optimized the creation of NestedRecArray objects using NumArray
  objects as columns, so that filling a table with the Table.append()
  method achieves performance similar to PyTables pre-1.1 releases.

Backward-incompatible changes:

- None.

Bug fixes:

- ``Table.readCoordinates()`` now converts the coords parameter into
  ``Int64`` indices automatically.

- Fixed a bug that prevented appending to tables (through Table.append)
  using a list of numarray objects.

- Solved a small bug when creating indexes for the first time in
  tables, so that the filter properties are retained for later use.

- Int32 attributes are now handled correctly on 64-bit platforms.

- Correction for accepting lists of numarrays as input for
  NestedRecArrays.

- Fixed a problem creating rank 1 multi-dimensional string columns in
  ``Table`` objects. Closes SF bug #1269023.

- Avoid errors when unpickling objects stored in attributes. See the
  section ``AttributeSet`` in the reference chapter of the User's
  Manual for more information. Closes SF bug #1254636.

Known bugs:

- Classes inheriting from IsDescription subclasses do not inherit
  columns defined in the super-class. See SF bug #1207732 for more
  info.

- Time datatypes are non-portable between big-endian and little-endian
  architectures. This is ultimately a consequence of an HDF5
  limitation. See SF bug #1234709 for more info.

----

  **Enjoy data!**

  -- The PyTables Team
From: Francesc A. <fa...@ca...> - 2005-09-09 15:29:47
On Friday 09 September 2005 17:14, Nicolas Girard wrote:
> thanks very much for your wise question: it was the cause of my troubles.
> In my ~/.matplotlib/matplotlibrc I replaced
>
> numerix : Numeric
>
> with
>
> numerix : numarray
>
> and everything went fine.

Great!

> Pfieeew... that's a big relief to me!

To me too ;)

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Nicolas G. <nic...@ne...> - 2005-09-09 15:15:00
On Friday 09 September 2005 12:23, Francesc Altet wrote:
>
> Mmm..., that's very weird because you printed the results before and
> they looked similar. Just in case, have you configured your
> environment (I mean matplotlib or scipy, if applicable) so that it
> expects numarray objects instead of expecting Numeric objects?

Francesc,

thanks very much for your wise question: it was the cause of my troubles.
In my ~/.matplotlib/matplotlibrc I replaced

numerix : Numeric

with

numerix : numarray

and everything went fine.

Pfieeew... that's a big relief to me!
Thanks again for your support!

cheers
Nicolas
From: Francesc A. <fa...@ca...> - 2005-09-09 10:23:26
On Thursday 08 September 2005 18:08, Nicolas Girard wrote:
> BUT the big trouble occurs when accessing several elements:
>
> In [205]:mean(d[1,:,0])
> Out[205]:1.2737342119216919
>
> In [206]:mean(d2[1,:,0])
> Out[206]:0.48158463835716248

By the way, do you know which one is the correct one?

> Also when plotting the data using matplotlib, the results are absolutely
> different:
>
> In [202]:plot(d[1,0:5,0])
> Out[202]:[<matplotlib.lines.Line2D instance at 0x99d932c>]
> --> http://nicolasgirard.net2.nerim.net/img/plot_d.png
>
> In [203]:clf()
>
> In [204]:plot(d2[1,0:5,0])
> Out[204]:[<matplotlib.lines.Line2D instance at 0x991958c>]
> --> http://nicolasgirard.net2.nerim.net/img/plot_d2.png

Mmm..., that's very weird because you printed the results before and
they looked similar. Just in case, have you configured your
environment (I mean matplotlib or scipy, if applicable) so that it
expects numarray objects instead of expecting Numeric objects?

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Francesc A. <fa...@ca...> - 2005-09-09 09:59:50
On Friday 09 September 2005 10:21, Cyril Giraudon wrote:
> That seems very good news for simulation methods based upon a
> structured grid (i, j, k) mesh, like finite differences in the time domain.
> For instance in electromagnetism, people usually store the matrices
> X[i],
> Y[j],
> Z[k],
> Ex[i,j,k],
> Ey[i,j,k],
> Ez[i,j,k],
> Hx[i,j,k],
> Hy[i,j,k],
> Hz[i,j,k]
> independently.
>
> Why? Because during the computation, there is no need for X, Y, Z.

All right, but if you include such information in tables *and* use
compression, your datasets will not grow as much as you might
think. I'd recommend giving this a try. You may be (positively)
surprised.

> In pytables, one can write these matrices but there is no standard way
> to say:
> The first dimension of Ex is linked to X, the second to Y and the third
> to Z.

Yes. I've done some research and found this:

http://hdf.ncsa.uiuc.edu/RFC/H5DimScales/

So, it seems that what you are asking for are Dimension Scales. Well,
the good news is that the HDF group is working on implementing this.
Once their job is done, pytables can use them in a much more
portable way (i.e. compatible with all HDF5 apps).

> We could even imagine:
> The first dimension of Ex is linked to X[i1:i2:i3] (begin, end,
> step) and so on...

According to the design document on dimension scales above, you can go
further and assign several (not just one) scales to a dimension, share
scales among different dimensions (even on the same dataset), and even
defining functions as scales would be possible. I don't know how
difficult defining scale functions would be in compiled languages like
C or Fortran, but given the interpreted nature of Python it would be
relatively easy.

> And that's the point: linking matrix sides to some array descriptions.
> "netcdf" allows such a feature but netcdf is useless with python,
> and I like pytables, "very good job".
> It's the way to do simulation with scripting languages.

Thanks for the compliments :)

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
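To make the idea of slice-backed and function-backed scales concrete, here is a minimal plain-Python sketch. It does not use the HDF5 Dimension Scales API or PyTables; the class `DimensionScale` and all values are hypothetical illustrations of "index along a dimension maps to a coordinate".

```python
class DimensionScale:
    """Hypothetical helper: maps an index along one axis to a coordinate."""

    def __init__(self, values=None, func=None):
        # Exactly one backing source is allowed: stored values or a function.
        if (values is None) == (func is None):
            raise ValueError("give exactly one of values or func")
        self.values = values
        self.func = func

    def __call__(self, index):
        if self.func is not None:
            return self.func(index)
        return self.values[index]


X = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# A scale backed by a slice X[i1:i2:i3] (begin, end, step).
x_scale = DimensionScale(values=X[1:6:2])

# A scale defined by a function instead of stored values.
y_scale = DimensionScale(func=lambda j: 10.0 + 0.25 * j)

# Ordered list: entry n describes dimension n of a field array such as Ex.
ex_scales = [x_scale, y_scale]

print(x_scale(2))  # coordinate of index 2 along the first dimension
print(y_scale(4))  # coordinate of index 4 along the second dimension
```

Keeping the scales in an ordered list mirrors the request in the quoted message: the position in the list says which dimension a scale belongs to.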
From: Francesc A. <fa...@ca...> - 2005-09-09 09:24:56
On Thursday 08 September 2005 18:08, Nicolas Girard wrote:
> I hope you'll have a clue on how to solve this problem, because this is
> quite blocking (and tense) to me and I don't know much what to do
> currently...

Can you make your datafile accessible on the Internet so that we can have
a look at what is going on?

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Cyril G. <cyr...@fr...> - 2005-09-09 08:20:29
Hello World :-),

That seems very good news for simulation methods based upon a
structured grid (i, j, k) mesh, like finite differences in the time domain.
For instance in electromagnetism, people usually store the matrices
X[i],
Y[j],
Z[k],
Ex[i,j,k],
Ey[i,j,k],
Ez[i,j,k],
Hx[i,j,k],
Hy[i,j,k],
Hz[i,j,k]
independently.

Why? Because during the computation, there is no need for X, Y, Z.

In pytables, one can write these matrices but there is no standard way
to say:
The first dimension of Ex is linked to X, the second to Y and the third
to Z.

We could even imagine:
The first dimension of Ex is linked to X[i1:i2:i3] (begin, end,
step) and so on...

And that's the point: linking matrix sides to some array descriptions.
"netcdf" allows such a feature but netcdf is useless with python,
and I like pytables, "very good job".
It's the way to do simulation with scripting languages.

Thanks for your work.

Cyril.
From: Nicolas G. <nic...@ne...> - 2005-09-08 16:08:50
Hi all,

I'm experiencing big troubles when trying to read data from a dataset using
pytables.

The data was generated by a Fortran program and stored using the HDF5 API into
a dataset which has the following properties:

- nb of dimensions: 3
- dimension sizes: 10x1024x1025
- data type: 32-bit floating point
- chunking: 10x1024x1
- compression: shuffle: nbytes=4, gzip: level=1, allocation time: late

Let g be the group to which the dataset d belongs. Also, let d2 be a
supposedly identical copy of d:

In [182]:shape(d)
Out[182]:(10L, 1024L, 1025L)

In [183]:d2=d.read()

In [185]:shape(d2)
Out[185]:(10, 1024, 1025)

In [186]:type(d)
Out[186]:<class 'tables.EArray.EArray'>

In [187]:type(d2)
Out[187]:<class 'numarray.numarraycore.NumArray'>

If I access the 2 arrays element by element, everything seems to be
ok:

In [189]:d[1,0:5,0]
Out[189]:array([ 2.32650805, 2.32443571, 2.3218658 , 2.31947947, 2.31709099], type=Float32)

In [190]:d2[1,0:5,0]
Out[190]:array([ 2.32650805, 2.32443571, 2.3218658 , 2.31947947, 2.31709099], type=Float32)

BUT the big trouble occurs when accessing several elements:

In [205]:mean(d[1,:,0])
Out[205]:1.2737342119216919

In [206]:mean(d2[1,:,0])
Out[206]:0.48158463835716248

Also when plotting the data using matplotlib, the results are absolutely
different:

In [202]:plot(d[1,0:5,0])
Out[202]:[<matplotlib.lines.Line2D instance at 0x99d932c>]
--> http://nicolasgirard.net2.nerim.net/img/plot_d.png

In [203]:clf()

In [204]:plot(d2[1,0:5,0])
Out[204]:[<matplotlib.lines.Line2D instance at 0x991958c>]
--> http://nicolasgirard.net2.nerim.net/img/plot_d2.png

I hope you'll have a clue on how to solve this problem, because it is
really blocking me (and stressful) and I don't quite know what to do
at the moment...

Thanks in advance for any advice,
cheers,
Nicolas
From: Francesc A. <fa...@ca...> - 2005-09-07 16:30:19
On Wednesday 07 September 2005 17:12, phi...@ho... wrote:
> >Ok. I understand better now. With that, what you want is something
> >similar to:
> >
> >pressure.where(pressure.axis.longitude == 0.2 and
> >               pressure.axis.latitude == 0.45 )
> >
>
> That's exactly the idea behind all. And you go further with the
> suggestion about the direct mapping into indices. That could be very
> powerful.

Yes. I'm thinking that this kind of approach could let the user do:

pressure.where(pressure.axis.longitude > 0.2 and
               pressure.axis.latitude < 0.45 )

or

pressure.where(0.1 <= pressure.axis.longitude <= 0.2 or
               pressure.axis.latitude < 0.45 )

or any other boolean combination. In addition, as HDF5 natively supports
boolean combinations of hyperslabs, this kind of operation
will potentially be very I/O efficient.

> Thanks a lot for taking new improvements into account.
> Of course, I can give you a hand if you need help.
> Tell me when you start programming it, we can work together.
> If you don't have time, I will try soon to modify the pytables code with
> your permission.

Well, we are working on other things right now, but indeed I'd like to
have some spare time to implement this. Of course, you can always have
a deeper look into it, and if you end up with something I'll be more than
happy to include it in PyTables.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
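The `pressure.where(...)` calls above are a proposal, not an existing PyTables method. As a rough plain-Python sketch of how such conditions could be evaluated, assuming each axis array is sorted, every one-dimensional condition becomes an index range found by binary search, and the boolean combination is applied to those index sets. The sample grid uses the values given elsewhere in this thread; the helper `select` is hypothetical.

```python
from bisect import bisect_left, bisect_right

# Sample grid: pressure[i][j] belongs to latitude[i] and longitude[j].
longitude = [0.1, 0.2, 0.3, 0.4]
latitude = [0.25, 0.35, 0.45, 0.55]
pressure = [
    [1.0, 1.25, 1.78, 1.4],
    [3.1, 2.4, 1.89, 8.2],
    [4.5, 7.9, 4.63, 1.0],
    [1.2, 3.7, 4.26, 6.0],
]

# Sorted axes let each one-dimensional condition become an index range.
lon_gt = set(range(bisect_right(longitude, 0.2), len(longitude)))      # longitude > 0.2
lon_between = set(range(bisect_left(longitude, 0.1),
                        bisect_right(longitude, 0.2)))                 # 0.1 <= longitude <= 0.2
lat_lt = set(range(bisect_left(latitude, 0.45)))                       # latitude < 0.45


def select(keep):
    """Return the pressure values whose (i, j) indices satisfy keep(i, j)."""
    return [
        pressure[i][j]
        for i in range(len(latitude))
        for j in range(len(longitude))
        if keep(i, j)
    ]


both = select(lambda i, j: j in lon_gt and i in lat_lt)         # "and" combination
either = select(lambda i, j: j in lon_between or i in lat_lt)   # "or" combination
print(both)
print(len(either))
```

A real implementation would presumably translate the index ranges into hyperslab selections rather than loop over every cell; the loop here only keeps the sketch short.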
From: <phi...@ho...> - 2005-09-07 15:14:09
Francesc Altet wrote:

> On Tuesday 06 September 2005 16:08, phi...@ho... wrote:
>
>> Sorry that doesn't seem clear.
>> No limitations of pytables.
>> It's much more that I'm not allowed to use tables.
>> Arrays are more appreciated to achieve my goal.
>
> Ok. But it's a pity because using the filtering capabilities in Table
> objects, like:
>
> [r['pressure'] for r in table if r['latitude'] == X and r['longitude'] == Y]
>
> is very powerful to do what you want.

Of course, using Python list comprehensions on tables is really powerful.
But I can't actually use tables. I have to use arrays.

>> That could be a good starting point. I only need an ordered list
>> containing the path of each array of system parameters.
>> For example, the attribute could be
>> ['/parameters/longitude','/parameters/latitude','/parameters/time']
>
> Definitely, that could be a solution if you want to stick with arrays.
>
>> In fact, I'm joining Norbert Nemec in his post from 2005-01-24 11:51 to
>> request a convention in pytables to store not only data but also a way
>> to link the data.
>>
>> Here is the situation, better represented:
>
> [snipped]
>
> Ok. I understand better now. With that, what you want is something
> similar to:
>
> pressure.where(pressure.axis.longitude == 0.2 and
>                pressure.axis.latitude == 0.45 )
>
> isn't it?
>
> Well, I must recognize that the idea is pretty nice. In addition, if
> values in latitude and longitude have a direct mapping into indices
> (as you seem to suggest), then a binary search on axis could be made
> so that finding the indexes should be a very fast operation.
>
> Well, I think I'll put this suggestion in our TODO file.
>
> Thanks for your detailed explanation!

That's exactly the idea behind all. And you go further with the
suggestion about the direct mapping into indices. That could be very
powerful.

Thanks a lot for taking new improvements into account.
Of course, I can give you a hand if you need help.
Tell me when you start programming it, we can work together.
If you don't have time, I will try soon to modify the pytables code with
your permission.

Thanks a lot for your work.

Regards,

Philippe Collet
From: travlr <vel...@gm...> - 2005-09-07 11:45:52
Hi Francesc,

I hope you all had a very nice vacation :-)

On 9/7/05, Francesc Altet <fa...@ca...> wrote:
> Hi Travlr,
>
> On Wednesday 07 September 2005 09:32, travlr wrote:
> > Today, I went back and got the "selecting and getting" Table <rows> and
> > Column <elements> for versions 1.0 and 1.1 working again. The attachments
> > provided are a tables.Table.dif file for each version.
> > Below is an example of its usage.
>
> Thanks for the patches. I think it is a good idea to allow retrieving
> values using an index array. I'll apply the patches and hopefully they
> will appear in the forthcoming PyTables 1.2.

I hope you're also able to include setting values (and the other
functionality mentioned) as well. I was able to implement "setting"
initially, but was not able to reproduce it with these patches.

> > Also Francesc, two questions:
> > 1) Could the tables.Table.Column also have an .nrows (nelements) attribute?
>
> Well, you have access to the table object via the .table attribute. So
> you can use columnObject.table.nrows.
>
> > 2) Once bounded, tables.Table used to return as a <class
> > 'numarray.records.RecArray'> object. Now it returns a <class
> > 'tables.nestedrecords.NestedRecArray'> object....Is this necessary?.
>
> Strictly speaking no, but from PyTables 1.1 on, Table objects support
> nested fields, so we have been forced to introduce such a
> NestedRecArray object. This is by no means a brand new object, but it
> is strongly based on RecArray (in fact, it inherits from RecArray).
> If, for whatever reason, you prefer a RecArray object, you have a
> couple of possibilities:
>
> 1.- Use the asRecArray() method. See:
> http://pytables.sourceforge.net/html-doc/usersguide8.html#sectionB.2
> for more details.
>
> 2.- If your table has no nested fields, your NestedRecArray will be
> flat, and you can access the underlying RecArray object by just
> getting the object pointed to by the ._flatArray attribute.

Actually, I really don't have a direct concern about this.

I wish my understanding (and free time) were greater, because I'd
like to offer what I can. In the near future I will have more free time,
and maybe I can at least do a little something to improve the
documentation... I would have loved to have had a hyperlinked
index back when I was familiarizing myself with PyTables. When I
finish with my immediate responsibilities, I will offer again.

Thanks
Pete

> Cheers,
>
> --
> >0,0<   Francesc Altet     http://www.carabos.com/
> V   V   Cárabos Coop. V.   Enjoy Data
>  "-"
From: Francesc A. <fa...@ca...> - 2005-09-07 09:35:13
Hi Travlr,

On Wednesday 07 September 2005 09:32, travlr wrote:
> Today, I went back and got the "selecting and getting" Table <rows> and
> Column <elements> for versions 1.0 and 1.1 working again. The attachments
> provided are a tables.Table.dif file for each version.
> Below is an example of its usage.

Thanks for the patches. I think it is a good idea to allow retrieving
values using an index array. I'll apply the patches and hopefully they
will appear in the forthcoming PyTables 1.2.

> Also Francesc, two questions:
> 1) Could the tables.Table.Column also have an .nrows (nelements) attribute?

Well, you have access to the table object via the .table attribute. So
you can use columnObject.table.nrows.

> 2) Once bounded, tables.Table used to return as a <class
> 'numarray.records.RecArray'> object. Now it returns a <class
> 'tables.nestedrecords.NestedRecArray'> object....Is this necessary?.

Strictly speaking no, but from PyTables 1.1 on, Table objects support
nested fields, so we have been forced to introduce such a
NestedRecArray object. This is by no means a brand new object, but it
is strongly based on RecArray (in fact, it inherits from RecArray).
If, for whatever reason, you prefer a RecArray object, you have a
couple of possibilities:

1.- Use the asRecArray() method. See:
http://pytables.sourceforge.net/html-doc/usersguide8.html#sectionB.2
for more details.

2.- If your table has no nested fields, your NestedRecArray will be
flat, and you can access the underlying RecArray object by just
getting the object pointed to by the ._flatArray attribute.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: travlr <vel...@gm...> - 2005-09-07 07:32:21
Hi all,

In a previous thread, I discussed with Francesc being able to
incorporate non-sequential indexing arrays (key arrays) in the
following ways:

select and get (or set values) tables.Table.Table <rows>
select and get (or set values) tables.Table.Column <items>
select and get (or set values) tables.Array <items>

via: object[key] = value(s)

I was able to use the patch Francesc had provided, and also (with my
extremely limited expertise) a little more, including using Numeric
arrays and Python lists for the keys (all getting/setting intended
except enabling the tables.Array[key] functionality).

Unfortunately, at some point I accidentally erased the damn file :-(

Today, I went back and got the "selecting and getting" Table <rows>
and Column <elements> for versions 1.0 and 1.1 working again. The
attachments provided are a tables.Table.dif file for each version.
Below is an example of its usage.

I was not able to do any more, especially for the newer versions. I am
hoping for this functionality, as well as being able to set the values
via keys of a non-sequential index_array:

tables.Table[index_array] = value(s); and
tables.Column[index_array] = value(s); and
tables.Array[index_array] = value(s)

Also Francesc, two questions:
1) Could the tables.Table.Column also have an .nrows (nelements) attribute?
2) Once bounded, tables.Table used to return as a <class
'numarray.records.RecArray'> object. Now it returns a <class
'tables.nestedrecords.NestedRecArray'> object....Is this necessary?.

Thank You

############# EXAMPLE ###########

import tables as ta
import numarray as na

# test_file is one of my actual files (renamed)
hfile = ta.openFile('~/data/h5/test_folder/test_file.h5','a')

# These are pointers to tables.Table
# objects (not bounded or returned)
mtable = hfile.root.d_050808.d_050808
mcolumn = mtable.cols.A1

# mtable is a tables.Table.Table of a
# particular day in my database.
print
print 'type(mtable)........ ',type(mtable)
print 'mtable.nrows....... ',mtable.nrows
print 'type(mcolumn).... ',type(mcolumn)

# NOTE to Francesc: Why no nrows (nelements) for columns?
#### print 'mcolumn.nrows... ',mcolumn.nrows

# This will return a non-sequential index
# smaller than the table's (or column's)
# original index
index = na.where(mcolumn[:]==2)[0]
print
print 'index.size....... ', index.size()

# The non-sequential index is applied
# to the tables.Table and tables.Table.Column
# and returns a numarray.records object (or an array object
# for the column) with the specified indexing
rtable = mtable[index]
rcolumn = mcolumn[index]
print
print 'type(rtable)..... ',type(rtable)
print 'rtable.size()..... ',rtable.size()
print 'rcolumn.size()..',rcolumn.size()

hfile.close()

############# RESULT ###########

/usr/bin/python -u "~/scripts/september/testForPytables.py"

type(mtable)........ <class 'tables.Table.Table'>
mtable.nrows....... 105983
type(mcolumn).... <class 'tables.Table.Column'>

index.size....... 1172

type(rtable)..... <class 'tables.nestedrecords.NestedRecArray'>
rtable.size()..... 1172
rcolumn.size().. 1172
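For readers without the original files, the select, get and set pattern requested above can be imitated with plain Python lists. This is only an illustration of the indexing idea under made-up data; it does not use PyTables or numarray, and the variable names are hypothetical.

```python
# Made-up column values standing in for a table column such as A1.
column = [2, 7, 2, 5, 2, 9]

# Build a non-sequential index: positions where the column equals 2
# (the role played by na.where(mcolumn[:] == 2)[0] in the message).
index = [i for i, value in enumerate(column) if value == 2]

# "Get": gather the selected elements through the index.
selected = [column[i] for i in index]

# "Set": assign new values through the same index.
for i in index:
    column[i] = -1

print(index)
print(selected)
print(column)
```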
From: Francesc A. <fa...@ca...> - 2005-09-06 15:53:01
On Tuesday 06 September 2005 16:08, phi...@ho... wrote:
> Sorry that doesn't seem clear.
> No limitations of pytables.
> It's much more that I'm not allowed to use tables.
> Arrays are more appreciated to achieve my goal.

Ok. But it's a pity because using the filtering capabilities in Table
objects, like:

[r['pressure'] for r in table if r['latitude'] == X and r['longitude'] == Y]

is very powerful to do what you want.

> That could be a good starting point. I only need an ordered list
> containing the path of each array of system parameters.
> For example, the attribute could be
> ['/parameters/longitude','/parameters/latitude','/parameters/time']

Definitely, that could be a solution if you want to stick with arrays.

> In fact, I'm joining Norbert Nemec in his post from 2005-01-24 11:51 to
> request a convention in pytables to store not only data but also a way
> to link the data.
>
> Here is the situation, better represented:

[snipped]

Ok. I understand better now. With that, what you want is something
similar to:

pressure.where(pressure.axis.longitude == 0.2 and
               pressure.axis.latitude == 0.45 )

isn't it?

Well, I must recognize that the idea is pretty nice. In addition, if
values in latitude and longitude have a direct mapping into indices
(as you seem to suggest), then a binary search on each axis could be made
so that finding the indexes should be a very fast operation.

Well, I think I'll put this suggestion in our TODO file.

Thanks for your detailed explanation!

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: <phi...@ho...> - 2005-09-06 14:08:49
|
Francesc Altet wrote:

>On Fri, 02 Sep 2005 at 17:28 +0200, phi...@ho... wrote:
>
>>Hi Francesc and all other pytables users/developers,
>>
>>I have structured data: atmospheric pressure which depends on
>>latitude, longitude and time.
>>The problem is I can't use a pytables Table due to the application
>>capabilities involved to manage the file.
>
>Uh, you mean limitations specific of pytables or other kind of
>limitations?

Sorry, that wasn't clear. There are no limitations in pytables. It's rather
that I'm not allowed to use a Table. Arrays are preferred for my goal.

>>The only solution I have is to store each piece of data in an array
>>(pressure is a multidimensional array, time is an array, longitude is
>>an array and latitude is an array).
>>
>>Then I have to use specific attributes to retrieve the right
>>association between parameters and pressure.
>>
>>1. Is it possible to set a multidimensional attribute, which would
>>consist of a list containing the paths of the parameters
>>([/longitude,/latitude,/time])?
>>So I know the first dimension of the pressure array is the longitude,
>>the second the latitude, the third the time and the last one the values
>>of the pressure.
>
>I don't know exactly what you are trying to do, but, in general, you can
>save any kind of python object as an attribute. However, if this object
>does not map directly to a multidimensional (or scalar) atomic (i.e.
>byte, char, int or float) HDF5 type, then it's pickled.

That could be a good starting point. I only need an ordered list containing
the path of each array of system parameters. For example, the attribute
could be
['/parameters/longitude','/parameters/latitude','/parameters/time']

>>2. Would it be out of context to think about a standard way to link,
>>inside pytables, the dimensions of a multidimensional array to the arrays
>>defining the parameter values, comparable to NetCDF?
>
>Can you be a bit more explicit on what you want exactly?

In fact, I'm joining Norbert Nemec in his post from 2005-01-24 11:51 in
requesting a convention in pytables to store not only the data but also a
way to link the data.

Here is the situation, better represented:

Longitude  [0.1  0.2  0.3  0.4]
Latitude   [0.25 0.35 0.45 0.55]

                                          latitude
                                             ^
Pressure : [[1,    1.25,  1.78,  1.4],       |  (0.25)
            [3.1,  2.4,   1.89,  8.2],       |  (0.35)
            [4.5,  7.9,   4.63,  1  ],       |  (0.45)
            [1.2,  3.7,   4.26,  6  ]]       |  (0.55)
                                             |
longitude <-----------------------------------
            (0.1)  (0.2)  (0.3)  (0.4)

If I know that the atmospheric pressure data are stored depending first on
the latitude and then on the longitude, I can easily get the pressure for a
given longitude and latitude.
For example: I need the pressure for longitude 0.2 and latitude 0.45. Then
the pressure is 7.9.

I need to build this with pytables. Longitude is an array of shape (4L,),
latitude is an array of shape (4L,) and pressure is an array of shape
(4L,4L).

For the moment, with pytables, there is no system attribute to tell that the
first dimension in the pressure shape is latitude and the second is
longitude.

Can you point me to something (a specific attribute, for example) available
in the pytables API to declare the order in which each parameter is used in
the pressure array?

Thanks a lot,
Regards,
Philippe Collet

>Cheers,
 |
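The lookup Philippe describes can be sketched in plain Python. This is only an illustration of the proposed convention, not a PyTables feature: the `axes` list plays the role of the ordered attribute he suggests, and the names `coords` and `lookup` are hypothetical helpers introduced here. The data values are the ones from his message.

```python
# Coordinate arrays and pressure grid taken from the message above.
longitude = [0.1, 0.2, 0.3, 0.4]
latitude = [0.25, 0.35, 0.45, 0.55]
pressure = [
    [1.0, 1.25, 1.78, 1.4],   # latitude 0.25
    [3.1, 2.4, 1.89, 8.2],    # latitude 0.35
    [4.5, 7.9, 4.63, 1.0],    # latitude 0.45
    [1.2, 3.7, 4.26, 6.0],    # latitude 0.55
]

# Hypothetical convention: an ordered list naming the coordinate array
# that describes each dimension of `pressure` (first latitude, then longitude).
axes = ['/parameters/latitude', '/parameters/longitude']

# Stand-in for reading the coordinate arrays back by their path.
coords = {
    '/parameters/latitude': latitude,
    '/parameters/longitude': longitude,
}


def lookup(data, axes, coords, point):
    """Descend into `data` one dimension at a time, in the order given by `axes`."""
    node = data
    for path in axes:
        # Position of the requested coordinate value along this dimension.
        index = coords[path].index(point[path])
        node = node[index]
    return node


value = lookup(
    pressure, axes, coords,
    {'/parameters/longitude': 0.2, '/parameters/latitude': 0.45},
)
print(value)
```

Because the reader only consults `axes`, the same function keeps working if the data are later stored longitude-first, provided the list is updated to match.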
From: Francesc A. <fa...@ca...> - 2005-09-05 09:04:50
|
On Sat, 03 Sep 2005 at 15:04 +0200, Nicolas Girard wrote:
> Hi,
> when calling close() on an hdf file that wasn't flushed before, I would have
> expected that the file was flushed but it doesn't seem to be the case
> currently.
>
> What do you think about it ? Should close() call flush() if needed ?

Indeed. A call to close() should do a flush() all the way. Can you send an
example of the code that doesn't work as expected? Also, it would help to
know which version of pytables you are using, and on which platform.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
 |
From: Francesc A. <fa...@ca...> - 2005-09-05 08:59:13
|
On Fri, 02 Sep 2005 at 21:38 +0200, Francesco Del Degan wrote:
> For a table, composed by 2 columns, an integer and a float, i've reached
> 367KRows/s... very good!

Yeah, this is more or less the kind of performance that you can expect
from PyTables.

> But performance on chararrays is very poor, in comparison to numeric ones.
>
> Adding a CharArray of 10^6 elements of 1 byte drop the performance to
> 100Krows/s, and add two CharArray of byte,
> drop to 30Krows/s. It also seems independent by maximum string length.

Yes, in my experience, including strings in tables reduces the append
performance significantly. I don't know exactly where the problem is, but
if you are willing to profile this case in order to see where the hot
spots are, that would be really interesting.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
 |
From: Francesc A. <fa...@ca...> - 2005-09-05 08:53:38
|
On Fri, 02 Sep 2005 at 17:28 +0200, phi...@ho... wrote:
> Hi Francesc and all other pytables users/developers,
>
> I have structured data: atmospheric pressure which depends on
> latitude, longitude and time.
> The problem is I can't use a pytables Table due to the application
> capabilities involved to manage the file.

Uh, you mean limitations specific of pytables or other kind of
limitations?

> The only solution I have is to store each piece of data in an array
> (pressure is a multidimensional array, time is an array, longitude is
> an array and latitude is an array).
>
> Then I have to use specific attributes to retrieve the right
> association between parameters and pressure.
>
> 1. Is it possible to set a multidimensional attribute, which would
> consist of a list containing the paths of the parameters
> ([/longitude,/latitude,/time])?
> So I know the first dimension of the pressure array is the longitude,
> the second the latitude, the third the time and the last one the values
> of the pressure.

I don't know exactly what you are trying to do, but, in general, you can
save any kind of python object as an attribute. However, if this object
does not map directly to a multidimensional (or scalar) atomic (i.e.
byte, char, int or float) HDF5 type, then it's pickled.

> 2. Would it be out of context to think about a standard way to link,
> inside pytables, the dimensions of a multidimensional array to the arrays
> defining the parameter values, comparable to NetCDF?

Can you be a bit more explicit on what you want exactly?

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
 |
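The remark about pickling can be illustrated without PyTables. A Python list of path strings is not an atomic HDF5 type, so by the rule described above it would be serialized. The sketch below uses the standard `pickle` module directly; how PyTables lays out the resulting bytes in the file is not shown and is not assumed here. The variable names are illustrative.

```python
import pickle

# Ordered list of node paths, one per dimension of the pressure array,
# as proposed in the question.
axes = ['/longitude', '/latitude', '/time']

# Serialize the list to bytes and read it back.
blob = pickle.dumps(axes)
restored = pickle.loads(blob)

# The order survives the round trip, so position i still names dimension i.
print(restored.index('/latitude'))
```

One consequence worth keeping in mind: a pickled value is only readable from Python, so other HDF5 applications opening the same file would see opaque bytes rather than a list of paths.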
From: Nicolas G. <nic...@ne...> - 2005-09-03 13:04:37
|
Hi,
when calling close() on an hdf file that wasn't flushed before, I would have
expected that the file was flushed but it doesn't seem to be the case
currently.

What do you think about it ? Should close() call flush() if needed ?

Cheers,
Nicolas
 |
From: Francesco D. D. <ke...@li...> - 2005-09-02 19:39:18
|
Francesc Altet wrote:

>Hi Francesco,
>
>This problem is related to the slowness of element-by-element assignment
>in numarray objects. If you want to achieve high performance when writing
>PyTables tables, it is better to use the Table.append method (instead of
>Row.append).
>
>I normally use the following code:
>
>    def fill_arrays(self, start, stop):
>        "Some generic filling function"
>        arr_f8 = numarray.arange(start, stop, type=numarray.Float64)
>        arr_i4 = numarray.arange(start, stop, type=numarray.Int32)
>        if self.userandom:
>            arr_f8 += random_array.normal(0, stop*self.scale,
>                                          shape=[stop-start])
>            arr_i4 = numarray.array(arr_f8, type=numarray.Int32)
>        return arr_i4, arr_f8
>
>    def fill_table(self, con):
>        "Fills the table"
>        table = con.root.table
>        j = 0
>        for i in xrange(0, self.nrows, self.step):
>            stop = (j+1)*self.step
>            if stop > self.nrows:
>                stop = self.nrows
>            arr_i4, arr_f8 = self.fill_arrays(i, stop)
>            recarr = records.fromarrays([arr_i4, arr_f8])
>            table.append(recarr)
>            j += 1
>        table.flush()
>
>in order to fill a table with two columns (Int32 and Float64).
>
>If you try this, I'm sure you will get much better results.

Yes, if I build numarray arrays for the single columns, then build a
recarray and append it all at once, the performance increases by a factor
of 10, but only when I use numeric values.

For a table composed of 2 columns, an integer and a float, I've reached
367 Krows/s... very good!

But performance on chararrays is very poor in comparison to numeric ones.

Adding a CharArray of 10^6 elements of 1 byte drops the performance to
100 Krows/s, and adding two CharArrays of 1 byte drops it to 30 Krows/s.
It also seems independent of the maximum string length.

I'm building the arrays from lists with:

    numarray.array(list, shape=n of Rows, type=type of Row)

for numeric values, and

    numarray.strings.array(list of strings, shape = n of Rows,
                           itemsize = maxLength of row)

for strings.

Is it a memory move issue?

Thanks,
FrancescoDD
 |
From: <phi...@ho...> - 2005-09-02 15:28:32
|
Hi Francesc and all other pytables users/developers,

I have structured data: atmospheric pressure which depends on latitude,
longitude and time.
The problem is I can't use a pytables Table due to the application
capabilities involved to manage the file.

The only solution I have is to store each piece of data in an array
(pressure is a multidimensional array, time is an array, longitude is an
array and latitude is an array).

Then I have to use specific attributes to retrieve the right association
between parameters and pressure.

1. Is it possible to set a multidimensional attribute, which would consist
of a list containing the paths of the parameters
([/longitude,/latitude,/time])?
So I know the first dimension of the pressure array is the longitude, the
second the latitude, the third the time and the last one the values of the
pressure.
I know there was a discussion and a patch about this but I don't know how
to use it.

2. Would it be out of context to think about a standard way to link, inside
pytables, the dimensions of a multidimensional array to the arrays defining
the parameter values, comparable to NetCDF?

Thanks a lot for your answers,
regards,
Philippe Collet
 |
From: Francesc A. <fa...@ca...> - 2005-09-02 14:58:33
|
Hi Francesco,

This problem is related to the slowness of element-by-element assignment
in numarray objects. If you want to achieve high performance when writing
PyTables tables, it is better to use the Table.append method (instead of
Row.append).

I normally use the following code:

    # Imports needed by the two methods below, which belong to a
    # benchmark class that is not shown here.
    import numarray
    from numarray import random_array, records

    def fill_arrays(self, start, stop):
        "Some generic filling function"
        arr_f8 = numarray.arange(start, stop, type=numarray.Float64)
        arr_i4 = numarray.arange(start, stop, type=numarray.Int32)
        if self.userandom:
            arr_f8 += random_array.normal(0, stop*self.scale,
                                          shape=[stop-start])
            arr_i4 = numarray.array(arr_f8, type=numarray.Int32)
        return arr_i4, arr_f8

    def fill_table(self, con):
        "Fills the table"
        table = con.root.table
        j = 0
        for i in xrange(0, self.nrows, self.step):
            # End of the current block, clipped to the table size
            stop = (j+1)*self.step
            if stop > self.nrows:
                stop = self.nrows
            arr_i4, arr_f8 = self.fill_arrays(i, stop)
            # Append a whole block of rows in a single call
            recarr = records.fromarrays([arr_i4, arr_f8])
            table.append(recarr)
            j += 1
        table.flush()

in order to fill a table with two columns (Int32 and Float64).

If you try this, I'm sure you will get much better results.

Cheers,

On Fri, 02 Sep 2005 at 16:09 +0200, Francesco Del Degan wrote:
> Hi, I have an issue with pytables performance:
>
> This is my python code for testing:
>
> ---SNIP---
> from tables import *
>
> class PytTest(IsDescription):
>     string = Col('CharType', 16)
>     id = Col('Int32', 1)
>     float = Col('Float64', 1)
>
> h5file = openFile('probe.h5','a')
>
> try:
>     testGroup = h5file.root.testGroup
> except NoSuchNodeError:
>     testGroup = h5file.createGroup(
>         "/", "testGroup", "Test Group")
> try:
>     tbTest = testGroup.test
> except NoSuchNodeError:
>     tbTest = h5file.createTable(
>         testGroup,
>         'test',
>         PytTest,
>         'Test table')
>
> import time
>
> maxRows = 10**6
>
> ### TEST1 ###
> startTime = time.time()
> row = tbTest.row
> for i in range(0, maxRows):
>     row['string'] = '1234567890123456'
>     row['id'] = 1
>     row['float'] = 1.0/3.0
>     row.append()
> tbTest.flush()
> diffTime = time.time()-startTime
> print 'test1: %d rows in %s seconds (%s/s)' % (maxRows,diffTime,
>     maxRows/diffTime)
>
> ### TEST2 ###
> startTime = time.time()
> row = tbTest.row
> for i in range(0, maxRows):
>     row['string'] = '1234567890123456'
>     row['id'] = 1
>     row['float'] = 1.0/3.0
> diffTime = time.time()-startTime
> print 'test2: %d rows in %s seconds (%s/s)' % (maxRows,diffTime,
>     maxRows/diffTime)
>
> ### TEST3 ###
> startTime = time.time()
> row = tbTest.row
> row['string'] = '1234567890123456'
> row['id'] = 1
> row['float'] = 1.0/3.0
> for i in range(0, maxRows):
>     row.append()
> tbTest.flush()
> diffTime = time.time()-startTime
> print 'test3: %d rows in %s seconds (%s/s)' % (maxRows,diffTime,
>     maxRows/diffTime)
> h5file.close()
>
> ---SNIP---
>
> This code tries to insert maxRows (10**6) rows into a table. The table is
> similar to the table in:
> http://pytables.sourceforge.net/doc/PyCon.html#section4 (small table)
> used for benchmarking
>
> class Small(IsDescription):
>     var1 = Col("CharType", 16)
>     var2 = Col("Int32", 1)
>     var3 = Col("Float64", 1)
>
> As you'll notice, there are 3 possible tests:
> TEST 1: creation of rows and append() in loop
> TEST 2: creation of rows in loop, no append (no disk use)
> TEST 3: creation of row before loop, and append() in loop
>
> flush is always outside the loop, at the end.
>
> The testbed is an AMD Athlon(tm) 64 Processor 2800+, 1GB RAM, and a
> 5400rpm disk.
> I've seen the same results on a Dual XEON machine, 1GB RAM, SCSI disk.
>
> testbed:~# python test.py
>
> test1: 1000000 rows in 22.7905650139 seconds (43877.8064252/s)
> test2: 1000000 rows in 20.3718218803 seconds (49087.4113211/s)
> test3: 1000000 rows in 2.01304578781 seconds (496759.68925/s)
>
> That throughput (40-50 Krows/s) is about 10 times lower than the one in
> http://pytables.sourceforge.net/doc/PyCon.html#section4 (small table)
>
> It seems that the row assignment:
>
>     row[fieldName] = value
>
> takes a huge amount of time, and that the time for writing to disk is
> 10 times smaller than the time for the assignment.
> Am I doing something wrong?
>
> I've made some tests on the source code, and I've realized that, in
> TableExtension.pyx, in __setitem__ of Row (called when I do a
> row[...] = value), the line:
>
>     self._wfields[fieldName][self._unsavednrows] = value
>
> is responsible for that slowness.
>
> self._wfields[fieldName] is a numarray.array, isn't it? Does the
> assignment really take that much time compared to the disk?
>
> I can do an strace of the process if you need.
>
> I've tried with pytables 1.1 and 1.2-b1 compiled from source,
> and numarray 1.1.1, 1.3.2, 1.3.3 compiled from source, with the same
> results.
>
> Is this normal behaviour, in your opinion?
>
> Thanks in advance,
> kesko78

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
 |
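The block bookkeeping in fill_table above (a counter j, a computed stop, and a clip to nrows) can be isolated into a small generator. This is a sketch in current Python, not code from the thread; `chunk_bounds` is a hypothetical helper name, and `range` replaces the `xrange` of the original.

```python
def chunk_bounds(nrows, step):
    """Yield (start, stop) pairs covering rows 0..nrows-1 in blocks of `step`."""
    for start in range(0, nrows, step):
        # The last block is clipped so it never runs past the table size.
        yield start, min(start + step, nrows)


# Example: 10 rows written in blocks of 4 rows each.
bounds = list(chunk_bounds(10, 4))
print(bounds)
```

Each pair would then drive one bulk append of a block of rows, which is the pattern the reply recommends over assigning fields row by row.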