From: Cournapeau D. <cou...@at...> - 2005-09-16 02:54:00
Hi there,

I would like to use PyTables to read HDF5 files created by some C programs I wrote, mainly wrappers that let me use HDF5 from Matlab. With PyTables, I was able to read vectors and matrices. Unfortunately, I cannot read vlen arrays. These arrays are pretty simple: rank 1, with a fixed number of rows, where each row is a variable-size vector of doubles.

To see where the problem comes from, I created a similar ragged array with PyTables and compared the PyTables-generated file with my file. Leaving the PyTables metadata aside, I can see only one difference between the two files.

PyTables file (PyTables metadata omitted):

    DATASET "vlarray1" {
       DATATYPE  H5T_VLEN { H5T_IEEE_F64LE}
       DATASPACE  SIMPLE { ( 4 ) / ( H5S_UNLIMITED ) }
       DATA {
       (0): (5, 6), (5, 6, 7), (5, 6, 9, 8), (5, 6, 9, 10, 12)
       }
    }

My file:

    DATASET "cell" {
       DATATYPE  H5T_VLEN { H5T_IEEE_F64LE}
       DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
       DATA {
       (0): (-0.432565, -1.66558, 0.125332), (0.287676, -1.14647)
       }
    }

The last dimension is declared H5S_UNLIMITED in the PyTables file, which is natural given the purpose of PyTables. I haven't studied the architecture of PyTables yet, but would it be difficult to support this kind of data? If it is not too difficult, I would actually be interested in adding the support myself.

To see whether the problem goes away when the dimension is declared as H5S_UNLIMITED, I modified my C program, but the new file is impossible to open with PyTables: there are some problems related to the HDF5 library. The error messages are the following:

    /usr/lib/python2.4/site-packages/tables/File.py:223: UserWarning: file ``unlimited_mex.h5`` exists and it is an HDF5 file, but it does not have a PyTables format; I will try to do my best to guess what's there using HDF5 metadata
      warnings.warn("""\
    HDF5-DIAG: Error detected in HDF5 library version: 1.6.4 thread 3085332608.  Back trace follows.
      #000: ../../../src/H5A.c line 457 in H5Aopen_name(): attribute not found
        major(18): Attribute layer
        minor(05): Bad value
      #001: ../../../src/H5A.c line 404 in H5A_get_index(): attribute not found
        major(18): Attribute layer
        minor(48): Object not found
    /usr/lib/python2.4/site-packages/tables/Group.py:252: UserWarning: problems loading leaf ``/cell``: flavor of type 'Float64' must be one of the "NumArray", "Numeric", "Tuple" or "List" values, and you tried to set it to "unknown". ; it will become an ``UnImplemented`` node
      warnings.warn("""\
    Traceback (most recent call last):
      File "./readhd5.py", line 9, in ?
        fileh = openFile("unlimited_mex.h5", mode = "r")
      File "/usr/lib/python2.4/site-packages/tables/File.py", line 231, in openFile
        return File(path, mode, title, new, trMap, rootUEP, isPTFile, filters)
      File "/usr/lib/python2.4/site-packages/tables/File.py", line 411, in __init__
        self.root = self.__getRootGroup(rootUEP)
      File "/usr/lib/python2.4/site-packages/tables/File.py", line 462, in __getRootGroup
        rootGroup = RootGroup(self, '/', rootUEP, new=self._v_new)
      File "/usr/lib/python2.4/site-packages/tables/Group.py", line 855, in __init__
        self._g_openFile()
      File "/usr/lib/python2.4/site-packages/tables/Group.py", line 257, in _g_openFile
        objleaf._g_putUnder(objgroup, name)
      File "/usr/lib/python2.4/site-packages/tables/Leaf.py", line 223, in _g_putUnder
        super(Leaf, self)._g_putUnder(parent, name)
      File "/usr/lib/python2.4/site-packages/tables/Node.py", line 693, in _g_putUnder
        parent._g_refNode(self, ptname, validate)
      File "/usr/lib/python2.4/site-packages/tables/Group.py", line 327, in _g_refNode
        raise NodeError(
    tables.exceptions.NodeError: group ``/`` already has a child node named ``cell``

I would appreciate any hint that would help me read these files.

Thank you for your attention,

David.

P.S.: I didn't find a link to the mailing list on the PyTables web page. Have I just missed it? If not, it may be useful to add one, because finding the mailing list through SourceForge is a bit awkward...
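Editorial sketch: the only difference David reports between the two h5dump listings is the maximum dimension in the DATASPACE line. The helper below is a minimal, standard-library-only illustration of how that line could be compared programmatically. The function names `parse_dataspace` and `is_enlargeable` are hypothetical and belong to neither PyTables nor HDF5, and the regular expression only handles the simple dataspace form shown in the message.

```python
import re

# Matches lines such as: DATASPACE SIMPLE { ( 2 ) / ( H5S_UNLIMITED ) }
_DATASPACE_RE = re.compile(
    r"DATASPACE\s+SIMPLE\s*\{\s*\(([^)]*)\)\s*/\s*\(([^)]*)\)\s*\}"
)


def parse_dataspace(text):
    """Return (current_dims, max_dims) from an h5dump DATASPACE line.

    An unlimited maximum dimension is reported as None.
    """
    match = _DATASPACE_RE.search(text)
    if match is None:
        raise ValueError("no simple dataspace found in %r" % (text,))

    def to_dims(field):
        dims = []
        for item in field.split(","):
            item = item.strip()
            dims.append(None if item == "H5S_UNLIMITED" else int(item))
        return tuple(dims)

    return to_dims(match.group(1)), to_dims(match.group(2))


def is_enlargeable(text):
    """True if at least one maximum dimension is unlimited."""
    _, max_dims = parse_dataspace(text)
    return None in max_dims


# The two DATASPACE lines quoted in the message above.
pytables_dump = "DATASPACE SIMPLE { ( 4 ) / ( H5S_UNLIMITED ) }"
c_program_dump = "DATASPACE SIMPLE { ( 2 ) / ( 2 ) }"

print(parse_dataspace(pytables_dump))    # ((4,), (None,))
print(parse_dataspace(c_program_dump))   # ((2,), (2,))
print(is_enlargeable(pytables_dump), is_enlargeable(c_program_dump))  # True False
```

Representing "unlimited" as None keeps the two dimension tuples directly comparable, which is all that is needed to spot a fixed-size dataset before handing it to a reader that expects an enlargeable one.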
From: Francesc A. <fa...@ca...> - 2005-09-16 18:22:58
On Friday 16 September 2005 04:57, Cournapeau David wrote:
> The last dimension is declared H5S_UNLIMITED in the PyTables file, which is
> natural given the purpose of PyTables. I haven't studied the architecture

Yes, this is a possible source of problems with PyTables.

> of PyTables yet, but would it be difficult to support this kind of
> data? If it is not too difficult, I would actually be interested in
> adding the support myself.

Well, this is in fact very similar to the problem that led Antonio Valentino to come up with the CArray object. EArray also enforces an enlargeable dimension in the array, but he wanted a chunked array (so that it could be compressed) with all dimensions *fixed*, and this is what CArray is.

If you want to have a try, please look carefully at Antonio's work and compare how the logic of the CArray and EArray objects differs. Most of his approach should be useful to you.

Also, I guess we should choose a name for the new object. Perhaps CVLArray would fit, but it seems a bit complicated to me. Perhaps a new parameter for VLArray would be enough (and better).

> To see whether the problem goes away when the dimension is declared as
> H5S_UNLIMITED, I modified my C program, but the new file is impossible
> to open with PyTables: there are some problems related to the HDF5 library.
> The error messages are the following:
[trimmed]

Uh, which version of PyTables are you using, yesterday's snapshot? In that case, please use a stable version instead, such as 1.1.1 or 1.2b3, available at

http://www.carabos.com/downloads/pytables/preliminary/

> Thank you for your attention,

Keep us informed on your progress. Good luck!

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
 V V    Cárabos Coop. V.   Enjoy Data
 "-"
From: Cournapeau D. <cou...@at...> - 2005-09-22 07:33:27
On Fri, 2005-09-16 at 20:22 +0200, Francesc Altet wrote:
> Uh, which version of PyTables are you using, yesterday's snapshot?
> In that case, please use a stable version instead, such as 1.1.1 or
> 1.2b3, available at
>
> http://www.carabos.com/downloads/pytables/preliminary/

I was using a home-made Debian package built from the 1.1.1 sources (I had to build my own package because my distribution, Ubuntu, does not include an up-to-date PyTables package). I updated to version 1.2b4, and the problem remains the same.

Actually, if I have data whose last dimension is declared as H5S_UNLIMITED, I can read it. If I have a vlen array named cell in the root group, the first time I access root.cell it does not work and generates this error. But while playing with it a bit in the Python shell, I realized that accessing root.cell a second time gives me the data!

I hope to be able to track down the problem (I guess errors in the use of the HDF5 API are considered bugs, right?). Is there a simple way to quickly test changes in the Pyrex code? I would like to avoid building a new deb package each time I modify the Pyrex file to find and fix the bug :) (sorry if this sounds dumb, I've never used Pyrex).

David
From: Francesc A. <fa...@ca...> - 2005-09-22 07:53:52
On Thursday 22 September 2005 09:36, Cournapeau David wrote:
> I updated to version 1.2b4, and the problem remains the same. Actually,
> if I have data whose last dimension is declared as H5S_UNLIMITED, I
> can read it. If I have a vlen array named cell in the root group,
> the first time I access root.cell it does not work and generates this
> error. But while playing with it a bit in the Python shell, I realized that
> accessing root.cell a second time gives me the data!

Strange. Anyway, I recommend that you stick with PyTables 1.2b4, which should be quite stable right now.

> I hope to be able to track down the problem (I guess errors in the use of
> the HDF5 API are considered bugs, right?). Is there a simple way to quickly
> test changes in the Pyrex code? I would like to avoid building a new deb
> package each time I modify the Pyrex file to find and fix the bug :)
> (sorry if this sounds dumb, I've never used Pyrex).

If you are going to modify the Pyrex sources, the best way to rebuild the extensions is to use distutils:

    python setup.py build_ext --inplace

If you are making many modifications and need fast rebuilds, try disabling optimisation for Python extensions:

- Edit /usr/lib/python2.3/config/Makefile (or python2.4)
- Substitute:

      OPT= -DNDEBUG -g -O3 -Wall -Wstrict-prototypes

  by something like:

      OPT= -DNDEBUG -g

Also, you don't need to re-install the package. Instead, put the root directory of your PyTables development branch in your PYTHONPATH, and you are done.

Luck!
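Editorial sketch: when a development tree is placed on PYTHONPATH as suggested above, it can be worth confirming which copy of the package Python would actually import. The snippet below is a small illustration written for a current Python 3 interpreter, not the Python 2.3/2.4 of this thread. `module_origin` is a hypothetical helper name; `importlib.util.find_spec` is part of the standard library.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


# "tables" is the PyTables package name; None means it is not importable here.
print(module_origin("tables"))
```

If the printed path points into the system site-packages directory rather than the development branch, the PYTHONPATH entry is probably missing or shadowed by an earlier entry.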
From: Cournapeau D. <cou...@at...> - 2005-10-03 06:28:31
> Strange. Anyway, I recommend that you stick with PyTables 1.2b4, which
> should be quite stable right now.

OK, this is the version I am using right now, so all the problems reported here are against this version.

> If you are going to modify the Pyrex sources, the best way to rebuild the
> extensions is to use distutils:
> [snip]

Thanks for the information. Here is what I get. First, I try to access my vlen array 'cell' through this small script (boilerplate code omitted):

    h5file = openFile(filename, "r")
    root = h5file.root
    print "=================== First call ===================="
    try:
        cell = root.cell
    except:
        print 'exception during first call'
    print "=================== Second call ===================="
    cell = root.cell
    for i in cell:
        print i

The HDF5 file 'filename' that I tried to read looks like this according to h5dump (only the interesting part is shown):

    DATASET "cell" {
       DATATYPE  H5T_VLEN { H5T_IEEE_F64LE}
       DATASPACE  SIMPLE { ( 2 ) / ( H5S_UNLIMITED ) }
       DATA {
       (0): (-0.432565, -1.66558, 0.125332), (0.287676, -1.14647)
       }
    }

I then get the following error:

    HDF5-DIAG: Error detected in HDF5 library version: 1.6.4 thread 3084815008.  Back trace follows.
      #000: ../../../src/H5A.c line 457 in H5Aopen_name(): attribute not found
        major(18): Attribute layer
        minor(05): Bad value
      #001: ../../../src/H5A.c line 404 in H5A_get_index(): attribute not found
        major(18): Attribute layer
        minor(48): Object not found
    /usr/src/misc/pytables/pytables-1.2b4/tables/Group.py:397: UserWarning: problems loading leaf ``/cell``:: flavor of type 'Float64' must be one of the "NumArray", "Numeric", "Tuple" or "List" values, and you tried to set it to "unknown". The leaf will become an ``UnImplemented`` node.

The HDF5 warning comes from the call to H5LTget_attribute_string in the function _openArray in the Pyrex file /src/hdf5Extension.pyx. This of course happens because my file was not created by PyTables and hence does not have any FLAVOR attribute. Is it intentional to leave the HDF5 warning on?

Anyway, as the FLAVOR attribute does not exist, the flavor is set to "unknown", and the Atom call fails in VLArray.py:VLArray._g_open (specifically, the checkflavor call in Atom.__init__ fails).

As I said in my previous email, the second attempt to read this vlen array succeeds, because PyTables already has it in memory: File.py:File._getNode does not load it from the file but from memory. This means that the first call, even though it is unsuccessful, manages to read the vlen array from the file (the data returned by the second call are correct). So I tried the following hack (in the file VLArray.py, function _g_open of the class VLArray):

        # First, check the special cases VLString and Object types
        if flavor == "VLString":
            self.atom = VLStringAtom()
        elif flavor == "Object":
            self.atom = ObjectAtom()
        elif stype == 'CharType':
            self.atom = StringAtom(self._atomicshape, self._basesize, flavor)
        elif stype == 'Enum':
            (enum, type_) = self._loadEnum()
            self.atom = EnumAtom(enum, type_, self._atomicshape, flavor)
            self._atomictype = type_
        else:
            if flavor == 'unknown':
                # No FLAVOR attribute in the file: let Atom use its default
                self.atom = Atom(stype, self._atomicshape)
            else:
                self.atom = Atom(stype, self._atomicshape, flavor)
        return objectId

With this change, it works in my case. As I don't know much about PyTables yet, I don't know what the correct way to achieve the same result is.

Concerning the case where the dimensions are fixed, I unfortunately don't have the time to investigate it right now.

David
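Editorial sketch: the idea behind the hack above is to skip the stored flavor when the file carries none, so that the atom falls back to its default. The self-contained toy below shows that pattern in isolation. `ToyAtom`, `atom_for_leaf` and `DEFAULT_FLAVOR` are hypothetical stand-ins, not PyTables classes, and the default value chosen here is an assumption made for illustration only.

```python
VALID_FLAVORS = ("NumArray", "Numeric", "Tuple", "List")
DEFAULT_FLAVOR = "NumArray"  # assumed default, for illustration only


class ToyAtom(object):
    """Stand-in for an atom class that validates its flavor on construction."""

    def __init__(self, stype, shape, flavor=DEFAULT_FLAVOR):
        if flavor not in VALID_FLAVORS:
            raise ValueError(
                "flavor of type %r must be one of %s, got %r"
                % (stype, ", ".join(VALID_FLAVORS), flavor)
            )
        self.stype = stype
        self.shape = shape
        self.flavor = flavor


def atom_for_leaf(stype, shape, flavor):
    """Build an atom, falling back to the default when no flavor is stored."""
    if flavor == "unknown":
        # The file has no flavor attribute: do not pass "unknown" through.
        return ToyAtom(stype, shape)
    return ToyAtom(stype, shape, flavor)


atom = atom_for_leaf("Float64", 1, "unknown")
print(atom.flavor)  # NumArray
```

Keeping the validation inside the atom and handling the missing attribute at the call site mirrors the shape of the hack. Whether the fallback belongs there or inside the validation itself is a design decision for the library maintainers.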
From: Francesc A. <fa...@ca...> - 2005-10-03 15:11:21
On Monday 3 October 2005 08:31, Cournapeau David wrote:
> Anyway, as the FLAVOR attribute does not exist, the flavor is set to
> "unknown", and the Atom call fails in VLArray.py:VLArray._g_open
> (specifically, the checkflavor call in Atom.__init__ fails).

Yes, you discovered a flaw in the implementation.

> This means that the first call, even though it is unsuccessful, manages to
> read the vlen array from the file (the data returned by the second call are
> correct). So I tried the following hack (in the file VLArray.py,
> function _g_open of the class VLArray):
[snipped]
> With this change, it works in my case. As I don't know much about PyTables
> yet, I don't know what the correct way to achieve the same result is.

Great, this looks like a good workaround.

BTW, Ivan has recently implemented a new module to test compatibility with pure HDF5 files. Can you send us your file for inclusion in the tests, please? That way, new modifications will always be checked against your file.

> Concerning the case where the dimensions are fixed, I unfortunately
> don't have the time to investigate it right now.

Please send me a sample of this too. I'll see what I can do.

Cheers,
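Editorial sketch: the behaviour discussed in this exchange, where the first access fails and the second returns correct data, is what one would expect if a node is cached before a late validation step raises. The toy below models only that ordering. `ToyFile` and its methods are hypothetical and are not taken from the PyTables sources; the data values are the ones shown in the h5dump listing earlier in the thread.

```python
class ToyFile(object):
    """Toy node registry: a node is cached by name as soon as it is read."""

    def __init__(self):
        self._cache = {}
        self.loads = 0  # number of times the "file" was actually read

    def _load(self, name):
        self.loads += 1
        node = {
            "name": name,
            "data": [[-0.432565, -1.66558, 0.125332], [0.287676, -1.14647]],
        }
        # The node is registered before the late metadata check runs...
        self._cache[name] = node
        # ...so a failure at this point still leaves a usable node behind.
        raise ValueError("flavor set to 'unknown'")

    def get_node(self, name):
        if name in self._cache:
            return self._cache[name]  # served from memory, no validation
        return self._load(name)


toy = ToyFile()
try:
    toy.get_node("cell")
except ValueError as exc:
    print("first call failed:", exc)
print("second call:", toy.get_node("cell")["data"])
```

In this model the second call succeeds only because the failed first call left the node in the cache, which is consistent with David's observation that the returned data are correct.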
From: Cournapeau D. <cou...@at...> - 2005-10-04 00:27:57
On Mon, 2005-10-03 at 17:11 +0200, Francesc Altet wrote:
> Great, this looks like a good workaround.

OK. Would you like a diff against the latest version (1.2-b6)?

> BTW, Ivan has recently implemented a new module to test compatibility
> with pure HDF5 files. Can you send us your file for inclusion in the
> tests, please? That way, new modifications will always be checked
> against your file.
[snip]
> Please send me a sample of this too. I'll see what I can do.

Should I send the files to your carabos.com address? (They are really small anyway, only a few kB.)

David
From: Francesc A. <fa...@ca...> - 2005-10-04 08:11:31
On Tuesday 4 October 2005 02:30, Cournapeau David wrote:
> On Mon, 2005-10-03 at 17:11 +0200, Francesc Altet wrote:
> > Great, this looks like a good workaround.
>
> OK. Would you like a diff against the latest version (1.2-b6)?

No thanks. In fact, a patch has already been applied in the latest snapshot:

http://www.carabos.com/downloads/pytables/snapshots/pytables-latest.tar.gz

> BTW, Ivan has recently implemented a new module to test compatibility
> Should I send the files to your carabos.com address? (They are
> really small anyway, only a few kB.)

Yes, that's fine. Keep in mind that we will try to add these files to the HDF5 compatibility test unit, so try to reduce their size to a bare minimum.

Cheers,