On Mon, Dec 13, 2010 at 2:20 PM, Francesc Alted <faltet@...> wrote:
> On Monday 13 December 2010 14:05:28 Dominik Szczerba wrote:
> > On Mon, Dec 13, 2010 at 1:48 PM, Francesc Alted <faltet@...>
> > > As we know, HDF5 is agnostic about how the data in a file is
> > > ordered. So, if you created the dataset using a Fortran program,
> > > the data is clearly ordered column-wise on disk. But since you are
> > > reading the file with a C-based app, columns and rows will appear
> > > to be *transposed*.
> > >
> > > So, if what you want is to read column i *of your original Fortran
> > > array*, then the correct way to do this in PyTables should be:
> > >
> > > for i in range(NCELL):
> > >     col = tetrahedrons[i,:]
> > This does not work. It only works as I wrote previously. Please see
> > below:
> > In : tets = array(fid.getNode("/tetrahedrons").read())
> > In : tets.shape
> > Out: (4, 4624802)
> > In : tets[:,0]
> > Out: array([715692, 707733, 707734, 159966], dtype=int32)
> > In : tets[0,:]
> > Out: array([715692, 365237, 555693, ..., 706208, 706208, 511217],
> > dtype=int32)
> > so the index i in tetrahedrons[i,:] runs over 0..3 and not 0..NC-1
> > Did you make a typo above, or have we not reached a conclusion yet?
> That was not a typo, but a mistake on my part (I forgot that HDF5
> reverses the shape of the matrices when using the Fortran binding). So,
> yes, your version is okay for accessing columns.
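For reference, the shape reversal can be reproduced with NumPy alone. This is only a minimal sketch: the tiny `ncell` and the assumption that the Fortran side holds an (ncell, 4) array are illustrative, not taken from the actual files.

```python
import numpy as np

ncell = 5  # hypothetical small cell count, for illustration only

# Assumed Fortran-side array: shape (ncell, 4), stored column-major.
fortran_view = np.asfortranarray(
    np.arange(ncell * 4, dtype=np.int32).reshape(ncell, 4)
)

# Elements in the order they sit in memory for a column-major array.
raw = fortran_view.ravel(order="F")

# A C-based reader sees the same buffer row-major, with the shape reversed.
c_view = raw.reshape(fortran_view.shape[::-1])

# The four node indices of cell 0 are found with [:, 0], not [0, :].
cell0 = c_view[:, 0]
```

Here c_view has shape (4, ncell) and equals the transpose of fortran_view, which matches the tets[:,i] access pattern shown above.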
> But to know whether accessing columns this way is efficient in your
> case, I'd need more info on your datasets. Are they contiguous or
> chunked? If chunked, which chunkshape have you chosen?
Both. Files saved from MATLAB are uncompressed/contiguous; the ones saved
from my program are usually compressed/chunked and the size is around
Many thanks and regards,