From: Andre' Walker-L. <wal...@gm...> - 2012-08-15 23:52:10
Hi All,

Just a strategy question. I have many HDF5 files containing data for different measurements of the same quantities. My directory tree looks like

    top description  [ group ]
        sub description  [ group ]
            avg  [ group ]
                re  [ numpy array, shape = (96,1,2) ]
                im  [ numpy array, shape = (96,1,2) ]  - only exists for a known subset of data files

I have ~400 of these files. What I want to do is create a single file which collects all of these files with exactly the same directory structure, except at the very bottom:

    re  [ numpy array, shape = (400,96,1,2) ]

The simplest thing I came up with is to loop over the two levels of descriptive group structure and build the numpy array for the final set that way. Basic loop structure:

    import numpy as np
    import tables

    final_file = tables.openFile('all_data.h5', 'a')
    for d1 in top_description:
        final_file.createGroup(final_file.root, d1)
        for d2 in sub_description:
            final_file.createGroup('/' + d1, d2)
            data_re = np.zeros([400, 96, 1, 2])
            # open every source file and read this group's 're' array
            for i, fname in enumerate(hdf5_files):
                tmp = tables.openFile(fname)
                data_re[i] = tmp.getNode('/' + d1 + '/' + d2 + '/avg/re').read()
                tmp.close()
            final_file.createArray('/' + d1 + '/' + d2, 're', data_re)

But this involves opening and closing each of the 400 individual HDF5 files many times, once per (d1, d2) combination. There must be a smarter algorithmic way to do this, or perhaps built-in PyTables tools. Any advice is appreciated.

Andre
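[Editor's note: one natural inversion is to preallocate all of the output arrays first, then open each source file exactly once and fill its slice across every group in a single pass. The sketch below assumes every file contains the full d1/d2 group structure and reuses the names top_description, sub_description, and hdf5_files from the post; it is a minimal illustration, not a tested implementation.]

    import numpy as np
    import tables

    n_files = len(hdf5_files)
    final_file = tables.openFile('all_data.h5', 'a')

    # Preallocate one output array per (d1, d2) pair, keyed by group names.
    data = {}
    for d1 in top_description:
        final_file.createGroup(final_file.root, d1)
        for d2 in sub_description:
            final_file.createGroup('/' + d1, d2)
            data[(d1, d2)] = np.zeros([n_files, 96, 1, 2])

    # Open each source file exactly once, filling its slice in every array.
    for i, fname in enumerate(hdf5_files):
        tmp = tables.openFile(fname)
        for (d1, d2), arr in data.items():
            arr[i] = tmp.getNode('/' + d1 + '/' + d2 + '/avg/re').read()
        tmp.close()

    # Write the assembled arrays into the combined file.
    for (d1, d2), arr in data.items():
        final_file.createArray('/' + d1 + '/' + d2, 're', arr)
    final_file.close()

This keeps all of the combined arrays in memory at once, which is modest here (each is 400*96*1*2 values). If memory were a concern, an alternative would be to create extendable arrays with createEArray (first dimension extendable) and append() each file's data as it is read, still opening each file only once.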