From: Francesc A. <fa...@py...> - 2012-07-18 14:39:53
On 7/18/12 4:11 PM, Ümit Seren wrote:
> Actually I had 30.000 groups in a parent group.
> Each of the 30.000 groups had maybe 3 datasets.
> So to be honest I never had 30.000 datasets in a single group.
> I guess you will probably have to disable the LRU cache in that case right?

Okay. So I'd say that having 30.000 entries in a single group (no matter whether they are groups or datasets) is bad practice for performance in general, but maybe there is a difference between groups and datasets (i.e. it affects datasets more than groups)? Just curious: did PyTables not complain when you created 30.000 groups in the same group?

Regarding the LRU cache, no, I don't think that is the problem, but rather how HDF5 implements the 'inodes' (or whatever they call them). This is a big issue in general (inodes in filesystems have similar problems too), and it is what hurts performance in this case.

-- 
Francesc Alted
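
As a rough illustration of the point about wide groups, one common way to avoid tens of thousands of entries under one parent is to spread them over intermediate sub-groups. The sketch below is plain Python and only computes node paths; it does not call PyTables or HDF5. The names `bucketed_path` and `group_widths`, the `/data` root, and the bucket size of 1000 are all hypothetical choices made for the example, not anything taken from the thread.

```python
from collections import Counter


def bucketed_path(index, bucket_size=1000, root="/data"):
    """Return a path that places item `index` in a sub-group ("bucket")
    holding at most `bucket_size` entries (hypothetical naming scheme)."""
    if index < 0:
        raise ValueError("index must be non-negative")
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    bucket = index // bucket_size
    return f"{root}/bucket_{bucket:04d}/item_{index:06d}"


def group_widths(n_items, bucket_size=1000):
    """Return (number of buckets, largest bucket size) for n_items."""
    counts = Counter(i // bucket_size for i in range(n_items))
    return len(counts), max(counts.values())


if __name__ == "__main__":
    print(bucketed_path(29999))
    print(group_widths(30000))
```

With these assumed settings, 30.000 items end up in 30 sub-groups of 1000 entries each, so no single group holds more than 1000 children. Whether this actually helps in a given file would have to be measured.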