From: Ümit S. <uem...@gm...> - 2012-07-18 14:48:08

Actually it did complain that it was over a certain limit, and it also suggested a flag with which I can turn off the warning. But performance seemed fine: if I randomly accessed any of the 30,000 groups, I got the group handle in a fraction of a second.

On 18.07.2012 16:40, "Francesc Alted" <fa...@py...> wrote:
> On 7/18/12 4:11 PM, Ümit Seren wrote:
> > Actually I had 30,000 groups in a parent group.
> > Each of the 30,000 groups had maybe 3 datasets.
> > So to be honest I never had 30,000 datasets in a single group.
> > I guess you will probably have to disable the LRU cache in that case,
> > right?
>
> Okay. So I'd say that having 30,000 entries (no matter whether they are
> groups or datasets) would be bad performance practice in general, but
> maybe there is a difference between groups and datasets (i.e. it affects
> datasets more than groups)? Just curious: did PyTables not complain when
> you created 30,000 groups in the same group?
>
> Regarding the LRU cache, no, I don't think that is the problem, but
> rather how HDF5 implements its 'inodes' (or whatever they call them).
> This is a big issue in general (inodes in filesystems have similar
> problems too), and it is what hurts performance in this case.
>
> --
> Francesc Alted
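
For reference, here is a minimal sketch of the scenario discussed above. The file name, group naming scheme, and array contents are invented for illustration, and the calls use the PyTables 3.x spellings (2.x used openFile/createGroup/createArray); treat it as a sketch, not a benchmark.

import random
import time
import warnings

import numpy as np
import tables

# PyTables emits a PerformanceWarning once a group holds more children
# than tables.parameters.MAX_GROUP_WIDTH; filtering the warning category
# is one way to silence the complaint mentioned above.
warnings.filterwarnings("ignore", category=tables.PerformanceWarning)

# Build 30,000 sibling groups, each with ~3 tiny datasets, as in the
# thread (this takes a while to run).
with tables.open_file("many_groups.h5", mode="w") as h5:
    parent = h5.create_group("/", "parent")
    for i in range(30000):
        g = h5.create_group(parent, "g%05d" % i)
        for name in ("a", "b", "c"):
            h5.create_array(g, name, np.arange(10))

# Time a random group lookup. Regarding the LRU cache question: the node
# cache size is governed by the NODE_CACHE_SLOTS parameter (see
# tables/parameters.py), and parameters can be overridden per file, e.g.
# tables.open_file("many_groups.h5", "r", NODE_CACHE_SLOTS=0) to disable it.
with tables.open_file("many_groups.h5", mode="r") as h5:
    i = random.randrange(30000)
    t0 = time.time()
    node = h5.get_node("/parent/g%05d" % i)
    print("lookup of %s took %.4f s" % (node._v_pathname, time.time() - t0))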