From: Valeriy S. <sok...@gm...> - 2013-07-18 09:06:11
Thank you, Anthony, I will try VLArray as you suggested =)

On Thu, Jul 18, 2013 at 3:39 AM, Anthony Scopatz <sc...@gm...> wrote:

> Hello Valeriy,
>
> For better or worse, this is exactly the performance I would expect. The
> thing that you are running up against is that every HDF5 data set has 64 KB
> of header space for meta information. There is no way of changing this
> without invalidating the HDF5 spec. The fact that you are seeing an
> average of 70 KB per data set is consistent since data sets don't need to
> be contiguously stored.
>
> I would suggest that you use a VLArray [1] of length-1 string atoms.
> You'll lose the filenode interface but you'll also lose the 3200%
> overhead =).
>
> Be Well
> Anthony
>
> 1.
> http://pytables.github.io/usersguide/libref/homogenous_storage.html#the-vlarray-class
>
>
> On Wed, Jul 17, 2013 at 3:14 PM, Valeriy Sokolov <sok...@gm...>
> wrote:
>
>> Not sure if the quoted message was delivered to the list (maybe because I
>> was not registered on this list), so reposting it this way...
>>
>> On Fri, Jul 12, 2013 at 5:40 PM, Valeriy Sokolov <sok...@gm...> wrote:
>>
>>> Hi,
>>>
>>> I am trying to store lots of small (~2 KB) files in PyTables filenodes,
>>> and I ran into trouble with the size overhead.
>>>
>>> 200 such files, which consume ~2 MB in total on the filesystem, take 14 MB
>>> in the .h5 file produced by PyTables. My experiments show that if I create
>>> 200 file nodes and store 1 byte in each, I still get an .h5 file of 14 MB.
>>> Starting from roughly 200 KB per file node, the size grows linearly,
>>> i.e. 400 KB per node leads to 89 MB, and 800 KB per node leads to 164 MB.
>>>
>>> But I would like to store ~2 KB per node, and the current overhead (about
>>> 70 KB per file node) is pretty huge.
>>>
>>> Could you please help me with a workaround for this issue?
>>>
>>> Thank you in advance.
>>>
>>> --
>>> Best regards,
>>> Valeriy Sokolov.
>>
>> --
>> Best regards,
>> Valeriy Sokolov.
--
Best regards,
Valeriy Sokolov.
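
For readers landing on this thread later, here is a minimal sketch of the VLArray approach suggested above: instead of one filenode (and so one HDF5 data set) per small file, every file becomes one variable-length row of a single data set, so the per-data-set header cost is paid only once. The node name `small_files` and the helpers `store_blobs` and `load_blob` are made up for illustration and are not part of PyTables. The sketch assumes PyTables 3.x and NumPy are installed. It uses length-1 string atoms as suggested in the reply; for arbitrary binary payloads a `UInt8Atom` might be the more natural choice.

```python
import os
import tempfile

import numpy as np
import tables


def store_blobs(h5_path, blobs):
    """Write each bytes object in `blobs` as one row of a single VLArray."""
    with tables.open_file(h5_path, mode="w") as h5:
        vla = h5.create_vlarray(
            h5.root,
            "small_files",  # hypothetical node name
            atom=tables.StringAtom(itemsize=1),  # length-1 string atoms
            title="one row per small file",
        )
        for blob in blobs:
            # View the payload as an array of 1-byte strings; copy() makes
            # the array writable and independent of the source buffer.
            vla.append(np.frombuffer(blob, dtype="S1").copy())


def load_blob(h5_path, index):
    """Read row `index` back and return it as a bytes object."""
    with tables.open_file(h5_path, mode="r") as h5:
        row = h5.root.small_files[index]  # NumPy array of dtype S1
        return row.tobytes()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "small_files.h5")
    # 200 text payloads of a bit under 2 KB each, similar to the case above.
    payloads = [("file %03d\n" % i).encode("ascii") * 200 for i in range(200)]
    store_blobs(path, payloads)
    assert load_blob(path, 42) == payloads[42]
    print("file size on disk:", os.path.getsize(path), "bytes")
```

The trade-off is the one already mentioned in the reply: rows are addressed by integer index rather than through the file-like filenode interface, so a separate mapping from file name to row index (for example a small Table) would be needed if names matter.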