From: Alvaro T. C. <al...@mi...> - 2012-12-05 18:56:00
My system was benched for reads and writes with Blosc [1]:

    with pt.openFile(paths.braw(block), 'r') as handle:
        pt.setBloscMaxThreads(1)
        %timeit a = handle.root.raw.c042[:]
        pt.setBloscMaxThreads(6)
        %timeit a = handle.root.raw.c042[:]
        pt.setBloscMaxThreads(11)
        %timeit a = handle.root.raw.c042[:]
        print handle.root.raw._v_attrs.FILTERS
        print handle.root.raw.c042.__sizeof__()
        print handle.root.raw.c042

gives

    1 loops, best of 3: 483 ms per loop
    1 loops, best of 3: 782 ms per loop
    1 loops, best of 3: 663 ms per loop
    Filters(complevel=5, complib='blosc', shuffle=True, fletcher32=False)
    104
    /raw/c042 (CArray(303390000,), shuffle, blosc(5)) ''

I can't understand what is going on, for the life of me. These datasets use int16 atoms and at Blosc complevel=5 used to compress by a factor of about 2. Even for such low compression ratios there should be huge differences between single- and multi-threaded reads. Do you have any clue?

-á.

[1] http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks (first two plots)
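
P.S. Note that the 104 printed above is just the size of the Python node object (that is what CArray.__sizeof__() reports), not the dataset itself. Something along these lines should give the real on-disk compression ratio; the size_on_disk attribute is an assumption on my part (recent PyTables versions expose it on Leaf nodes), and 'block.h5' stands in for paths.braw(block):

    import tables as pt

    with pt.openFile('block.h5', 'r') as handle:
        node = handle.root.raw.c042
        # logical (uncompressed) size: element count times the atom's itemsize
        logical = node.nrows * node.atom.itemsize
        # size_on_disk is assumed to exist on Leaf nodes in recent PyTables
        physical = node.size_on_disk
        print 'logical: %.1f MB' % (logical / 2.0**20)
        print 'on disk: %.1f MB' % (physical / 2.0**20)
        print 'ratio:   %.2f' % (logical / float(physical))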
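To check whether Blosc threading scales at all on this machine, independently of HDF5 and chunk layout, a quick standalone sketch with the python-blosc package (assuming it is installed; compress, decompress and set_nthreads are its documented calls, and the array here is just made-up mildly compressible int16 data):

    import time
    import numpy as np
    import blosc  # python-blosc; assumed to be installed

    # ~60 MB of mildly compressible int16 data, similar in spirit to the arrays above
    a = (np.arange(30000000) % 2**12).astype('int16')
    payload = a.tostring()

    packed = blosc.compress(payload, typesize=2, clevel=5, shuffle=True)
    print 'ratio: %.2f' % (float(len(payload)) / len(packed))

    for n in (1, 6, 11):
        blosc.set_nthreads(n)
        t0 = time.time()
        blosc.decompress(packed)
        print '%2d threads: %.3f s' % (n, time.time() - t0)

If decompression times do not drop with more threads even here, the problem is below PyTables; if they do, the bottleneck is somewhere in the HDF5 read path.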