From: Alvaro T. C. <al...@mi...> - 2012-12-05 18:56:00
I benchmarked my system for reads and writes with Blosc [1]:
import tables as pt

# Timings taken in an IPython session (%timeit is an IPython magic);
# paths.braw(block) is my helper returning the path to the HDF5 file.
with pt.openFile(paths.braw(block), 'r') as handle:
    pt.setBloscMaxThreads(1)
    %timeit a = handle.root.raw.c042[:]   # read with 1 Blosc thread
    pt.setBloscMaxThreads(6)
    %timeit a = handle.root.raw.c042[:]   # read with 6 Blosc threads
    pt.setBloscMaxThreads(11)
    %timeit a = handle.root.raw.c042[:]   # read with 11 Blosc threads
    print handle.root.raw._v_attrs.FILTERS
    print handle.root.raw.c042.__sizeof__()
    print handle.root.raw.c042
gives
1 loops, best of 3: 483 ms per loop
1 loops, best of 3: 782 ms per loop
1 loops, best of 3: 663 ms per loop
Filters(complevel=5, complib='blosc', shuffle=True, fletcher32=False)
104
/raw/c042 (CArray(303390000,), shuffle, blosc(5)) ''
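Side note: I realize __sizeof__() only reports the size of the Python node
object, hence the 104, so it says nothing about the data volume. A quick
sanity check of the actual sizes, sketched with only calls I'm sure of (the
file holds other channels too, so its size only bounds the compressed size
from above):

import os
import tables as pt

with pt.openFile(paths.braw(block), 'r') as handle:
    node = handle.root.raw.c042
    # Logical (uncompressed) size: elements * itemsize (2 bytes per int16).
    print 'logical: %.1f MB' % (node.shape[0] * node.atom.itemsize / 1e6)
# Whole-file size on disk, an upper bound for this array's compressed size.
print 'on disk: <= %.1f MB' % (os.path.getsize(paths.braw(block)) / 1e6)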
For the life of me, I can't understand what is going on. These datasets use
int16 atoms and, at Blosc complevel=5, used to compress by a factor of about
2. Even at such a low compression ratio there should be a huge difference
between single- and multi-threaded reads.
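Back-of-the-envelope, from the numbers above: 303390000 int16 values are
about 607 MB uncompressed, so the reads work out to roughly

    607 MB / 0.483 s ~= 1.26 GB/s  (1 thread)
    607 MB / 0.782 s ~= 0.78 GB/s  (6 threads)
    607 MB / 0.663 s ~= 0.92 GB/s  (11 threads)

i.e. adding threads makes reads slower, the opposite of the synthetic
benchmarks.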
Do you have any clue?
-á.
[1] http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks (first two plots)