From: Francesc A. <fa...@gm...> - 2012-11-02 20:41:26
On 11/2/12 4:22 PM, Ben Elliston wrote:
> My reading of the PyTables FAQ is that concurrent read access should
> be safe with PyTables.  However, when using a pool of worker processes
> to read different parts of a large blosc-compressed CArray, I see:
>
> HDF5-DIAG: Error detected in HDF5 (1.8.8) thread 140476163647232:
>   #000: ../../../src/H5Dio.c line 174 in H5Dread(): can't read data
>     major: Dataset
>     minor: Read failed
>   #001: ../../../src/H5Dio.c line 448 in H5D_read(): can't read data
>     major: Dataset
>     minor: Read failed
> etc.

Hmm, now that I think about it, Blosc is not thread-safe, and that can cause this sort of problem when it is used from several threads (although it should be safe when using several *processes*).  In case your worker processes are actually threads, it might help to deactivate threading in Blosc by setting the MAX_BLOSC_THREADS parameter:

http://pytables.github.com/usersguide/parameter_files.html?#tables.parameters.MAX_BLOSC_THREADS

to 1.

HTH,

-- Francesc Alted
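For reference, a minimal sketch of how one might set that parameter, assuming PyTables is installed; the file name "data.h5" is just a placeholder, not from the original thread:

```python
import tables

# Limit Blosc to a single (internal) thread so the decompressor never
# spawns its own threads; set this BEFORE calling tables.open_file().
tables.parameters.MAX_BLOSC_THREADS = 1

# The parameter can also be overridden per-file, since open_file()
# accepts PyTables parameter names as keyword arguments, e.g.:
# h5file = tables.open_file("data.h5", mode="r", MAX_BLOSC_THREADS=1)
```

Note that this only matters if the workers are threads; separate worker processes each get their own copy of the Blosc state anyway.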