From: Francesc A. <fa...@gm...> - 2012-11-02 20:41:26
On 11/2/12 4:22 PM, Ben Elliston wrote:
> My reading of the PyTables FAQ is that concurrent read access should
> be safe with PyTables.  However, when using a pool of worker processes
> to read different parts of a large blosc-compressed CArray, I see:
>
> HDF5-DIAG: Error detected in HDF5 (1.8.8) thread 140476163647232:
>   #000: ../../../src/H5Dio.c line 174 in H5Dread(): can't read data
>     major: Dataset
>     minor: Read failed
>   #001: ../../../src/H5Dio.c line 448 in H5D_read(): can't read data
>     major: Dataset
>     minor: Read failed
> etc.

Hmm, now that I think about it, Blosc is not thread-safe, and that can cause this sort of problem when it is used from several threads (although it should be safe when using several *processes*).  In case your worker processes are actually threads, it might help to deactivate threading in Blosc by setting the MAX_BLOSC_THREADS parameter:

http://pytables.github.com/usersguide/parameter_files.html?#tables.parameters.MAX_BLOSC_THREADS

to 1.

HTH,

-- Francesc Alted
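For reference, a minimal sketch of how one might set that parameter, assuming PyTables is installed; the file name "data.h5" is just a placeholder, not from the original thread:

```python
import tables

# Limit Blosc to a single (internal) thread so the decompressor never
# spawns its own threads; set this BEFORE calling tables.open_file().
tables.parameters.MAX_BLOSC_THREADS = 1

# The parameter can also be overridden per-file, since open_file()
# accepts PyTables parameter names as keyword arguments, e.g.:
# h5file = tables.open_file("data.h5", mode="r", MAX_BLOSC_THREADS=1)
```

Note that this only matters if the workers are threads; separate worker processes each get their own copy of the Blosc state anyway.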