Re: [Pytables-users] PyTables and Multiprocessing

On 11/07/2013 21:56, Anthony Scopatz wrote:
>
>
>
> On Thu, Jul 11, 2013 at 2:49 PM, Mathieu Dubois 
> <dub...@ya... <mailto:dub...@ya...>> wrote:
>
>     Hello,
>
>     I wanted to use PyTables in conjunction with multiprocessing for some
>     embarrassingly parallel tasks.
>
>     However, it seems that this is not possible. In the following (very
>     stupid) example, X is a CArray of shape (100, 10) stored in the file
>     test.hdf5:
>
>     import random
>     import multiprocessing
>
>     import tables
>
>     # Reload the data
>     h5file = tables.openFile('test.hdf5', mode='r')
>     X = h5file.root.X
>     n_features = X.shape[1]
>
>     # Use multiprocessing to perform a simple computation (column average)
>     def f(X):
>         name = multiprocessing.current_process().name
>         # randint includes both endpoints, so stop at the last valid column
>         column = random.randint(0, n_features - 1)
>         print '%s use column %i' % (name, column)
>         return X[:, column].mean()
>
>     p = multiprocessing.Pool(2)
>     # X is pickled here in order to be sent to the worker processes
>     col_mean = p.map(f, [X, X, X])
>
>     When executing it, I get the following error:
>
>     Exception in thread Thread-2:
>     Traceback (most recent call last):
>       File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
>         self.run()
>       File "/usr/lib/python2.7/threading.py", line 504, in run
>         self.__target(*self.__args, **self.__kwargs)
>       File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
>         put(task)
>     PicklingError: Can't pickle <type 'weakref'>: attribute lookup __builtin__.weakref failed
>
>
>     I have googled for weakref and pickle but can't find a solution.
>
>     Any help?
>
>
> Hello Mathieu,
>
> I have used multiprocessing and files opened in read mode many times 
> so I am not sure what is going on here.
Thanks for your answer. Maybe you can point me to a working example?

> Could you provide the test.hdf5 file so that we could try to reproduce
> this?
Here is the script that I have used to generate the data:

import tables
import numpy

# Create data & store it
n_features = 10
n_obs      = 100
X = numpy.random.rand(n_obs, n_features)

h5file = tables.openFile('test.hdf5', mode='w')
Xatom = tables.Atom.from_dtype(X.dtype)
Xhdf5 = h5file.createCArray(h5file.root, 'X', Xatom, X.shape)
Xhdf5[:] = X
h5file.close()

I hope it's not a stupid mistake. I am using PyTables 2.3.1 on Ubuntu 
12.04 (libhdf5 is 1.8.4patch1).
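The traceback above comes from the step where the pool pickles each task argument before handing it to a worker. The following is only a sketch of that mechanism using the standard library: FakeFile, FakeNode and can_pickle are hypothetical names, and the assumption that the real node fails because it holds a weak reference is inferred from the error message, not taken from the PyTables source. It also shows that a plain (file name, column index) pair pickles fine, which is why a commonly suggested approach is to send such a pair and open the file inside each worker.

```python
import pickle
import weakref


class FakeFile(object):
    """Hypothetical stand-in for an open HDF5 file handle."""


class FakeNode(object):
    """Hypothetical stand-in for a node that keeps a weak reference to its file."""

    def __init__(self, h5file):
        self._file_ref = weakref.ref(h5file)


def can_pickle(obj):
    """Return True if obj can be pickled, as a pool must do with every task argument."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


h5file = FakeFile()
node = FakeNode(h5file)

# The node-like object carries a weakref, so it cannot be sent to a worker.
node_ok = can_pickle(node)

# A (file name, column index) task is plain data and can be sent as-is.
task_ok = can_pickle(('test.hdf5', 3))

print('node picklable: %s' % node_ok)
print('task picklable: %s' % task_ok)
```

With tasks of this second kind, each worker would open test.hdf5 in read mode itself, read its column, and close the file, so no file or node object ever crosses a process boundary.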

>     By the way, I have noticed that by slicing a CArray, I get a numpy
>     array (I created the HDF5 file with numpy). Therefore, everything is
>     copied to memory. Is there a way to avoid that?
>
>
> Only the slice that you ask for is brought into memory, and it is
> returned as a non-view numpy array.
OK. I will have to be careful about that.

>
> Be Well
> Anthony
>
>
>     Mathieu
>
>     _______________________________________________
>     Pytables-users mailing list
>     Pyt...@li...
>     <mailto:Pyt...@li...>
>     https://lists.sourceforge.net/lists/listinfo/pytables-users
>