From: Mathieu D. <dub...@ya...> - 2013-07-11 20:43:48
On 11/07/2013 21:56, Anthony Scopatz wrote:
>
>
>
> On Thu, Jul 11, 2013 at 2:49 PM, Mathieu Dubois <dub...@ya...> wrote:
>
> Hello,
>
> I wanted to use PyTables in conjunction with multiprocessing for some
> embarrassingly parallel tasks.
>
> However, it seems that this is not possible. In the following (very
> simple) example, X is a CArray of shape (100, 10) stored in the file
> test.hdf5:
>
> import random
> import multiprocessing
>
> import tables
>
> n_features = 10  # number of columns in X
>
> # Reload the data
> h5file = tables.openFile('test.hdf5', mode='r')
> X = h5file.root.X
>
> # Use multiprocessing to perform a simple computation (column average)
> def f(X):
>     name = multiprocessing.current_process().name
>     # randrange excludes n_features, so the index stays in bounds
>     column = random.randrange(n_features)
>     print '%s use column %i' % (name, column)
>     return X[:, column].mean()
>
> p = multiprocessing.Pool(2)
> # This call raises the PicklingError: the task holding X cannot be pickled
> col_mean = p.map(f, [X, X, X])
>
> When executing it, I get the following error:
>
> Exception in thread Thread-2:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
>     self.run()
>   File "/usr/lib/python2.7/threading.py", line 504, in run
>     self.__target(*self.__args, **self.__kwargs)
>   File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
>     put(task)
> PicklingError: Can't pickle <type 'weakref'>: attribute lookup __builtin__.weakref failed
>
>
> I have googled for weakref and pickle but can't find a solution.
>
> Any help?
>
>
> Hello Mathieu,
>
> I have used multiprocessing and files opened in read mode many times
> so I am not sure what is going on here.
Thanks for your answer. Maybe you can point me to a working example?
> Could you provide the test.hdf5 file so that we could try to reproduce
> this.
Here is the script that I have used to generate the data:
import tables
import numpy
# Create data & store it
n_features = 10
n_obs = 100
X = numpy.random.rand(n_obs, n_features)
h5file = tables.openFile('test.hdf5', mode='w')
Xatom = tables.Atom.from_dtype(X.dtype)
Xhdf5 = h5file.createCArray(h5file.root, 'X', Xatom, X.shape)
Xhdf5[:] = X
h5file.close()
I hope it's not a stupid mistake. I am using PyTables 2.3.1 on Ubuntu
12.04 (libhdf5 is 1.8.4patch1).
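
For what it is worth, the same kind of failure can be reproduced without PyTables. `Pool.map` pickles every argument before handing it to a worker, and the traceback suggests that pickling X reaches a weak reference somewhere. The following is only a minimal sketch of that mechanism, not a confirmed diagnosis of PyTables internals: `Owner`, `Node` and `is_picklable` are hypothetical names made up for the illustration.

```python
import pickle
import weakref


class Owner(object):
    """Hypothetical stand-in for the open file that owns a node."""


class Node(object):
    """Hypothetical stand-in for an object that keeps a weak reference."""

    def __init__(self, owner):
        # Only a weak reference is stored, so the node does not keep
        # the owner alive on its own.
        self._owner = weakref.ref(owner)


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError):
        return False
    return True


owner = Owner()
node = Node(owner)

# Pool.map pickles each argument before sending it to a worker process.
node_ok = is_picklable(node)               # blocked by the weakref attribute
task_ok = is_picklable(('test.hdf5', 3))   # plain values pickle fine

print('node picklable: %s' % node_ok)
print('task picklable: %s' % task_ok)
```

If this is indeed what happens with X, passing plain values such as a file name and a column index to the workers, instead of the node itself, would avoid the pickling step that fails.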
> By the way, I have noticed that slicing a CArray returns a NumPy array
> (I created the HDF5 file with NumPy). Therefore, everything is copied
> to memory. Is there a way to avoid that?
>
>
> Only the slice that you ask for is brought into memory, and it is
> returned as a non-view NumPy array.
OK. I will have to be careful about that.
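
Since only the requested slice is read, each worker could open the file itself and pull just the column it needs. Below is a minimal sketch of that pattern, not something confirmed in this thread. It assumes the PyTables 3.x spellings `open_file` and `create_carray` (2.3.1 spells them `openFile` and `createCArray`), and `column_mean` and `write_test_file` are hypothetical helper names.

```python
import multiprocessing
import os
import tempfile

import numpy
import tables


def column_mean(task):
    """Open the file inside the worker and average a single column."""
    path, column = task
    # Each worker opens its own read-only handle; only plain values
    # (a string and an int) cross the process boundary.
    with tables.open_file(path, mode='r') as h5file:
        # Slicing reads just this column from disk into a NumPy array.
        return float(h5file.root.X[:, column].mean())


def write_test_file(path, n_obs=100, n_features=10):
    """Store a random (n_obs, n_features) CArray named X and return the data."""
    data = numpy.random.rand(n_obs, n_features)
    with tables.open_file(path, mode='w') as h5file:
        atom = tables.Atom.from_dtype(data.dtype)
        carray = h5file.create_carray(h5file.root, 'X', atom, data.shape)
        carray[:] = data
    return data


if __name__ == '__main__':
    path = os.path.join(tempfile.mkdtemp(), 'test.hdf5')
    write_test_file(path)
    # Tasks are (file name, column index) pairs, which pickle without trouble.
    tasks = [(path, column) for column in (0, 3, 7)]
    pool = multiprocessing.Pool(2)
    try:
        col_means = pool.map(column_mean, tasks)
    finally:
        pool.close()
        pool.join()
    print(col_means)
```

The design choice here is that the HDF5 handle never leaves the process that opened it, so nothing from PyTables has to be pickled, and each worker holds at most one column in memory at a time.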
>
> Be Well
> Anthony
>
>
> Mathieu
>
> _______________________________________________
> Pytables-users mailing list
> Pyt...@li...
> <mailto:Pyt...@li...>
> https://lists.sourceforge.net/lists/listinfo/pytables-users