From: Mathieu D. <dub...@ya...> - 2013-07-11 20:43:48
On 11/07/2013 21:56, Anthony Scopatz wrote:
> On Thu, Jul 11, 2013 at 2:49 PM, Mathieu Dubois
> <dub...@ya... <mailto:dub...@ya...>> wrote:
>
>     Hello,
>
>     I wanted to use PyTables in conjunction with multiprocessing for some
>     embarrassingly parallel tasks.
>
>     However, it seems that it is not possible. In the following (very
>     stupid) example, X is a CArray of size (100, 10) stored in the file
>     test.hdf5:
>
>         import tables
>         import multiprocessing
>
>         # Reload the data
>         h5file = tables.openFile('test.hdf5', mode='r')
>         X = h5file.root.X
>
>         # Use multiprocessing to perform a simple computation (column average)
>         def f(X):
>             name = multiprocessing.current_process().name
>             column = random.randint(0, n_features)
>             print '%s use column %i' % (name, column)
>             return X[:, column].mean()
>
>         p = multiprocessing.Pool(2)
>         col_mean = p.map(f, [X, X, X])
>
>     When executing it, I get the following error:
>
>         Exception in thread Thread-2:
>         Traceback (most recent call last):
>           File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
>             self.run()
>           File "/usr/lib/python2.7/threading.py", line 504, in run
>             self.__target(*self.__args, **self.__kwargs)
>           File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
>             put(task)
>         PicklingError: Can't pickle <type 'weakref'>: attribute lookup
>         __builtin__.weakref failed
>
>     I have googled for weakref and pickle but can't find a solution.
>
>     Any help?
>
> Hello Mathieu,
>
> I have used multiprocessing and files opened in read mode many times,
> so I am not sure what is going on here.

Thanks for your answer. Maybe you can point me to a working example?

> Could you provide the test.hdf5 file so that we could try to reproduce
> this?
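[As background on the error above: `Pool.map` pickles every argument it sends to a worker, and the traceback suggests the PyTables node carries a weakref internally, which pickle refuses to serialize. A minimal sketch of just that failure, using only the standard library so it needs no PyTables:]

```python
import pickle
import weakref


class Node(object):
    pass


obj = Node()
# PyTables nodes apparently hold a weakref like this back to the open file
ref = weakref.ref(obj)

# multiprocessing.Pool.map pickles each argument before sending it to a
# worker process, and weakref objects cannot be pickled:
try:
    pickle.dumps(ref)
    pickled = True
except (TypeError, pickle.PicklingError):  # TypeError on Python 3,
    pickled = False                        # PicklingError on Python 2

print(pickled)  # False: this is the failure behind the traceback above
```

[So the error is not specific to reading the file; it is triggered merely by handing the open node `X` to `p.map`.]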
Here is the script that I have used to generate the data:

    import tables
    import numpy

    # Create data & store it
    n_features = 10
    n_obs = 100
    X = numpy.random.rand(n_obs, n_features)

    h5file = tables.openFile('test.hdf5', mode='w')
    Xatom = tables.Atom.from_dtype(X.dtype)
    Xhdf5 = h5file.createCArray(h5file.root, 'X', Xatom, X.shape)
    Xhdf5[:] = X
    h5file.close()

I hope it's not a stupid mistake. I am using PyTables 2.3.1 on Ubuntu
12.04 (libhdf5 is 1.8.4patch1).

>     By the way, I have noticed that by slicing a CArray, I get a numpy
>     array (I created the HDF5 file with numpy). Therefore, everything is
>     copied to memory. Is there a way to avoid that?
>
> Only the slice that you ask for is brought into memory, and it is
> returned as a non-view numpy array.

OK. I will be careful about that.

> Be Well
> Anthony
>
> Mathieu

------------------------------------------------------------------------------
See everything from the browser to the database with AppDynamics
Get end-to-end visibility with application monitoring from AppDynamics
Isolate bottlenecks and diagnose root cause in seconds.
Start your free trial of AppDynamics Pro today!
http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
_______________________________________________
Pytables-users mailing list
Pyt...@li...
https://lists.sourceforge.net/lists/listinfo/pytables-users