From: Dominik S. <do...@it...> - 2010-12-13 20:14:29
Thanks a lot for your insight.

Regards,
Dominik

On Mon, Dec 13, 2010 at 8:09 PM, Francesc Alted <fa...@py...> wrote:
> On Monday 13 December 2010 15:08:03 Francesc Alted wrote:
> > On Monday 13 December 2010 14:56:26 Dominik Szczerba wrote:
> > > > But to know whether accessing columns is efficient for your
> > > > case, I'd need more info on your datasets. Are they contiguous
> > > > or chunked? If chunked, which chunkshape have you chosen?
> > >
> > > Both. Files saved from MATLAB are uncompressed/contiguous, the ones
> > > saved from my program are usually compressed/chunked and the size
> > > is around 1024^2/sizeof(type).
> >
> > Well, for PyTables (or any C application) and contiguous datasets,
> > accessing data by columns is inefficient: the privileged direction
> > for performance is rows.
>
> I was curious to see the difference in performance. Here are some
> timings:
>
> >>> nptetra = np.empty((4, 4622544))
> >>> f = tb.openFile("/tmp/t.h5", "w")
> >>> tetra = f.createArray(f.root, "tetra", nptetra)
> >>> %time [ tetra[:,i] for i in range(4622544) ]
> CPU times: user 201.61 s, sys: 162.59 s, total: 364.20 s
> Wall time: 367.91 s
>
> Using the transposed version (i.e. accessing by rows):
>
> >>> tetra2 = f.createArray(f.root, "tetra2", nptetra.transpose())
> >>> %time [ tetra2[i] for i in range(4622544) ]
> CPU times: user 163.78 s, sys: 0.48 s, total: 164.25 s
> Wall time: 165.44 s  # more than 2x faster
>
> But using the iterator is the fastest mode (the I/O is buffered):
>
> >>> %time [ row for row in tetra2 ]
> CPU times: user 26.21 s, sys: 0.38 s, total: 26.59 s
> Wall time: 26.81 s
>
> I'd say that for chunked datasets you can expect something similar.
>
> --
> Francesc Alted
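
For anyone wondering why rows are the privileged direction for a contiguous dataset, here is a minimal sketch of the idea. It does not use PyTables or HDF5 at all: it assumes a plain binary file holding float64 values in row-major (C) order, and the helper names (offset, read_row, read_column) are made up for illustration. A row is one seek plus one contiguous read, while a column needs one seek and one small read per element.

```python
import struct
import tempfile

ITEM = struct.calcsize("d")  # bytes per float64 value


def offset(i, j, ncols, itemsize=ITEM):
    """Byte offset of element (i, j) in a contiguous row-major 2-D array."""
    return (i * ncols + j) * itemsize


def read_row(fh, i, ncols):
    """Read row i: a single seek followed by a single contiguous read."""
    fh.seek(offset(i, 0, ncols))
    return struct.unpack(f"{ncols}d", fh.read(ncols * ITEM))


def read_column(fh, j, nrows, ncols):
    """Read column j: one seek and one tiny read per element."""
    values = []
    for i in range(nrows):
        fh.seek(offset(i, j, ncols))
        values.append(struct.unpack("d", fh.read(ITEM))[0])
    return tuple(values)


# Tiny stand-in for the (4, 4622544) array in the timings above.
nrows, ncols = 4, 6
data = [[float(i * ncols + j) for j in range(ncols)] for i in range(nrows)]

with tempfile.TemporaryFile() as fh:
    # Store the rows back to back, which is what "contiguous" means here.
    for row in data:
        fh.write(struct.pack(f"{ncols}d", *row))
    row1 = read_row(fh, 1, ncols)
    col2 = read_column(fh, 2, nrows, ncols)

print(row1)
print(col2)
```

In this layout, neighbouring elements of a row sit ITEM bytes apart, but neighbouring elements of a column sit ncols * ITEM bytes apart, so the wider the array, the further each column read has to jump. Real HDF5 I/O involves more than this (caching, chunking, library overhead), so treat it only as an illustration of the access pattern, not as an explanation of the exact numbers above.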