Re: [Pytables-users] Very slow access to EArrays

On Wednesday 18 June 2008, Glenn wrote:
> Thank you for the help, what you say makes sense. In this
> application, I need to access the data both by rows and by columns in
> different loops. What is the best way to optimize this? Ideally I'd
> want to specify a different chunkshape for each loop, but I guess
> that would require storing two copies of the data (which is not
> impossible, if that's the only way).
> Basically I want to divide each row of data by the mean of that row,
> and then perform a calculation on each column.

Well, the ideal chunkshape depends on how you access the data.  As I 
said before, the default behaviour is to try to order data by row on 
disk.  If you *also* need reasonable performance when accessing it by 
column, I think that keeping two copies of the dataset is your best 
bet.  If that takes too much space on disk, you may want to compress 
the data (I've noticed that you are not passing the 'filters' 
parameter to createEArray).
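
To make the two-copy idea concrete, here is a minimal sketch, not a 
tuned implementation.  It assumes the PyTables 3.x spellings 
(open_file, create_earray, create_carray) rather than the older 
createEArray, and the function name, the node names 'by_row' and 
'by_col', the zlib level, and the chunkshapes are all made up for 
illustration.  The first copy is chunked by row and read row by row 
for the normalization; the second copy is chunked by column and holds 
the normalized values for the per-column pass.

```python
import numpy as np
import tables


def normalize_rows_then_reduce_columns(path, data, column_func=np.std):
    """Divide each row by its mean, then apply column_func to each column.

    Hypothetical helper: keeps two compressed copies of the data,
    one chunked by row and one chunked by column.
    """
    nrows, ncols = data.shape
    # Compression offsets part of the cost of storing the data twice.
    filters = tables.Filters(complevel=5, complib="zlib")
    atom = tables.Float64Atom()

    with tables.open_file(path, mode="w") as h5:
        # Copy 1: one chunk per row, so reading a full row is cheap.
        by_row = h5.create_earray(
            h5.root, "by_row", atom=atom, shape=(0, ncols),
            chunkshape=(1, ncols), filters=filters, expectedrows=nrows)
        by_row.append(data)

        # Copy 2: one chunk per column, so reading a full column is cheap.
        # (Assumption: a whole column fits comfortably in one chunk;
        # for very tall datasets a smaller chunk height would be needed.)
        by_col = h5.create_carray(
            h5.root, "by_col", atom=atom, shape=(nrows, ncols),
            chunkshape=(nrows, 1), filters=filters)

        # Row loop: read from the row-chunked copy, store normalized rows.
        for i in range(nrows):
            row = by_row[i, :]
            by_col[i, :] = row / row.mean()

        # Column loop: read from the column-chunked copy.
        return np.array([column_func(by_col[:, j]) for j in range(ncols)])
```

Note that writing single rows into the column-chunked copy touches 
every chunk, so for large datasets it would probably pay to write 
blocks of rows at a time; the sketch keeps the loop simple on purpose.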

Hope that helps,

Francesc

-- 
Francesc Alted
Freelance developer
Tel +34-964-282-249