From: Francesc A. <fa...@ca...> - 2005-09-29 16:30:42
On Thursday 29 September 2005 16:44, Stefan Kuzminski wrote:
> Is there a way to specify the order of the column? Or do I have to
> just ensure that any data I append to the table is in that order?

There is. Use the 'pos' qualifier in Col declarations. Look at:

http://pytables.sourceforge.net/html-doc/usersguide4.html#subsection4.16.1

and, to set the order in general nested types, look at:

http://pytables.sourceforge.net/html-doc/usersguide3.html#section3.6

> The selections are great, we are building tools like merges and sorts,
> frequency counting and statistics. The merge and sort things tend to
> be rowwise operations, ( patterning on SAS to deal with very large data
> sets ) but I am beginning to realize that column operations are
> faster. I guess I had naively assumed that rows were contiguous in
> memory ( and on disk ).

Well, in fact they are (both in memory and on disk). What differs is
how numarray, and RecArray objects in particular, manage this
information efficiently. A RecArray is, roughly speaking, a container
for heterogeneous data types made of a set of NumArray objects that
hold the (homogeneous) data in columns. These NumArray objects have
gaps between the elements of each column, the distance from one element
to the next being the size of the record. So column-wise operations
tend to be very fast as well.

If you are getting the impression that row-wise operations are not
efficient, it is probably because your code is converting the data to
Python objects (most probably lists or tuples) behind the scenes, and
so losing much of its potential speed.

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
 V V    Cárabos Coop. V.   Enjoy Data
 "-"
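
The record layout described above can be sketched with the Python standard library alone. This is not how numarray implements RecArray. It is only a minimal illustration, assuming a hypothetical two-field record (an int32 followed by a float64, packed without padding), of how a column can be read out of row-contiguous storage by stepping through the buffer one record size at a time. The names RECORD and read_column are made up for this example.

```python
import struct

# Hypothetical record layout: one int32 followed by one float64,
# packed without padding ("<" = little-endian, standard sizes).
RECORD = struct.Struct("<id")

rows = [(1, 10.5), (2, 20.5), (3, 30.5)]

# Rows are stored back to back, so each record is contiguous in the buffer.
buf = b"".join(RECORD.pack(*row) for row in rows)


def read_column(buffer, field_format, field_offset, record_size):
    """Read one field from every record, stepping record_size bytes each time."""
    field = struct.Struct(field_format)
    count = len(buffer) // record_size
    return [
        field.unpack_from(buffer, i * record_size + field_offset)[0]
        for i in range(count)
    ]


# The int32 field starts at byte 0 of each record, the float64 at byte 4.
ids = read_column(buf, "<i", 0, RECORD.size)
values = read_column(buf, "<d", 4, RECORD.size)
```

Here the rows stay contiguous, and each column is reached with a fixed step equal to the record size. A strided array view over the same buffer would do this without building Python lists, which is where the speed comes from.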