Re: [Pytables-users] Indexed & .createIndex() runtime

On Thursday 13 January 2005 09:49, kevin lester wrote:
> Sorry, I spoke too soon...
> The indexed=1 or the .createIndex() is still not being
> created. With the declaration (indexed=1), execution
> "tables" the data without acknowledging that indexing
> was not performed. With .createIndex() I'm getting a
> RuntimeError.
>
>   File "hdf5Extension.pyx", line 1411, in
> hdf5Extension.Group._g_createGroup
> RuntimeError: Can't create the group _i_raw_A1.

That's strange. Take a look at the attached code, based on your example, for
a hint on how to use indexing; it works well on my laptop.

Running this code on a table with 100,000 rows, I get the following:

$ python klester.py
Time for standard query--> 0.141028881073
Time for inkernel query--> 0.0666921138763
Time for indexed query--> 0.0650899410248

As you can see, there is not much point in indexing columns when the number
of rows is below one million (see
http://pytables.sourceforge.net/doc/SciPy04.pdf, page 24). From one million
rows upwards, indexing in PyTables 0.9 starts to be competitive (especially
if the index is already in the filesystem cache).
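
To make the trade-off concrete, here is a minimal sketch in plain Python
(standard library only). It is a generic illustration of a sorted index
versus a full scan, not how PyTables implements its indexes; the names
full_scan, build_index and indexed_lookup are hypothetical. The point is
that the index costs a one-off sort before the first query, so it only pays
off when the table is large enough or queried often enough.

```python
import bisect
import random


def full_scan(values, target):
    """Return the row numbers equal to target by checking every row."""
    return [row for row, value in enumerate(values) if value == target]


def build_index(values):
    """Build a sorted index: the sorted values plus their original row numbers."""
    order = sorted(range(len(values)), key=values.__getitem__)
    return [values[row] for row in order], order


def indexed_lookup(index, target):
    """Binary-search the sorted values, then map the hits back to row numbers."""
    sorted_values, rows = index
    lo = bisect.bisect_left(sorted_values, target)
    hi = bisect.bisect_right(sorted_values, target)
    return sorted(rows[lo:hi])


random.seed(0)
column = [random.randrange(1000) for _ in range(100_000)]

index = build_index(column)             # one-off cost, paid before any query
hits_scan = full_scan(column, 42)       # touches all 100,000 rows
hits_index = indexed_lookup(index, 42)  # two binary searches plus a slice

print(hits_scan == hits_index)
```

Both functions return the same row numbers; they differ only in how much
work is done per query and how much is done up front.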

> Also Francesc, my understanding is, that this is the
> answer to "index persistence"; when an original index
> of an array is preserved after having performed "cuts"
> and other various "operations" on the data. IOW,
> unlike natural numarray characteristics which requires
> extra effort to maintain an array's original index
> references. Is this correct? If so, I have not yet
> figured out how I access it.

Sorry, I'm afraid I don't quite understand what you mean. Could you
elaborate?

Cheers,

-- 
>OO<   Francesc Altet    ||  http://www.carabos.com/
V  V   Carabos Coop. V.  ||  Who is your data daddy? PyTables
 ""