Re: [Pytables-users] clean up Table addressing?

On Monday 31 January 2005 17:37, Francesc Altet wrote:
> > The most complete set of parameters would be something like
> >  start=None, stop=None, step=1, rows=None, columns=None
> > where 'rows' does what 'sequence' and 'coords' do up to now. 'columns'
> > might not exist for all routines (e.g. remove) that can, in principle,
> > only address whole rows.
>
> That sounds reasonable. What about making rows --> rowlist and
> columns --> columnlist?

How about rowselect and colselect? Probably a matter of taste. 'list' is fine 
with me as well.
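
To make the proposal concrete, here is a minimal sketch of how such a unified
selection might be resolved to row numbers. The name resolve_rows and the rule
that an explicit list cannot be combined with start/stop/step are my own
assumptions for illustration, not anything PyTables does today:

```python
def resolve_rows(nrows, start=None, stop=None, step=1, rowlist=None):
    """Return the row numbers addressed by a slice or by an explicit list.

    Hypothetical helper, for illustration only; not part of PyTables.
    """
    if rowlist is not None:
        # Assumption: an explicit list cannot be combined with a slice.
        if start is not None or stop is not None or step != 1:
            raise ValueError("give either rowlist or start/stop/step, not both")
        rows = list(rowlist)
        for r in rows:
            if not 0 <= r < nrows:
                raise IndexError("row %d out of range" % r)
        return rows
    # slice.indices() clips start and stop to the table length,
    # the same way ordinary Python slicing does.
    return list(range(*slice(start, stop, step).indices(nrows)))
```

Routines that can only address whole rows (e.g. remove) would use the same
resolution and simply not accept a column selection.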

> > I would suggest dropping the automatic sorting of sequences. Documenting
> > that unsorted lists kill the performance should be enough. I think it is
> > better if a user who is unaware of the issue gets bad performance than
> > wrong results.
>
> I disagree on this point. Sorting an object in-memory is a relatively
> fast operation, while retrieving an un-sorted sequence from disk can
> be a *killer*. The default should be the solution that has less impact on
> performance, and this is sorting by default. An optimization-conscious
> user can read the manual and try to disable sorting, if appropriate.

OK, I agree that retrieving out of order is a killer. Still - if I give a list 
of rows to read, I would expect the result to be ordered in the same way. 
Maybe the code could check whether the list is ordered and raise an error 
otherwise? Checking whether a list is sorted is relatively inexpensive.

(OK, sorting an already sorted list is inexpensive as well, but still, I don't 
like the idea of silently changing the order of rows in a request.)
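
The check itself is a single pass over the list. A minimal sketch, where
check_sorted is a made-up name and ValueError is just my guess at a suitable
exception:

```python
def check_sorted(rowlist):
    """Return rowlist as a list, or raise ValueError if it is not sorted.

    Hypothetical helper, for illustration only. Repeated row numbers are
    accepted; only a decrease counts as unsorted.
    """
    rows = list(rowlist)
    # Compare each element with its successor: O(n), no copy of the data
    # beyond the list itself and one shifted slice.
    for prev, cur in zip(rows, rows[1:]):
        if cur < prev:
            raise ValueError(
                "row list is not sorted: %r comes after %r" % (cur, prev))
    return rows
```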

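Alternatively, one could sort the list first and use the permutation of the 
sort to put the result back into the requested order:

 # sort the rows, remembering where each one sat in the request
 rowselect, permutation = zip(*sorted(zip(rowselect, range(len(rowselect)))))
 result = read_rows(rowselect)
 # sort by original position to undo the reordering
 dummy, result = zip(*sorted(zip(permutation, result)))
 return result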

but that's probably overkill...
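
For what it is worth, a self-contained variant of the same idea that avoids the
second sort by writing each fetched row straight back to its requested
position. All names here are made up, and fake_read is only an in-memory
stand-in for a routine that reads a sorted list of rows from disk:

```python
def read_in_request_order(read_rows, rowselect):
    """Fetch rows in sorted order, return them in the requested order.

    read_rows is any callable taking a sorted list of row numbers
    (hypothetical; a stand-in for the real reading routine).
    """
    rowselect = list(rowselect)
    # order[i] is the position in the request of the i-th smallest row.
    order = sorted(range(len(rowselect)), key=lambda i: rowselect[i])
    fetched = read_rows([rowselect[i] for i in order])
    result = [None] * len(rowselect)
    for pos, value in zip(order, fetched):
        result[pos] = value  # put each row back where it was asked for
    return result


# In-memory stand-in for a table on disk (made-up data).
table = ["row%d" % i for i in range(10)]

def fake_read(sorted_rows):
    # The reader only ever sees row numbers in ascending order.
    assert list(sorted_rows) == sorted(sorted_rows)
    return [table[r] for r in sorted_rows]

out = read_in_request_order(fake_read, [7, 1, 4, 1])
```

Here out comes back in the order 7, 1, 4, 1 even though the reader was asked
for 1, 1, 4, 7.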

> However, perhaps it could be useful to add 'step' for remove and
> implement this as a sequence of remove(start,stop) calls that fakes the
> intended behaviour. It would not be very efficient, but...

The interface would be a lot cleaner, and if somebody really suffers from the 
bad performance, he might be more willing to work on it than now, where it is 
a mostly hypothetical issue... :-)
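
A rough sketch of the faked version, assuming only a native remove(start, stop)
that deletes a contiguous block. ListTable is a toy in-memory stand-in and
remove_stepped a made-up name; neither is PyTables code:

```python
class ListTable:
    """Toy stand-in for a table; only remove(start, stop) is 'native'."""

    def __init__(self, rows):
        self.rows = list(rows)

    def remove(self, start, stop):
        # Remove the contiguous block of rows start .. stop-1.
        del self.rows[start:stop]


def remove_stepped(table, nrows, start=None, stop=None, step=1):
    """Emulate remove(start, stop, step) using remove(start, stop) only.

    Returns the number of rows removed. Assumption: step must be positive.
    """
    if step < 1:
        raise ValueError("step must be positive")
    selected = range(*slice(start, stop, step).indices(nrows))
    if len(selected) == 0:
        return 0
    if step == 1:
        # Contiguous case: one native call is enough.
        table.remove(selected.start, selected.stop)
    else:
        # Go from the last row backwards so that the numbers of the
        # rows still to be removed are not shifted by earlier removals.
        for r in reversed(selected):
            table.remove(r, r + 1)
    return len(selected)
```

One native call per row is of course slow, but it gives the right result and
keeps the interface uniform.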

> Well, neither Ivan nor I are going to address any of these issues for a
> while (at least for a couple of weeks or so). So feel free to download
> a recent snapshot (preferably after tonight, as I've fixed a couple
> of things today in Table.py):

OK, I'll probably do so in the near future.

Ciao,
Nobbi

-- 
_________________________________________Norbert Nemec
         Bernhardstr. 2 ... D-93053 Regensburg
     Tel: 0941 - 2009638 ... Mobil: 0179 - 7475199
           eMail: <No...@Ne...>