From: Jon W. <js...@fn...> - 2012-06-06 16:18:33
Hi Anthony,

> Well I think the issue at hand is that you are trying to support two
> disparate cases with one expression: sparse and dense selection. We
> have tools for dealing with these cases individually
> and performing out-of-core calculations. And if you know a priori
> which case you are going to fall into, you can do the right thing. So
> without doing anything special, I think medium-fast is probably the
> best and easiest thing that you can expect right now. (Though I would
> be delighted to be proved wrong on this point.)

True enough. Sometimes I can't know anything a priori about the density
of the selection, of course. And I'm happy to worry about the internals,
but some of my colleagues, not so much ;)

> > but I think the ideal would be to have a .where type query
> > operator that returns Column objects or a Table object, with a
> > "view" imposed in either case.
>
> We are very open to pull requests if you come up with an
> implementation of this that you like more ;).

Very fair. We'll see if I can get to it. Is there any sort of
guide-to-the-source to help me get started when and if that happens? I
guess just the reference guide? I'll have a lot to learn before I can
contribute usefully, I'm sure.

I don't know enough about the implementation details yet to know: would
a selection make the out-of-core performance gains from chunking and
other things moot because you'd have to skip around too much?

Regards,
Jon