From: David M. C. <co...@ph...> - 2006-06-10 21:42:58
On Sat, Jun 10, 2006 at 01:18:05PM -0700, Tim Hochberg wrote:
>
> I finally got around to cleaning up and checking in fromiter. As Travis
> suggested, this version does not require that you specify count. From
> the docstring:
>
>     fromiter(...)
>         fromiter(iterable, dtype, count=-1) returns a new 1d array
>     initialized from iterable. If count is nonnegative, the new array
>     will have count elements, otherwise its size is determined by the
>     generator.
>
> If count is specified, it allocates the full array ahead of time. If it
> is not, it periodically reallocates space for the array, allocating 50%
> extra space each time and reallocating back to the final size at the end
> (to give realloc a chance to reclaim any extra space).
>
> Speedwise, "fromiter(iterable, dtype, count)" is about twice as fast as
> "array(list(iterable),dtype=dtype)". Omitting count slows things down by
> about 15%; still much faster than using "array(list(...))". It also is
> going to chew up more memory than if you include count, at least
> temporarily, but still should typically use much less than the
> "array(list(...))" approach.

Can this be integrated into array() so that array(iterable, dtype=dtype)
does the expected thing?

Can you try to find the length of the iterable, with PySequence_Size() on
the original object? This gets a bit iffy, as that might not be correct
(but it could be used as a hint).

What about iterables that return, say, tuples? Maybe add a shape argument,
so that fromiter(iterable, dtype, count, shape=(None, 3)) expects elements
from iterable that can be turned into arrays of shape (3,)? That could
replace count, too.

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...
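
For illustration, the two suggestions above (a length hint and an iterable of tuples) can be approximated with today's NumPy and the standard library. This is only a sketch under stated assumptions: fromiter_rows and its ncols parameter are hypothetical names, not part of NumPy, and fromiter itself has no shape argument. operator.length_hint stands in here for the PySequence_Size() idea, and the sketch trusts the hint as an exact count, which is precisely the "iffy" part: if the hint is too large, np.fromiter raises ValueError.

```python
import itertools
import operator

import numpy as np


def fromiter_rows(iterable, dtype, ncols, count=-1):
    """Hypothetical helper: build a (rows, ncols) array from an iterable of tuples.

    count is the number of rows, or -1 to let np.fromiter grow the buffer.
    """
    # Flatten the rows lazily so np.fromiter sees one stream of scalars.
    flat = itertools.chain.from_iterable(iterable)
    # np.fromiter counts individual elements, not rows.
    total = count * ncols if count >= 0 else -1
    return np.fromiter(flat, dtype=dtype, count=total).reshape(-1, ncols)


# Plain 1-d case: count lets fromiter allocate the whole array up front.
squares = (i * i for i in range(5))
a = np.fromiter(squares, dtype=float, count=5)

# Tuple case: take a length hint from the original object when one exists.
source = [(i, i + 1, i + 2) for i in range(4)]
hint = operator.length_hint(source)  # len(source) for a list
b = fromiter_rows(iter(source), dtype=int, ncols=3, count=hint if hint > 0 else -1)

# Without a count the result is the same; only the allocation strategy differs.
c = fromiter_rows(iter(source), dtype=int, ncols=3)
```

Note that flattening does not check row lengths, so a row with the wrong number of items would silently shift every later row; a real shape argument would presumably validate each element.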