|
From: <co...@ph...> - 2006-01-18 22:41:10
|
Fernando Perez <Fer...@co...> writes: > Travis Oliphant wrote: >> Andrew Straw wrote: >> >>> Here's an idea Fernando and I have briefly talked about off-list, >>> but which perhaps bears talking about here: Is there speed to be >>> gained by an alternative, very simple, very optimized ndarray >>> constructor? The idea would be a special-case constructor with very >>> limited functionality designed purely for speed. It wouldn't >>> support (m)any of the fantastic things Travis has done, but would >>> be useful only in specialized use cases, such as creating indices. >> The general purpose constructor is >> PyArray_NewFromDescr(...) >> I suspect this could be special cased for certain circumstances and >> the special-case called occasionally. Their are checks on the >> dimensions that could be avoided in certain circumstances (like when >> we are getting the dimensions from another arrayobject already...) >> We could also inline the __array_from_strides code... >> Besides that, I'm not sure what else to optimize... > > Just to give some context: this came to my mind inspired by Blitz++'s > TinyVector and TinyMatrix objects. In Blitz, arrays have compile-time > rank, but run-time size in all dimensions. Since this introduces some > overhead, Blitz offers also the Tiny* classes, which are compile-time > fixed _both_ in rank and in size. This allows a number of > optimizations to be made on them, at the cost of some flexibility > lost. Some info on these guys: > > http://www.oonumerics.org/blitz/manual/blitz07.html > > What Andrew and I discussed was the idea of writing some object which > would only support the most basic operations: element-wise arithmetic, > slicing, linear algebra calls on them (matrix-matrix, matrix-vector > and vector operations), and little else. I'd be OK losing fancy > indexing, byteswapping, memory-mapping, reshaping, and anything else > which costs either: > > 1. initialization-time CPU cycles > 2. memory footprint > 3. runtime element access and arithmetic. > > Such objects could be very useful in many contexts. I'd even like an > immutable version, so they could be used as dictionary keys without > having to make a tuple out of them. This would allow algorithms which > use small arrays as multidimensional indices in sparse tree structures > to be used without the hoops one must jump through today, and with > higher performance. I've done a little bit of work along these lines. I have a module I call vector3 [*] which has 2- and 3-dimensional immutable vectors, using either ints or doubles. It's as fast as I could make it, while keeping it all written in Pyrex. I find it very convenient for anything vector-related. Konrad Hinsen has something similiar in the development version of his ScientificPython package. [*] http://arbutus.mcmaster.ca/dmc/software/vector3.html Also, I've also done some playing around with a n-dimensional vector type (restricted to doubles). My best attempts make it ~4-5x faster than numpy (and 2x faster than Numeric) for vectors of dimension 10 on simple ops like + and *, 2x faster than numpy for dimension 1000, and approaching 1x as you make the vectors larger. Indexing is about 3x faster than numpy, and 1.4x faster than Numeric. So that gives I think some idea of the maximum speed-up possible. I think the speedups mostly come from the utter lack of any polymorphism: it handles vectors of doubles only, and only as contiguous vectors (no strides). -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |co...@ph... |