From: Sasha <nd...@ma...> - 2006-02-03 23:02:52
On 2/3/06, Travis Oliphant <oli...@ee...> wrote:
> I'm very concerned about the speed of PyArray_NewFromDescr. So, I
> don't really want to make changes that will cause it to be slower for
> all cases unless absolutely essential.
>

It is easy to change the code so that it only affects the branch in
PyArray_NewFromDescr that currently raises an exception -- providing
both strides but no buffer. There is no need to call _array_buffer_size
if data is provided.

> Could you give more examples of how you will be using these zero-stride
> arrays? What problem are they actually solving?
>

Currently, when I need to represent a statistic that is constant across
a population, I use scalars. In many cases this works because, thanks to
the broadcasting rules, a scalar behaves almost like a vector with equal
elements. With the changes introduced in numpy, generic code that works
on both scalars and vectors is becoming increasingly easy to write, but
there are some cases where a scalar cannot replace a vector with equal
elements. For example, if you want to combine data for two populations
and the data comes as two scalars, you need to somehow know the size of
each population in order to compute the size of the result. A
zero-stride array would solve this problem: it takes little memory but,
unlike a scalar, knows its size.

Another use that I was contemplating was to represent a per-row or
per-column mask in ma. It is often the case that in a rectangular matrix
data may be missing only for an entire row. It is tempting to use a
rank-1 mask with an element for each row to represent this case. That
will work fine, but you would not be able to use vectors to specify
either a per-row or a per-column mask, since a rank-1 mask alone does
not say which axis it applies to. With a zero-stride array, you can use
strides=(1,0) or strides=(0,1) and have the same memory use as with a
vector.
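
To make the two use cases above concrete, here is a minimal sketch in Python. It assumes a present-day NumPy and uses np.broadcast_to to obtain read-only zero-stride views; that function is not the PyArray_NewFromDescr change discussed in this thread, only a convenient way to illustrate the idea. The helper name constant_vector and the sample values are hypothetical.

```python
import numpy as np

def constant_vector(value, n):
    """Hypothetical helper: a read-only length-n view of one stored value."""
    return np.broadcast_to(np.asarray(value), (n,))

# Use case 1: a statistic that is constant across a population.
a = constant_vector(2.5, 1000)   # population of 1000, every element 2.5
b = constant_vector(4.0, 3000)   # population of 3000, every element 4.0

# Unlike a scalar, each view knows its size, so the two populations
# can be combined without passing the sizes around separately.
combined_size = a.size + b.size
combined_mean = (a.sum() + b.sum()) / combined_size

# Use case 2: per-row and per-column masks for a 3x4 matrix.
row_missing = np.array([False, True, False])         # one flag per row
col_missing = np.array([True, False, False, False])  # one flag per column

row_mask = np.broadcast_to(row_missing[:, None], (3, 4))
col_mask = np.broadcast_to(col_missing[None, :], (3, 4))

# Both masks have the full 2-D shape, but each is backed only by its
# vector; the broadcast axis has stride 0 (strides are in bytes, and a
# bool element is one byte).
print(a.strides, row_mask.strides, col_mask.strides)
```

Because both masks carry the full (3, 4) shape, code that consumes them does not need to be told which axis a rank-1 mask refers to. Note that views created this way are read-only; a writable zero-stride array would alias every element to the same memory location.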