From: Charles R H. <cha...@gm...> - 2006-08-26 17:49:35
On 8/26/06, Torgil Svensson <tor...@gm...> wrote:
>
> Hi
>
> ndarray.std(axis=1) seems to have memory issues on large 2D-arrays. I
> first thought I had a performance issue but discovered that std() used
> lots of memory and therefore caused lots of swapping.
>
> I want to get an array where element i is the standard deviation of
> row i in the 2D array. Using valgrind on the std() function...
>
> $ valgrind --tool=massif python -c "from numpy import *; a=reshape(arange(100000*100),(100000,100)).std(axis=1)"
>
> ... showed me a peak of 200Mb memory, while iterating line by line...
>
> $ valgrind --tool=massif python -c "from numpy import *; a=array([x.std() for x in reshape(arange(100000*100),(100000,100))])"
>
> ... got a peak of 40Mb memory.
>
> This seems unnecessary, since we know the output shape before the
> calculation and should therefore be able to preallocate memory.
>
> My original problem was to get a moving average and a moving standard
> deviation (120k rows and N=1000). For the average I guess convolve
> should perform well, but is there anything smart for std()? For now
> I use ...

Why not use convolve for the std also? You can't subtract the average
first, but you could convolve the square of the vector and then use some
variant of

    std = sqrt((convsqrs - n*avg**2)/(n - 1))

There are possible precision problems, but they may not matter for your
application, especially if the moving window isn't really big.

Chuck
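
[Editor's note: a minimal sketch of the row-by-row workaround quoted
above, with the output preallocated as the poster suggests, assuming a
NumPy version where std(axis=1) shows this memory behaviour. Peak
allocation stays near one row rather than the whole intermediate:]

    import numpy as np

    a = np.arange(100000 * 100, dtype=float).reshape(100000, 100)
    out = np.empty(a.shape[0])   # preallocate: output shape is known up front
    for i in range(a.shape[0]):
        out[i] = a[i].std()      # one row at a time keeps the working set small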
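[Editor's note: a hedged sketch of Chuck's convolve-the-squares
suggestion for the moving window. The function name moving_avg_std and
the choice of np.convolve with mode='valid' are illustrative, not from
the thread; it applies std = sqrt((convsqrs - n*avg**2)/(n - 1)) to
window sums obtained by convolution:]

    import numpy as np

    def moving_avg_std(x, n):
        # Window sums of x and x**2, one convolution each; 'valid'
        # yields len(x) - n + 1 fully covered windows.
        x = np.asarray(x, dtype=float)
        win = np.ones(n)
        s = np.convolve(x, win, mode='valid')
        s2 = np.convolve(x * x, win, mode='valid')
        avg = s / n
        # Chuck's identity; round-off can push the numerator slightly
        # negative (the precision caveat he mentions), hence the floor.
        var = np.maximum(s2 - n * avg**2, 0.0) / (n - 1)
        return avg, np.sqrt(var)

    # e.g. a length-1000 window over 120k samples, as in the original post
    avg, std = moving_avg_std(np.random.rand(120000), 1000)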