From: Tim H. <tim...@co...> - 2004-02-11 23:20:11
An update: a little more tuning resulted in determinant and inverse
being about 80 times faster than the original numarray code and about
5 times faster than NumPy on the same test cases I was using before
(arrays of 1000 4x4 matrices).

If anyone is interested, let me know and I'll send you the code.

-tim

Tim Hochberg wrote:
>
> I discovered that some (all?) of the functions in
> numarray.linear_algebra are very slow when operating on small
> matrices. In particular, determinant and inverse are both more than
> 15 times slower than their NumPy counterparts when operating on 4x4
> matrices. I assume that this is simply a result of numarray's higher
> overhead.
>
> Normally the overhead of numarray is not much of a problem, since
> when I'm operating on lots of small data chunks I can usually
> aggregate them into larger chunks and operate on the big chunks.
> This is, of course, the standard way to get decent performance in
> either numarray or NumPy. However, because the functions in
> linear_algebra take only rank-2 (or rank-1 in some cases) arrays,
> there is no way to aggregate the small operations, and thus things
> run quite slowly.
>
> In order to address this I rewrote some of the functions in
> linear_algebra to allow an additional, optional dimension on the
> input arrays. Rank-3 arrays are treated as a set of matrices indexed
> along the first axis of A. Thus determinant(A) is essentially
> equivalent to array(map(determinant, A)) when A is rank-3. See the
> attached file for more detail.
>
> By this trick and by some relentless tuning, I got the numarray
> functions to run at about the same speed as their NumPy counterparts
> when computing the determinants and inverses of 1000 4x4 matrices.
> That's a humongous speedup.
>
> Is this approach worth pursuing for linear_algebra in general? I'll
> be using these myself since I need the speed, although I may back
> out some of the more aggressive tuning so I don't get bitten if
> numarray's internals change. I'll gladly donate this code to
> numarray if it's wanted, and I'm willing to help convert the rest,
> although it probably wouldn't happen as fast as this stuff since I
> don't need it myself at present.
>
> -tim
>
> [Use this with caution at this point -- I just got finished with a
> tuning spree and there may well be some bugs]
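
For readers who want to see the rank-3 convention from the quoted message in action, here is a minimal sketch. It is not the attached numarray code, which is not reproduced in this post. It is written against present-day NumPy and assumes that numpy.linalg.det and numpy.linalg.inv accept a stack of matrices, as current releases do. The names determinant_loop, determinant_stack and inverse_stack are hypothetical and used only for illustration.

```python
import numpy as np


def determinant_loop(a):
    """Reference semantics from the post: one determinant call per matrix."""
    return np.array([np.linalg.det(m) for m in a])


def determinant_stack(a):
    """Determinant of a rank-2 matrix, or of each matrix in a rank-3 stack.

    A rank-3 input is treated as a set of matrices indexed along axis 0.
    """
    a = np.asarray(a, dtype=float)
    if a.ndim not in (2, 3):
        raise ValueError("expected a rank-2 or rank-3 array")
    # Assumption: modern numpy.linalg.det broadcasts over leading axes,
    # so a single call covers the whole stack with no Python-level loop.
    return np.linalg.det(a)


def inverse_stack(a):
    """Inverse of a rank-2 matrix, or of each matrix in a rank-3 stack."""
    a = np.asarray(a, dtype=float)
    if a.ndim not in (2, 3):
        raise ValueError("expected a rank-2 or rank-3 array")
    return np.linalg.inv(a)


# Same shape as the test case in the post: 1000 matrices of size 4x4.
rng = np.random.default_rng(0)
mats = rng.standard_normal((1000, 4, 4))

dets = determinant_stack(mats)   # one determinant per matrix
invs = inverse_stack(mats)       # one inverse per matrix
```

The loop version is kept only as a reference for checking results. The stacked version returns the same values while making a single library call, which is the aggregation the message describes. Actual speedups depend on the library version and hardware, so none are claimed here.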