From: Tim H. <tim...@co...> - 2004-02-11 00:27:58
Attachments:
linear_algebra_x.py
|
I discovered that some (all?) of the functions in numarray.linear_algebra are very slow when operating on small matrices. In particular, determinant and inverse are both more than 15 times slower than their NumPy counterparts when operating on 4x4 matrices. I assume that this is simply a result of numarray's higher per-call overhead.

Normally the overhead of numarray is not much of a problem, since when I'm operating on lots of small data chunks I can usually aggregate them into larger chunks and operate on the big chunks. This is, of course, the standard way to get decent performance in either numarray or NumPy. However, because the functions in linear_algebra take only rank-2 (or rank-1 in some cases) arrays, there is no way to aggregate the small operations, and thus things run quite slowly.

In order to address this, I rewrote some of the functions in linear_algebra to allow an additional, optional dimension on the input arrays. Rank-3 arrays are treated as a set of matrices indexed along the first axis of A. Thus determinant(A) is essentially equivalent to array(map(determinant, A)) when A is rank-3. See the attached file for more detail.

By this trick and by some relentless tuning, I got the numarray functions to run at about the same speed as their NumPy counterparts when computing the determinants and inverses of 1000 4x4 matrices. That's a humungous speedup.

Is this approach worth pursuing for linear_algebra in general? I'll be using these myself since I need the speed, although I may back out some of the more aggressive tuning so I don't get bit if numarray's internals change. I'll gladly donate this code to numarray if it's wanted, and I'm willing to help convert the rest, although it probably wouldn't happen as fast as this stuff since I don't need it myself at present.

-tim

[Use this with caution at this point -- I just got finished with a tuning spree and there may well be some bugs]
|
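[Editor's illustration, not the attached linear_algebra_x.py] The calling convention described above (a rank-2 input is one matrix, a rank-3 input is a stack of matrices indexed along the first axis) can be sketched in plain Python. Everything here is an assumption made for illustration: the helper names det2d and rank are hypothetical, nested lists stand in for numarray arrays, and the per-matrix loop shows only the semantics, not the tuned, vectorized implementation the message refers to.

```python
def det2d(m):
    """Determinant of one square matrix (a list of rows), computed by
    Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in m]  # work on a float copy
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col + 1, n):
                a[r][c] -= factor * a[col][c]
    return det


def rank(x):
    """Nesting depth of a list-of-lists, standing in for an array's rank."""
    depth = 0
    while isinstance(x, (list, tuple)) and x:
        depth += 1
        x = x[0]
    return depth


def determinant(a):
    """Accept one matrix (rank 2) or a stack of matrices (rank 3)."""
    r = rank(a)
    if r == 2:
        return det2d(a)
    if r == 3:
        # Same result as mapping the rank-2 function over the first axis.
        return [det2d(m) for m in a]
    raise ValueError("expected a rank-2 or rank-3 input")


identity = [[1, 0], [0, 1]]
swap = [[0, 1], [1, 0]]
stack = [identity, swap, [[2, 0], [0, 3]]]

single = determinant(swap)    # one matrix in, one number out
batched = determinant(stack)  # three matrices in, three numbers out
```

The speedup reported in the message would come from replacing the Python-level loop in the rank-3 branch with whole-array operations, so that the per-call overhead is paid once per batch rather than once per matrix.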
From: Tim H. <tim...@co...> - 2004-02-11 23:20:11
|
An update:

A little more tuning resulted in determinant and inverse being about 80x faster than the original numarray code and about 5 times faster than using NumPy for the same test cases I was using before (1000x4x4 matrices). If anyone is interested, let me know and I'll send you the code.

-tim
|
From: Todd M. <jm...@st...> - 2004-02-12 11:33:33
|
On Wed, 2004-02-11 at 18:19, Tim Hochberg wrote:
> An update:
>
> A little more tuning resulted in determinant and inverse being about 80x
> faster than the original numarray code and about 5 times faster than
> using NumPy for the same test cases I was using before (1000x4x4
> matrices). If anyone is interested, let me know and I'll send you the code.
>
> -tim

I think we should use the work you've done here in numarray... so we're interested. Unless you object, I'll gleefully include your code as drop-in replacements for the existing routines.

Regards,
Todd

--
Todd Miller <jm...@st...>
|