From: Bill B. <wb...@gm...> - 2006-07-07 02:21:01
On 7/7/06, Tim Hochberg <tim...@co...> wrote:
> I'd caution here though that the H is another thing that's going to
> encourage people to write code that's less accurate and slower than it
> needs to be. Consider the simple equation:
>
> [Side note: Here's something I periodically think about -- it would be
> nice if dot could take multiple arguments and chain them into a series
> of matrix multiplies so that dot(dot(a, b), a.H) could be written
> dot(a, b, a.H). I'll use that here for clarity].
>
> x = dot(a, b, a.H)
>
> versus:
>
> x = dot(a.real, b, a.real.transpose()) + dot(a.imag, b, a.imag.transpose())
>
> The latter should be about twice as fast and won't have any pesky
> imaginary residuals showing up from the finite precision of the various
> operations.

The funny thing is that having a dot(a,b,c,...) would lead to the exact
same kind of hidden performance problems you're arguing against.  The
cost of a series of matrix multiplications can vary dramatically with the
order in which you perform them.  E.g. A*B*C*v where A, B, C are 2-dim
and v is a column vector.  Then you should do (A*(B*(C*v))), and
definitely not ((A*B)*C)*v.

But that said, dot could maybe be made smart enough to figure out the
best order.  That would be a nice feature.  dot(A, dot(B, dot(C, v))) is
pretty darn ugly.  I seem to remember finding the best order being an
example of a dynamic programming problem, and probably O(N^2) in the
number of terms.  But presumably N is never going to be very big, and the
N==2 (no choice) and N==3 (only one choice) cases could be optimized.

--bb
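As a rough illustration of the ordering cost Bill describes (the sizes and
timing scaffolding here are made up, not taken from the thread): for square
n x n factors, the right-to-left grouping costs about 3n^2 multiply-adds,
against roughly 2n^3 for the left-to-right one, while both give the same
result.

    import time
    import numpy as np

    n = 400
    A = np.random.rand(n, n)
    B = np.random.rand(n, n)
    C = np.random.rand(n, n)
    v = np.random.rand(n, 1)

    t0 = time.time()
    left = np.dot(np.dot(np.dot(A, B), C), v)    # ((A*B)*C)*v: two full n x n matrix products
    t1 = time.time()
    right = np.dot(A, np.dot(B, np.dot(C, v)))   # A*(B*(C*v)): three matrix-vector products
    t2 = time.time()

    print("left-to-right: %.4fs   right-to-left: %.4fs" % (t1 - t0, t2 - t1))
    print("same result:", np.allclose(left, right))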
From: Tim H. <tim...@co...> - 2006-07-07 06:26:27
Bill Baxter wrote:
> On 7/7/06, Tim Hochberg <tim...@co...> wrote:
>> I'd caution here though that the H is another thing that's going to
>> encourage people to write code that's less accurate and slower than it
>> needs to be. Consider the simple equation:
>>
>> [Side note: Here's something I periodically think about -- it would be
>> nice if dot could take multiple arguments and chain them into a series
>> of matrix multiplies so that dot(dot(a, b), a.H) could be written
>> dot(a, b, a.H). I'll use that here for clarity].
>>
>> x = dot(a, b, a.H)
>>
>> versus:
>>
>> x = dot(a.real, b, a.real.transpose()) + dot(a.imag, b, a.imag.transpose())
>>
>> The latter should be about twice as fast and won't have any pesky
>> imaginary residuals showing up from the finite precision of the
>> various operations.
>
> The funny thing is that having a dot(a,b,c,...) would lead to the
> exact same kind of hidden performance problems you're arguing against.

Not exactly arguing -- this isn't why I don't like H and friends -- just
noting that this is one of the traps that people are likely to fall into
when transferring equations to code.

> The cost of a series of matrix multiplications can vary dramatically
> with the order in which you perform them.  E.g. A*B*C*v where A, B, C
> are 2-dim and v is a column vector.  Then you should do (A*(B*(C*v))),
> and definitely not ((A*B)*C)*v.

That's a good point.

> But that said, dot could maybe be made smart enough to figure out the
> best order.  That would be a nice feature.  dot(A, dot(B, dot(C, v)))
> is pretty darn ugly.  I seem to remember finding the best order being
> an example of a dynamic programming problem, and probably O(N^2) in
> the number of terms.  But presumably N is never going to be very big,
> and the N==2 (no choice) and N==3 (only one choice) cases could be
> optimized.

You could probably just brute force it without too much trouble.  As you
say, O(n^2) is not that daunting when n is usually 2 or 3 and rarely over
5.  Perhaps cache the last few configurations tested, since the same
matrix size combinations will likely occur over and over.

Another interesting thought: if one just wrapped an array with a wrapper
that says "use the conjugate of what I'm wrapping", then dot(a, b, H(a))
could do the right thing and only compute the real part of the equation,
assuming H(a) created the wrapper.  That's starting to get complex though.

-tim
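A quick numerical check of the identity Tim's rewrite relies on (this
sketch is illustrative only and assumes b is real, as in his example):
the split form computes exactly the real part of dot(dot(a, b), a.H),
and since it uses only real arithmetic, no imaginary part can appear.

    import numpy as np

    n = 5
    a = np.random.rand(n, n) + 1j * np.random.rand(n, n)
    b = np.random.rand(n, n)    # b assumed real, as in the example above

    full = np.dot(np.dot(a, b), a.conj().T)                 # dot(a, b, a.H)
    split = (np.dot(np.dot(a.real, b), a.real.T) +
             np.dot(np.dot(a.imag, b), a.imag.T))

    print(np.allclose(full.real, split))    # True: the split form equals the real part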
From: Bill B. <wb...@gm...> - 2006-07-07 06:44:17
On 7/7/06, Tim Hochberg <tim...@co...> wrote:
>> The funny thing is that having a dot(a,b,c,...) would lead to the
>> exact same kind of hidden performance problems you're arguing against.
>
> Not exactly arguing -- this isn't why I don't like H and friends --
> just noting that this is one of the traps that people are likely to
> fall into when transferring equations to code.

There's a strong argument to be made that the whole design of most array
math packages is flawed and leads to inefficient code.  The classic
example is something like:

    A = B + C - 2*D

where all the matrices are 2 million x 2 million.  I think for numpy that
would basically do:

    tmp = B + C
    tmp2 = 2*D
    tmp3 = tmp - tmp2
    A = tmp3

Allocating three huge tmp variables and probably doing an extra copy or
two in there, when the best thing to do would be more like:

    A = D
    A *= -2
    A += C
    A += B

Or something like that.  The point is that you give up any notion of
having optimal code the minute you start using something like numpy.  And
you do so happily for the ability to get stuff done faster and have nicer
looking, more readable code in the end.  When everything works, that's
when you hit the "go fast button" if you even really need to.

--bb
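One way to write the temporary-free version without touching the inputs
is numpy's three-argument ufunc form, where the third argument is the
output array.  This is only an illustrative sketch of the style Bill is
describing; the sizes are made up.

    import numpy as np

    n = 2000
    B = np.random.rand(n, n)
    C = np.random.rand(n, n)
    D = np.random.rand(n, n)

    # Naive form: builds full-size temporaries for B+C, 2*D, and the difference.
    A_naive = B + C - 2*D

    # In-place form: one output array, inputs left untouched.
    A = np.empty_like(D)
    np.multiply(D, -2.0, A)    # A = -2*D, written directly into A
    np.add(A, C, A)            # A += C, in place
    np.add(A, B, A)            # A += B, in place

    print(np.allclose(A, A_naive))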
From: Tim H. <tim...@co...> - 2006-07-07 06:53:50
Bill Baxter wrote:
> On 7/7/06, Tim Hochberg <tim...@co...> wrote:
>>> The funny thing is that having a dot(a,b,c,...) would lead to the
>>> exact same kind of hidden performance problems you're arguing against.
>>
>> Not exactly arguing -- this isn't why I don't like H and friends --
>> just noting that this is one of the traps that people are likely to
>> fall into when transferring equations to code.
>
> There's a strong argument to be made that the whole design of most
> array math packages is flawed and leads to inefficient code.  The
> classic example is something like:
>
>     A = B + C - 2*D
>
> where all the matrices are 2 million x 2 million.  I think for numpy
> that would basically do:
>
>     tmp = B + C
>     tmp2 = 2*D
>     tmp3 = tmp - tmp2
>     A = tmp3
>
> Allocating three huge tmp variables and probably doing an extra copy
> or two in there, when the best thing to do would be more like:
>
>     A = D
>     A *= -2
>     A += C
>     A += B
>
> Or something like that.

Yeah -- that's not quite right, since you've clobbered D.  But that is
essentially the dark road that people go down in performance critical
sections on occasion.

> The point is that you give up any notion of having optimal code the
> minute you start using something like numpy.  And you do so happily
> for the ability to get stuff done faster and have nicer looking, more
> readable code in the end.  When everything works, that's when you hit
> the "go fast button" if you even really need to.

That's what numexpr is for.

    A = numexpr.evaluate("B + C - 2*D")

Evaluates this considerably faster than numpy and doesn't chew up all
that extra memory.  Actually, that's why it's faster -- it's much more
memory friendly.  It's also a lot less flexible, although that's been
improving.  Still, you want numpy as a base, but numexpr is a viable
alternative for those critical sections where once you would resort to
the in-place (*=, +=) operators or, worse, three-argument ufuncs.

-tim
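A minimal usage sketch of the numexpr call Tim mentions (this assumes the
numexpr package is installed; the sizes are made up for illustration).
numexpr picks up the arrays named in the expression string from the
calling scope and evaluates the whole expression block by block, so it
avoids the full-size temporaries the plain numpy form creates.

    import numpy as np
    import numexpr

    n = 2000
    B = np.random.rand(n, n)
    C = np.random.rand(n, n)
    D = np.random.rand(n, n)

    A_numpy = B + C - 2*D                     # builds full-size temporaries
    A_ne = numexpr.evaluate("B + C - 2*D")    # evaluated block-by-block

    print(np.allclose(A_numpy, A_ne))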
From: Charles R H. <cha...@gm...> - 2006-07-07 15:48:17
On 7/7/06, Bill Baxter <wb...@gm...> wrote:
> On 7/7/06, Tim Hochberg <tim...@co...> wrote:
>>> The funny thing is that having a dot(a,b,c,...) would lead to the
>>> exact same kind of hidden performance problems you're arguing against.
>>
>> Not exactly arguing -- this isn't why I don't like H and friends --
>> just noting that this is one of the traps that people are likely to
>> fall into when transferring equations to code.
>
> <snip>
>
>     A = D
>     A *= -2
>     A += C
>     A += B

I would like to write something like:

    A = D.copy().times(-2).plus(C).plus(B)

i.e. copy produces a "register", the rest is reverse Polish, and =
"stores" the result.

Chuck
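A rough sketch of the chained, in-place style Chuck is asking for.  This
is purely hypothetical: times, plus, and store are not numpy methods,
just illustrative names on a small wrapper that copies once and then
works in place.

    import numpy as np

    class Register(object):
        """Wraps a private copy of an array and applies each operation in place."""
        def __init__(self, arr):
            self.arr = np.array(arr, copy=True)
        def times(self, other):
            self.arr *= other
            return self
        def plus(self, other):
            self.arr += other
            return self
        def store(self):
            return self.arr

    n = 1000
    B = np.random.rand(n, n)
    C = np.random.rand(n, n)
    D = np.random.rand(n, n)

    # The spirit of A = D.copy().times(-2).plus(C).plus(B):
    A = Register(D).times(-2).plus(C).plus(B).store()
    print(np.allclose(A, B + C - 2*D))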