From: Bill B. <wb...@gm...> - 2006-07-05 04:03:36
|
Just wanted to make one last effort to get a .T attribute for arrays, so that you can flip axes with a simple "a.T" instead of "a.transpose()", as with numpy matrix objects. If I recall, the main objection raised before was that there are lots of ways to transpose n-dimensional data. Fine, but the fact is that 2-D arrays are pretty darn common, and so they are a special case worth optimizing for. Furthermore, transpose() won't go away if you do need some specific kind of axes swapping other than the default, so no one is really going to be harmed by adding it.

I propose to make .T a synonym for .swapaxes(-2,-1) {*}, i.e. the last two axes are interchanged. This should also make it useful in many N-d array cases (whereas the default of .transpose() -- to completely reverse the order of all the axes -- is seldom what you want). Part of the thinking is that when you print an N-d array it's the last two dimensions that get printed like 2-d matrices separated by blank lines. You can think of it as some number of stacks of 2-d matrices. So this .T would just transpose those 2-d matrices in the printout. Those are also the parts that are generally most contiguous in memory, so it makes sense for the 2-d matrix bits to be stored in those last two dimensions.

Then, if there is a .T, it makes sense to also have .H, which would basically be equivalent to .T.conjugate(). Finally, the matrix class has .A to get the underlying array -- it would also be nice to have a .M on array as a shortcut for asmatrix(). This one would be very handy for matrix users, I think, but I could go either way on that, having abandoned matrix myself. Ex: ones([4,4]).M

Other possibilities:
- Make .T a function, so that you can pass it the same info as .transpose(). Then the shortcut becomes a.T(), which isn't as nice, and isn't consistent with matrix's .T any more.
- Just make .T raise an error for ndim > 2. But I don't really see any benefit in making it an error as opposed to defining a reasonable default behavior.
- Make .T on a 1-dim array return a 2-dim Nx1 array. (My default suggestion is to just leave it alone if ndim < 2; an exception would be another possibility.) It would make an easy way to create column vectors from arrays, but I can think of nothing else in Numpy that acts that way.

This is not a 1.0 must-have, as it introduces no backward compatibility issues. But it would be trivial to add if the will is there.

{*} except that negative axes for swapaxes don't seem to work currently, so instead it would need to be something like

    a.transpose( range(a.ndim-2) + [a.ndim-1, a.ndim-2] )

with a check for "if ndim > 1", of course.

--Bill |
From: Bill B. <wb...@gm...> - 2006-07-05 05:00:24
|
Slight correction.

> {*} except that negative axes for swapaxes don't seem to work currently, so
> instead it would need to be something like
>     a.transpose( range(a.ndim-2) + [a.ndim-1, a.ndim-2] )
> with a check for "if ndim > 1", of course.

Apparently a.swapaxes(-2,-1) does work, and it does exactly what I am suggesting, including leaving zero-d and 1-d arrays alone. Not sure why I thought it wasn't working.

So in short my proposal is to:
-- make a.T a property of array that returns a.swapaxes(-2,-1),
-- make a.H a property of array that returns a.conjugate().swapaxes(-2,-1)
and maybe
-- make a.M a property of array that returns numpy.asmatrix(a)

--Bill |
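For concreteness, here is roughly what the proposal amounts to, written as Python-level properties on an ndarray subclass. This is only a sketch: the class name XArray and the explicit rank-<2 guard are illustrative assumptions, and the actual checkin adds the attributes to ndarray itself.

    import numpy

    class XArray(numpy.ndarray):
        @property
        def T(self):
            if self.ndim < 2:
                return self                   # leave 0-d and 1-d arrays alone
            return self.swapaxes(-2, -1)      # view with the last two axes swapped

        @property
        def H(self):
            # Conjugate transpose (Hermitian adjoint); makes a copy.
            if self.ndim < 2:
                return self.conjugate()
            return self.conjugate().swapaxes(-2, -1)

        @property
        def M(self):
            return numpy.asmatrix(self)

    a = numpy.arange(6.0).reshape(2, 3).view(XArray)
    assert a.T.shape == (3, 2)
    assert a.M.shape == (2, 3)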
From: Travis O. <oli...@ie...> - 2006-07-06 11:17:54
|
Bill Baxter wrote:
> So in short my proposal is to:
> -- make a.T a property of array that returns a.swapaxes(-2,-1),
> -- make a.H a property of array that returns a.conjugate().swapaxes(-2,-1)
> and maybe
> -- make a.M a property of array that returns numpy.asmatrix(a)

I've tentatively implemented all of these suggestions, as well as adding the .A attribute to the ndarray (so that all sub-classes and array scalars can get back a view as an ndarray). I did this to make it easier to do matrix-like calculations with or without matrices. Matrix-calculation flexibility is still a sore spot for many, and I think these syntactic-sugar attributes will help long term.

If there are no strong objections, then the recent MATH attribute checkin will stay. If there are major objections, then we can back them out without too much trouble as well.

-Travis |
From: Sasha <nd...@ma...> - 2006-07-06 16:21:52
|
I would like to raise a few objections going from mild to strong:

1. .T : I am mildly against it. As an inexpensive operation that returns a view (so that "a.T[...] = ..." makes sense) it is a reasonable candidate for an attribute. Unfortunately, reversing the order of axes is at least as reasonable as swapaxes(-2,-1), and swapaxes(-2,-1) is invalid for rank < 2. My main objection is that a.T is fairly cryptic -- is there any other language that uses an attribute for transpose? Adding .T to arrays will lead to less readable code because in expressions like "a * b.T" it will not be clear whether * is a matrix or elementwise multiplication.

2. .H : This is an O(n^2) complexity operation returning a copy, so it is not appropriate for an attribute. It does not make much sense for any type other than complex, so its use is limited.

3. .M : I am strongly against this. It will create a circular dependency between ndarray and matrix. I would expect that asmatrix is mostly used to convert function arguments, and for this purpose a @matrix_args decorator would be a better solution to reduce code clutter.

4. .A : I have no clue what this one does, so I won't comment. |
From: Bill B. <wb...@gm...> - 2006-07-06 17:21:47
|
On 7/7/06, Sasha <nd...@ma...> wrote:
> I would like to raise a few objections going from mild to strong:
>
> 1. .T : I am mildly against it. As an inexpensive operation that
> returns a view (so that a.T[...] = makes sense) it is a reasonable
> candidate for an attribute. Unfortunately, reversing the order of axes
> is at least as reasonable as swapaxes(-2,-1)

I suppose reversing the order changes you from C ordering to Fortran ordering? Other than that, I can't think of any good examples of why you'd want to completely reverse the order of all your axes. I think it's much more common to want to swap just two axes, and the last two seem a logical choice since a) in the default C-ordering they're the closest together in memory and b) they're the axes that are printed contiguously when you say "print A".

> and swapaxes(-2,-1) is invalid for rank < 2.

At least in numpy 0.9.8, it's not invalid, it just doesn't do anything.

> My main objection is that a.T is fairly cryptic
> - is there any other language that uses an attribute for transpose?

Does it matter what other languages do? It's not _that_ cryptic. The standard way to write transpose is with a little T superscript in the upper right. We can't do that with ASCII, so the T just appears after the dot. Makes perfect sense to me. I'd vote for an operator if it were possible in Python. Something like A^T would be neat, maybe, or Matlab's single-quote operator.

> Adding .T to arrays will lead to less readable code because in
> expressions like "a * b.T" it will not be clear whether * is a matrix
> or elementwise multiplication.

That seems a pretty weak argument, since there are already lots of expressions you can write that don't make it clear whether some operation is a matrix operation or an array operation. You could write a * b.transpose(1,0) right now and still not know whether it was matrix or element-wise multiplication. Or doing A[:,1] when you know A is 2-D -- does it give you a 1-D thing back or a 2-D thing back? That just comes down to it being difficult to determine the class of objects in Python by looking at code.

> 2. .H : This is an O(n^2) complexity operation returning a copy so
> it is not appropriate for an attribute.

Not sure how you get O(n^2). It just requires flipping the sign on the imaginary part of each element in the array. So in my book that's O(n). But that does make it more expensive than O(1) transpose, yes.

> It does not make much sense for any type other than complex, so its
> use is limited.

I personally don't think I have ever used a hermitian transpose in my life, so I can't really say how useful it is. But the makers of Matlab decided to make single-quote (e.g. A') be the hermitian transpose operator, and dot-single-quote (e.g. A.') be the regular transpose. So I'm assuming it's common enough that the folks behind Matlab thought it wise to make it the 'default' style of transpose and give it a one-character operator. That's about the only evidence I have that it's a useful operation, though. In general, though, I do know that when you take good old algorithms for reals and extend them to complex numbers, things that were transposes for the reals become hermitian transposes for the complex version.

> 3. .M : I am strongly against this. It will create a circular
> dependency between ndarray and matrix. I would expect that asmatrix
> is mostly used to convert function arguments and for this purpose
> a @matrix_args decorator would be a better solution to reduce code
> clutter.

I'm kind of ambivalent about this one too. Assuming matrix is going to stay around and we actually want to encourage people to use it, I think having a .M on arrays is an improvement over the current situation. Arguments to functions expecting matrices are, as you say, one place where conversions are needed, but another place is on functions like zeros and ones and empty. With the .M you can just say ones(2,2).M. But probably a better solution would be to have matrix versions of these in the library as an optional module to import, so people could, say, import them as M and use M.ones(2,2).

It does seem to me that in some sense matrix is supposed to be 'just another customized array subclass', like sparse or masked, so having array aware of this one particular subclass makes me a little uneasy. But if matrix really should be considered to be on par with array, then it makes sense. It's just like a mutually recursive data structure. Or you can think of matrix's inheritance from array as being an implementation detail.

> 4. .A : I have no clue what this one does, so I won't comment.

It returns the array. I think the idea was that you would always be able to say .A with array or anything derived from it. Currently you have to know you have a matrix before you can use the .A attribute. If you were wrong and it was actually an array, then you'll get an exception. It would be nicer to have X.A just return X if X is already an array.

In short, I'm most enthusiastic about the .T attribute. Then, given a .T, it makes sense to have a .H, both to be consistent with matrix, but also since it seems to be a big deal in other math packages like Matlab. Then given the current situation, I like the .M, but I can imagine other ways to make .M less necessary.

--bb |
From: Sasha <nd...@ma...> - 2006-07-06 18:25:42
|
On 7/6/06, Bill Baxter <wb...@gm...> wrote:
> ... I think it's much more common to want to swap just two axes, and the
> last two seem a logical choice since a) in the default C-ordering they're
> the closest together in memory and b) they're the axes that are printed
> contiguously when you say "print A".

It all depends on how you want to interpret a rank-K tensor. You seem to advocate a view that it is a (K-2)-rank array of matrices and .T is an element-wise transpose operation. Alternatively I can expect that it is a matrix of (K-2)-rank arrays and then .T should be swapaxes(0,1). Do you have real-life applications of swapaxes(-2,-1) for rank > 2?

> > and swapaxes(-2,-1) is invalid for rank < 2.
>
> At least in numpy 0.9.8, it's not invalid, it just doesn't do anything.

That's bad. What sense does it make to swap non-existing axes? Many people would expect transpose of a vector to be a matrix. This is the case in S+ and R.

> > My main objection is that a.T is fairly cryptic - is there any other
> > language that uses an attribute for transpose?
>
> Does it matter what other languages do? It's not _that_ cryptic.

If something is clear and natural, chances are it was done before. For me, prior art is always a useful guide when making a design choice. For example, in R, the transpose operation is t(a) and works on rank <= 2 only, always returning rank-2. K (an APL-like language) overloads unary '+' to do swapaxes(0,1) for rank >= 2 and nothing for lower rank. Both the R and K solutions are implementable in Python, with R using 3 characters and K using 1(!) compared to your two-character ".T" notation. I would suggest that when inventing something new, you should consider prior art and explain how your invention is better. That's why what other languages do matters. (After all, isn't 'T' chosen because "transpose" starts with "t" in the English language?)

> The standard way to write transpose is with a little T superscript in the
> upper right. We can't do that with ASCII so the T just appears after the
> dot. Makes perfect sense to me. I'd vote for an operator if it were
> possible in python. Something like A^T would be neat, maybe, or matlab's
> single-quote operator.

Well, you could overload __rpow__ for a singleton T and spell it A**T ... (I hope no one will take that proposal seriously). Visually, A.T looks more like a subscript rather than a superscript.

> > Adding .T to arrays will lead to less readable code because in
> > expressions like "a * b.T" it will not be clear whether * is a matrix
> > or elementwise multiplication.
>
> That seems a pretty weak argument, since there are already lots of
> expressions you can write that don't make it clear whether some operation
> is a matrix operation or array operation.

This may be a weak argument for someone used to matrix notation, but for me seeing a.T means: beware - tricky stuff here.

> You could write a * b.transpose(1,0) right now and still not know whether
> it was matrix or element-wise multiplication.

Why would anyone do that if b was a matrix?

> > 2. .H : This is an O(n^2) complexity operation returning a copy so
> > it is not appropriate for an attribute.
>
> Not sure how you get O(n^2). It just requires flipping the sign on the
> imaginary part of each element in the array. So in my book that's O(n).
> But that does make it more expensive than O(1) transpose, yes.

In my book n is the size of the matrix, as in "n x n matrix", but the argument stays with O(n) as well.

> But probably a better solution would be to have matrix versions of these
> in the library as an optional module to import so people could, say,
> import them as M and use M.ones(2,2).

This is the solution used by ma, which is another argument for it.

> In short, I'm most enthusiastic about the .T attribute. Then, given a .T,
> it makes sense to have a .H, both to be consistent with matrix, but also
> since it seems to be a big deal in other math packages like matlab. Then
> given the current situation, I like the .M but I can imagine other ways to
> make .M less necessary.

I only raised a mild objection against .T, but the slippery-slope argument makes me dislike it much more. At the very least I would like to see a discussion of why a.T is better than t(a) or +a. |
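For comparison, here is a rough sketch of the two alternatives raised above -- an R-style t() function and a K-style unary '+' -- under the semantics as described in the thread. The names t and KArray are illustrative assumptions; neither is part of numpy, and the exact R edge-case behavior (a 1-d input treated as a column, so its transpose is a row) is my reading of the description:

    import numpy

    def t(a):
        # R-style: rank <= 2 only, always returns a rank-2 result.
        a = numpy.asarray(a)
        if a.ndim > 2:
            raise ValueError("t() is only defined for rank <= 2")
        if a.ndim < 2:
            return a.reshape(1, -1)   # vector -> row matrix
        return a.swapaxes(0, 1)

    class KArray(numpy.ndarray):
        def __pos__(self):
            # K-style unary '+': swapaxes(0, 1) for rank >= 2, no-op below.
            return self if self.ndim < 2 else self.swapaxes(0, 1)

    a = numpy.arange(6.0).reshape(2, 3)
    assert t(a).shape == (3, 2)
    assert (+a.view(KArray)).shape == (3, 2)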
From: Sasha <nd...@ma...> - 2006-07-06 20:30:49
|
On 7/6/06, Robert Kern <rob...@gm...> wrote:
> ...
> I don't think that just because arrays are often used for linear algebra
> that linear algebra assumptions should be built in to the core array type.

In addition, transpose is a (rank-2) array or matrix operation and not a linear algebra operation. Transpose corresponds to the "adjoint" linear algebra operation if you represent vectors as single-column matrices and co-vectors as single-row matrices. This is a convenient representation followed by much of the relevant literature, but it does not allow generalization beyond rank-2. Another useful feature is that the inner product can be calculated as the matrix product as long as you accept a 1x1 matrix for a scalar. This feature does not work beyond rank-2 either, because in order to do a tensor inner product you have to be explicit about the axes being collapsed (for example using Einstein notation).

Since ndarray does not distinguish between upper and lower indices, it is not possible to distinguish between vectors and co-vectors in any way other than using the matrix convention. This makes ndarrays a poor model for linear algebra tensors. |
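The point about having to be explicit about the collapsed axes can be illustrated with numpy.tensordot, which names the axis pairs by position (the shapes below are arbitrary examples):

    import numpy

    a = numpy.ones((2, 3, 4))
    b = numpy.ones((4, 3, 5))

    # Collapse a's last axis against b's first: result shape (2, 3, 3, 5).
    c = numpy.tensordot(a, b, axes=([2], [0]))
    assert c.shape == (2, 3, 3, 5)

    # Collapse two axis pairs at once: result shape (2, 5).
    d = numpy.tensordot(a, b, axes=([1, 2], [1, 0]))
    assert d.shape == (2, 5)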
From: Bill B. <wb...@gm...> - 2006-07-07 02:56:16
|
On 7/7/06, Robert Kern <rob...@gm...> wrote:
> Bill Baxter wrote:
> > The slippery slope argument only applies to the .M, not the .T or .H.
>
> No, it was the "Let's have a .T attribute. And if we're going to do that,
> then we should also do this. And this. And this."

There's no slippery slope there. It's just "Let's have a .T attribute, and if we have that then we should have .H also." Period. The slope stops there. The .M and .A are a separate issue.

> > > I don't think that just because arrays are often used for linear
> > > algebra that linear algebra assumptions should be built in to the
> > > core array type.
> >
> > It's not just that "arrays can be used for linear algebra". It's that
> > linear algebra is the single most popular kind of numerical computing
> > in the world! It's the foundation for countless fields. What
> > you're saying is like "grocery stores shouldn't devote so much shelf
> > space to food, because food is just one of the products people buy",
> > or [etc.]
>
> I'm sorry, but the argument-by-inappropriate-analogy is not convincing.
> Just because linear algebra is "the base" for a lot of numerical computing
> does not mean that everyone is using numpy arrays for linear algebra all
> the time. Much less does it mean that all of those conventions you've
> devised should be shoved into the core array type. I hold a higher
> standard for the design of the core array type than I do for the stuff
> around it. "It's convenient for what I do," just doesn't rise to that
> level. There has to be more of an argument for it.

My argument is not that "it's convenient for what I do", it's that "it's convenient for what 90% of users want to do". But unfortunately I can't think of a good way to back up that claim with any sort of numbers. But here's one I just found: download statistics for various numerical libraries on netlib.org (http://www.netlib.org/master_counts2.html). The top 4 are all linear algebra related:

    /lapack       37,373,505
    /lapack/lug   19,908,865
    /scalapack    14,418,172
    /linalg       11,091,511

The next three are more like general computing issues: parallelization lib, performance monitoring, benchmarks:

    /pvm3         10,360,012
    /performance   7,999,140
    /benchmark     7,775,600

Then the next one is more linear algebra. And that seems to hold pretty far down the list. It looks like mostly stuff that's either linear algebra related or parallelization/benchmarking related.

And as another example, there's the success of higher-level numerical environments like Matlab (and maybe R and S? and Mathematica, and Maple?) that have strong support for linear algebra right in the core, not requiring users to go into some syntax/library ghetto to use that functionality.

I am also curious, given the number of times I've heard this nebulous argument of "there are lots of kinds of numerical computing that don't involve linear algebra", that no one ever seems to name any of these "lots of kinds". Statistics, maybe? But you can find lots of linear algebra in statistics. --bb |
From: Sasha <nd...@ma...> - 2006-07-07 04:34:54
|
On 7/6/06, Bill Baxter <wb...@gm...> wrote:
> ...
> Yep, like Tim said. The usage is, say, N sets of basis vectors. Each set
> of basis vectors is a matrix.

This brings up a feature that I really miss from numpy: an ability to do

    array([f(x) for x in a])

without python overhead. APL-like languages have a notion of "adverb" - a higher-level operator that maps a function to a function. Numpy has some adverbs implemented as attributes to ufuncs: for example, add.reduce is the same as +/ in K and add.accumulate is the same as +\ ('/' and '\' are the 'over' and 'scan' adverbs in K). However, there is no way to do f/ or f\ where f is an arbitrary dyadic function.

The equivalent of array([f(x) for x in a]) is spelled f'(a) in K (' is the adverb 'each'). The transpose operator (+) swaps the first two axes, so in order to apply it to an array of matrices, one would have to do +:'a (the colon in +: disambiguates + as a unary operator).

I don't know of a good way to introduce adverbs in numpy, nor can I think of a good way to do list comprehensions, but array-friendly versions of map, filter and reduce may be a good addition. These higher-order functions may take an optional axes argument to deal with higher-rank arrays and may be optimized to recognize ufuncs, so that map(f, a) could call f(a) and reduce(f, a) could do f.reduce(a) when f is a ufunc.

[snip]

> Either way swapaxes(-2,-1) is more likely to be what you want than
> .transpose().

Agree, but swapaxes(0, 1) is a close runner-up, which is also known as zip in python.

> Well, I would be really happy for .T to return an (N,1) column vector if
> handed an (N,) 1-d array. But I'm pretty sure that would raise more furor
> among the readers of the list than leaving it 1-d.

Would you be even happier if .T would return a matrix? I hope not, because my .M objection will apply. Maybe we can compromise by implementing a.T so that it raises ValueError unless rank(a) == 2, or at least unless rank(a) <= 2?

> I have serious reservations about a function called t(). x, y, z, and t
> are probably all in the top 10 variable names in scientific computing.

What about T()?

> > K (an APL-like language) overloads unary '+' to do swapaxes(0,1) for
> > rank >= 2 and nothing for lower rank.
>
> Hmm. That's kind of interesting, it seems like an abuse of notation to me.
> And precedence might be an issue too. The precedence of unary + isn't as
> high as attribute access.

It is high enough AFAICT - higher than any binary operator.

> Anyway, as far as the meaning of + in K, I'm guessing K's arrays are in
> Fortran order, so (0,1) axes vary the fastest.

No, K has 1-d arrays only, but they can be nested. Matrices are arrays of arrays and tensors are arrays of arrays of arrays ..., but you are right that the (0,1) swap is faster than the (-2,-1) swap, and this motivated the choice of the primitive.

> I couldn't find any documentation for the K language from a quick search,
> though.

Kx Systems, the company behind K, has replaced K with Q and pulled the old manuals from the web. Q is close enough to K: see http://kx.com/q/d/k.txt for a terse summary.

[snip]

> > Why would anyone do that if b was a matrix?
>
> Maybe because, like you, they think "that a.T is fairly cryptic".

If they are like me, they will not use numpy.matrix to begin with :-).

> > > But probably a better solution would be to have matrix versions of
> > > these in the library as an optional module to import so people could,
> > > say, import them as M and use M.ones(2,2).
> >
> > This is the solution used by ma, which is another argument for it.
>
> Yeh, I'm starting to think that's better than slapping an M attribute on
> arrays, too. Is it hard to write a module like that?

Writing matrixutils with

    def zeros(shape, dtype=float):
        return asmatrix(numpy.zeros(shape, dtype))

is trivial, but matrixutils.zeros will have two python function calls overhead. This may be a case for making zeros a class method of ndarray that can be written in a way that will make an inherited matrix.zeros do the right thing with no overhead.

[snip]

> * +A implies addition.

No, it does not. Unary '+' is a no-op. Does * imply multiplication or ** imply pow in f(*args, **kwds) to you?

> The general rule with operator overloading is that the overload should
> have the same general meaning as the original operator.

Unary '+' has no preset meaning in plain python. It can be interpreted as transpose if you think of scalars as 1x1 matrices.

> So overloading * for matrix multiplication makes sense.

It depends on what you consider part of the "general meaning". If the commutativity property is part of it, then overloading * for matrix multiplication doesn't make sense. If the "general meaning" of unary + includes the x = +x invariant, then you are right, but I am willing to relax that to the x = ++x invariant when x is a non-symmetric matrix.

> ... New users looking at something like A + +B are pretty certain to be
> confused because they think they know what + means, but they're wrong.

In my experience new users don't realize that unary + is defined for arrays. Use of unary + with non-literal numbers is exotic enough that new users seeing "something like A + +B" will not assume that they know what it means.

[snip]

> * +A has different precedence than the usual transpose operator. (But I
> can't think of a case where that would make a difference now.)

Maybe you can't because it doesn't? :-)

> I would be willing to accept a .T that just threw an exception if ndim
> were > 2.

Aha! Let's start with an error unless ndim == 2. It is always easier to add good features than to remove bad ones. |
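A matrixutils module along these lines might look like the following sketch (the module and its contents are hypothetical; only two wrappers are shown):

    import numpy

    def zeros(shape, dtype=float):
        return numpy.asmatrix(numpy.zeros(shape, dtype))

    def ones(shape, dtype=float):
        return numpy.asmatrix(numpy.ones(shape, dtype))

    # Usage: import matrixutils as M; M.ones((2, 2)) is then a 2x2 matrix.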
From: Tim H. <tim...@co...> - 2006-07-07 06:04:09
|
Sasha wrote:
> On 7/6/06, Bill Baxter <wb...@gm...> wrote:
> > ...
> > Yep, like Tim said. The usage is, say, N sets of basis vectors. Each set
> > of basis vectors is a matrix.
>
> This brings up a feature that I really miss from numpy: an ability to do
>
>     array([f(x) for x in a])
>
> without python overhead.

Please note that there is now a fromiter function, so that much of the overhead of the above can be removed by using:

    numpy.fromiter((f(x) for x in a), float)

This won't generate an intermediate list or use significantly extra storage. I doubt it's a full replacement for adverbs as you've described them below, though.

-tim |
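For reference, here is the fromiter spelling next to the list-comprehension version it replaces (f is just a stand-in example function, not anything from the thread):

    import numpy

    a = numpy.linspace(0.0, 1.0, 5)
    f = lambda x: x * x + 1.0

    b1 = numpy.array([f(x) for x in a])             # builds a temporary list
    b2 = numpy.fromiter((f(x) for x in a), float)   # consumes the generator directly
    assert numpy.allclose(b1, b2)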
From: Gary R. <gr...@bi...> - 2006-07-07 14:45:43
|
Sasha wrote:
> On 7/6/06, Bill Baxter <wb...@gm...> wrote:
> > ...
> > Yep, like Tim said. The usage is, say, N sets of basis vectors. Each set
> > of basis vectors is a matrix.
>
> This brings up a feature that I really miss from numpy: an ability to do
>
>     array([f(x) for x in a])
>
> without python overhead.

I'd find this really useful too. I'm doing lots of this in my recent code:

    array([f(x,y) for x in a for y in b]).reshape(xxx)

Gary R. |
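The fromiter trick covers the two-loop case as well, assuming the intended reshape target (the "xxx" above) is (len(a), len(b)); f is again a stand-in function:

    import numpy

    a = numpy.arange(3.0)
    b = numpy.arange(4.0)
    f = lambda x, y: x + 10.0 * y

    r = numpy.fromiter((f(x, y) for x in a for y in b),
                       float).reshape(len(a), len(b))
    assert r.shape == (3, 4)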
From: Sebastian H. <ha...@ms...> - 2006-07-08 01:37:34
|
Sasha wrote:
> On 7/6/06, Bill Baxter <wb...@gm...> wrote:
> > ...
> > Yep, like Tim said. The usage is, say, N sets of basis vectors. Each set
> > of basis vectors is a matrix.
>
> This brings up a feature that I really miss from numpy: an ability to do
>
>     array([f(x) for x in a])
>
> without python overhead. APL-like languages have a notion of "adverb"
> - a higher-level operator that maps a function to a function. Numpy
> has some adverbs implemented as ....
<snip>

Hi, I was just reading through this thread and noticed that the above might best be done with (a little extended version of) the numexpr module. Am I right? Just wanted to post this comment about a package I'm really looking forward to using once I convert from numarray. Thanks for numpy!!

Sebastian Haase
UCSF |
From: Bill B. <wb...@gm...> - 2006-07-07 06:17:38
|
On 7/7/06, Robert Kern <rob...@gm...> wrote:
> Bill Baxter wrote:
> > I am also curious, given the number of times I've heard this nebulous
> > argument of "there are lots of kinds of numerical computing that don't
> > involve linear algebra", that no one ever seems to name any of these
> > "lots of kinds". Statistics, maybe? But you can find lots of linear
> > algebra in statistics.
>
> That's because I'm not waving my hands at general fields of application.
> I'm talking about how people actually use array objects on a line-by-line
> basis. If I represent a dataset as an array and fit a nonlinear function
> to that dataset, am I using linear algebra at some level? Sure! Does
> having a .T attribute on that array help me at all? No. Arguing about how
> fundamental linear algebra is to numerical endeavors is entirely beside
> the point.

Ok. If line-by-line usage is what everyone really means, then I'll get off the linear algebra soapbox, but that's not what it sounded like to me.

So, if you want to talk line-by-line, I really can't talk about much besides my own code. But I just grepped through it, and out of 2445 non-empty lines of code:

    927 lines contain '='
    390 lines contain a '['
     75 lines contain matrix, asmatrix, or mat
==>  47 lines contain a '.T' or '.transpose' of some sort  <==
     33 lines contain array, asarray, or asanyarray
     24 lines contain 'rand(' --- I use it for generating bogus test data a lot
     17 lines contain 'newaxis' or 'NewAxis'
     16 lines contain 'zeros('
     13 lines contain 'dot('
     12 lines contain 'empty('
      8 lines contain 'ones('
      7 lines contain 'inv('

I'm pretty new to numpy, so that's all the code I have right now. I'm sure I've written many more lines of emails about numpy than I have lines of actual numpy code. :-/ But from that, I can say that -- at least in my code -- transpose is pretty common. If someone can point me to some larger codebases written in numpy or numeric, I'd be happy to do a similar analysis of those.

> I'm not saying that people who do use arrays for linear algebra are rare
> or unimportant. It's that syntactical convenience for one set of
> conventional ways to use an array object, by itself, is not a good enough
> reason to add stuff to the core array object.

I wish I had a way to magically find out the distribution of array dimensions used by all numpy and numeric code out there. My guess is it would be something like 1-d: 50%, 2-d: 30%, 3-d: 10%, everything else: 10%. I can't think of a good way to even get an estimate on that. But in any event, I'm positive ndim == 2 is a significant percentage of all usages. It seems like the opponents of this idea are suggesting the distribution is more flat than that. But whatever the distribution is, it has to have a fairly light tail, since memory usage is exponential in ndim. If ndim == 20, then it takes 8 megabytes just to store the smallest possible non-degenerate array of float64s (i.e. a 2x2x2x2x...).

It seems crazy to even be arguing this. Transposing is not some specialized esoteric operation. It's important enough that R and S give it a one-letter function, and Matlab, Scilab, and K all give it a single-character operator. [*] Whoever designed the numpy.matrix class also thought it was worthy of a shortcut, and I think came up with a pretty good syntax for it. And the people who invented math itself decided it was worth assigning a one-character superscript to it. So I think there's a clear argument for having a .T attribute. But ok, let's say you're right, and a lot of people won't use it. Fine. IT WILL DO THEM ABSOLUTELY NO HARM. They don't have to use it if they don't like it! Just ignore it. Unlike a t() function, .T doesn't pollute any namespace users can define symbols in, so you really can just ignore it if you're not interested in using it. It won't get in your way.

For the argument that ndarray should be pure like the driven snow, just a raw container for n-dimensional data, I think that's what the basearray thing that goes into Python itself should be. ndarray is part of numpy, and numpy is for numerical computing.

Regards,
--Bill

[*] Full disclosure: I did find two counter-examples -- Maple and Mathematica. Maple has only a transpose() function and Mathematica has only Transpose[] (but you can use [esc]tr[esc] as a shortcut). However, both of those packages are primarily known for their _symbolic_ math capabilities, not their number crunching, so they are less similar to numpy than R, S, K, Matlab and Scilab in that regard. |
From: Robert K. <rob...@gm...> - 2006-07-06 18:55:07
|
Travis Oliphant wrote:
> Bill Baxter wrote:
> > So in short my proposal is to:
> > -- make a.T a property of array that returns a.swapaxes(-2,-1),
> > -- make a.H a property of array that returns a.conjugate().swapaxes(-2,-1)
> > and maybe
> > -- make a.M a property of array that returns numpy.asmatrix(a)
>
> I've tentatively implemented all of these suggestions, as well as adding
> the .A attribute to the ndarray (so that all sub-classes and array
> scalars can get back a view as an ndarray).
>
> I did this to make it easier to do matrix-like calculations with or
> without matrices. Matrix-calculation flexibility is still a sore spot
> for many and I think these syntactic-sugar attributes will help long term.
>
> If there are no strong objections, then the recent MATH attribute
> checkin will stay. If there are major objections, then we can back them
> out without too much trouble as well.

Like Sasha, I'm mildly opposed to .T (as a synonym for .transpose()) and much more opposed to the rest (including .T being a synonym for .swapaxes(-2, -1)). It's not often that a proposal carries with it its own slippery-slope argument against itself.

I don't think that just because arrays are often used for linear algebra that linear algebra assumptions should be built in to the core array type.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco |
From: Bill B. <wb...@gm...> - 2006-07-06 14:49:15
|
On 7/6/06, Tim Hochberg <tim...@co...> wrote:
> > -) Being able to distinguish between row and column vectors; I guess
> > this is just not possible with arrays...
>
> Why can't you distinguish between them the same way that the matrix
> class does? Shape [1, N] is a row array, shape [N, 1] is a column array.

Yep, that works. But there are still various annoyances.
- You have to remember to specify extra brackets all the time. Like array([[1,2,3]]) or array([[1],[2],[3]]).
- And a slice of a vector out of a matrix has to be pumped back up to 2-D. If x has ndim==2, then to get a column out of it you have to do x[:,i,None] instead of just x[:,i]. To get a row you need x[j,None] instead of just x[j].

Not horrible, but it feels a little klunky if you're used to something like Matlab. So matrix gets rid of a few annoyances like that ... and replaces them with a few of its own. :-) --bb |
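The slicing behavior being discussed, spelled out (None here is numpy.newaxis):

    import numpy

    x = numpy.arange(12.0).reshape(3, 4)

    assert x[:, 1].shape == (3,)          # 1-d, not a column
    assert x[:, 1, None].shape == (3, 1)  # pumped back up to a 2-d column
    assert x[1].shape == (4,)             # 1-d, not a row
    assert x[1, None].shape == (1, 4)     # 2-d row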
From: Tim H. <tim...@co...> - 2006-07-06 15:45:43
|
Bill Baxter wrote:
> On 7/6/06, Tim Hochberg <tim...@co...> wrote:
> > > -) Being able to distinguish between row and column vectors; I guess
> > > this is just not possible with arrays...
> >
> > Why can't you distinguish between them the same way that the matrix
> > class does? Shape [1, N] is a row array, shape [N, 1] is a column array.
>
> Yep, that works. But there are still various annoyances.
> - You have to remember to specify extra brackets all the time. Like
> array([[1,2,3]]) or array([[1],[2],[3]]).

This one I can't get excited about. If you are actually creating that many constant arrays, just define rowarray and colarray functions that add the appropriate dimensions for you.

> - And a slice of a vector out of a matrix has to be pumped back up to
> 2-D. If x has ndim==2, then to get a column out of it you have to do
> x[:,i,None] instead of just x[:,i]. To get a row you need x[j,None]
> instead of just x[j].

Alternatively x[:,i:i+1], although that's not much better.

> Not horrible, but it feels a little klunky if you're used to something
> like Matlab.

Well, Matlab is geared to matrices. The ndarray object has always been more or less a tensor. I can't help feeling that loading it up with matrix-like methods is just going to lead to confusion and trouble. I would rather work things out so that we can have a pure matrix class and a pure ndarray class coexist in some sensible way. Figuring out how to do that well would have fringe benefits for other stuff (masked arrays, sparse arrays, user-defined arrays of various types).

> So matrix gets rid of a few annoyances like that ... and replaces them
> with a few of its own. :-)

In theory matrix should not be annoying to Matlab users, since its whole purpose is to keep Matlab users happy. I think the big problem with matrix is that none of the developers use it as far as I know, so no one is motivated to clean up the rough edges.

-tim |
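The helper functions Tim suggests could be as simple as the following sketch (rowarray and colarray are hypothetical names, not numpy functions):

    import numpy

    def rowarray(seq, dtype=float):
        return numpy.array(seq, dtype).reshape(1, -1)

    def colarray(seq, dtype=float):
        return numpy.array(seq, dtype).reshape(-1, 1)

    assert rowarray([1, 2, 3]).shape == (1, 3)
    assert colarray([1, 2, 3]).shape == (3, 1)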
From: Tim H. <tim...@co...> - 2006-07-06 18:51:01
|
Sasha wrote:
> On 7/6/06, Bill Baxter <wb...@gm...> wrote:
> > ... I think it's much more common to want to swap just two axes, and the
> > last two seem a logical choice since a) in the default C-ordering they're
> > the closest together in memory and b) they're the axes that are printed
> > contiguously when you say "print A".
>
> It all depends on how you want to interpret a rank-K tensor. You seem
> to advocate a view that it is a (K-2)-rank array of matrices and .T is
> an element-wise transpose operation. Alternatively I can expect that
> it is a matrix of (K-2)-rank arrays and then .T should be
> swapaxes(0,1). Do you have real-life applications of swapaxes(-2,-1)
> for rank > 2?

I do, for what it's worth. At various times I use arrays of shape (n, n, d) (matrices of rank-1 arrays, as you suggest above) and arrays of shape (d, n, n) (vectors of matrices, as Bill proposes). Using swapaxes(-2, -1) would be right only half the time, but the current defaults for transpose are essentially never right for rank > 2. Then again, they are easy to explain.

> > > and swapaxes(-2,-1) is invalid for rank < 2.
> >
> > At least in numpy 0.9.8, it's not invalid, it just doesn't do anything.
>
> That's bad. What sense does it make to swap non-existing axes? Many
> people would expect transpose of a vector to be a matrix. This is the
> case in S+ and R.

So this is essentially turning a row vector into a column vector? Is that right?

> > > My main objection is that a.T is fairly cryptic
> > > - is there any other language that uses an attribute for transpose?
> >
> > Does it matter what other languages do? It's not _that_ cryptic.
>
> If something is clear and natural, chances are it was done before.
> For me prior art is always a useful guide when making a design choice.
> For example, in R, the transpose operation is t(a) and works on rank
> <= 2 only, always returning rank-2. K (an APL-like language) overloads
> unary '+' to do swapaxes(0,1) for rank >= 2 and nothing for lower rank.
> Both the R and K solutions are implementable in Python, with R using 3
> characters and K using 1(!) compared to your two-character ".T" notation.

Overloading '+' sure seems perverse, but maybe that's just me.

> > The standard way to write transpose is with a little T superscript in
> > the upper right. We can't do that with ASCII so the T just appears after
> > the dot. Makes perfect sense to me. I'd vote for an operator if it were
> > possible in python. Something like A^T would be neat, maybe, or matlab's
> > single-quote operator.
>
> Well, you could overload __rpow__ for a singleton T and spell it A**T
> ... (I hope no one will take that proposal seriously). Visually, A.T
> looks more like a subscript rather than a superscript.

No, no no. Overload __rxor__, then you can spell it A^t, A^h, etc. Much better ;-). [Sadly, I almost like that....]

> > > 2. .H : This is an O(n^2) complexity operation returning a copy so
> > > it is not appropriate for an attribute.
> >
> > Not sure how you get O(n^2). It just requires flipping the sign on the
> > imaginary part of each element in the array. So in my book that's O(n).
> > But that does make it more expensive than O(1) transpose, yes.
>
> In my book n is the size of the matrix, as in "n x n matrix", but the
> argument stays with O(n) as well.
>
> I only raised a mild objection against .T, but the slippery slope
> argument makes me dislike it much more. At the very least I would
> like to see a discussion of why a.T is better than t(a) or +a.

Here's a half-baked thought: if the objection to t(A) is that it doesn't mirror the formulae where t appears as a subscript after A, then conceivably __call__ could be defined so that A(x) returns x(A). That's kind of perverse, but it means that A(t), A(h), etc. could all work appropriately for suitably defined singletons. These singletons could either be assembled in some abbreviations namespace or brought in by the programmer using "import transpose as t", etc. The latter works for doing t(a) as well, of course.

-tim |
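For what it's worth, the half-serious A^t spelling can be made to work with a singleton whose __rxor__ does the transpose. This is only a sketch: the class and singleton names are made up, and the two underscore attributes are the standard ways of asking ndarray's own __xor__ to stand aside so the reflected method runs (which one matters depends on the numpy version):

    import numpy

    class _Transpose(object):
        __array_priority__ = 1000.0   # legacy deferral mechanism
        __array_ufunc__ = None        # modern deferral mechanism

        def __rxor__(self, a):
            return numpy.swapaxes(a, -2, -1)

    t = _Transpose()

    A = numpy.arange(6.0).reshape(2, 3)
    assert (A ^ t).shape == (3, 2)
    # The precedence trap: A + B ^ t parses as (A + B) ^ t, not A + (B ^ t).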
From: Alexander B. <ale...@gm...> - 2006-07-06 19:18:47
|
On 7/6/06, Tim Hochberg <tim...@co...> wrote:
> ...
> So this is essentially turning a row vector into a column vector? Is
> that right?

Being a definition, this is neither right nor wrong. It all depends on what you are using it for. If you want to distinguish row and column vectors, you have to use rank-2, but it is convenient to think of rank-1 as an equivalent of either row or column, and of rank-0 as a 1x1 array. If you define transpose as a rank-2-only operation but allow promotion of rank < 2 to rank 2, you end up with the rules of S+. |
From: Alexander B. <ale...@gm...> - 2006-07-06 19:37:35
|
On 7/6/06, Tim Hochberg <tim...@co...> wrote:
> ...
> Overloading '+' sure seems perverse, but maybe that's just me.

The first time I saw it, it seemed perverse to me as well, but it actually makes a lot of sense:

1. It is visually appealing, as '+' makes '|' from '-' and '-' from '|', and looks close enough to 't'.
2. It puts an otherwise useless operator to work.
3. Prefix spelling suggests that it should be swapaxes(0,1) rather than swapaxes(-2,-1), which is the choice made by K.
4. You can't get any shorter than that (at least using a fixed-width font :-).
5. It already does the right thing for rank < 2. |
From: Tim H. <tim...@co...> - 2006-07-06 20:23:27
|
Alexander Belopolsky wrote:
> On 7/6/06, Tim Hochberg <tim...@co...> wrote:
> > ...
> > Overloading '+' sure seems perverse, but maybe that's just me.
>
> The first time I saw it, it seemed perverse to me as well, but it
> actually makes a lot of sense:
>
> 1. It is visually appealing, as '+' makes '|' from '-' and '-' from '|',
> and looks close enough to 't'.

It looks even closer to † (dagger, if that doesn't make it through), which is the symbol used for the hermitian adjoint.

> 2. It puts an otherwise useless operator to work.
> 3. Prefix spelling suggests that it should be swapaxes(0,1) rather
> than swapaxes(-2,-1), which is the choice made by K.
> 4. You can't get any shorter than that (at least using a fixed-width
> font :-).
> 5. It already does the right thing for rank < 2.

Perhaps it's not as perverse as it first appears. Although I still don't have to like it ;-)

-tim |
From: Sasha <nd...@ma...> - 2006-07-06 20:41:48
|
On 7/6/06, Tim Hochberg <tim...@co...> wrote:
> ...
> It looks even closer to † (dagger, if that doesn't make it through), which
> is the symbol used for the hermitian adjoint.

If it pleases the matlab crowd, '+' can be defined to do the hermitian adjoint on the complex type.

> ...
> Perhaps it's not as perverse as it first appears. Although I still don't
> have to like it ;-)

I don't like it either, but I don't like .T even more. These days I hate functionality I cannot google for. Call me selfish, but I already know what unary '+' can do to a higher-rank array, but with .T I will always have to look up which axes it swaps ... |
From: Tim H. <tim...@co...> - 2006-07-06 22:11:25
|
Sasha wrote:
> On 7/6/06, Robert Kern <rob...@gm...> wrote:
> > ...
> > I don't think that just because arrays are often used for linear algebra
> > that linear algebra assumptions should be built in to the core array type.
>
> In addition, transpose is a (rank-2) array or matrix operation and not
> a linear algebra operation. Transpose corresponds to the "adjoint"
> linear algebra operation if you represent vectors as single-column
> matrices and co-vectors as single-row matrices. This is a convenient
> representation followed by much of the relevant literature, but it
> does not allow generalization beyond rank-2. Another useful feature is
> that the inner product can be calculated as the matrix product as long
> as you accept a 1x1 matrix for a scalar. This feature does not work
> beyond rank-2 either, because in order to do a tensor inner product you
> have to be explicit about the axes being collapsed (for example using
> Einstein notation).

At various times, I've thought about how one might do Einstein notation within Python. About the best I could come up with was:

    A.ijk * B.klm

or

    A("ijk") * B("klm")

Neither is spectacular; the first is a cleaner notation, but conceptually messy since it abuses getattr. Both require some intermediate pseudo-object that wraps the array as well as info about the indexing.

> Since ndarray does not distinguish between upper and lower indices, it
> is not possible to distinguish between vectors and co-vectors in any way
> other than using the matrix convention. This makes ndarrays a poor model
> for linear algebra tensors.

My tensor math is rusty, but isn't it possible to represent all one's tensors as either covariant or contravariant and just embed the information about the metric into the product operator? It would seem that the inability to specify lower and upper indices is not truly limiting, but the inability to specify which axes to contract over is a fundamental limitation of sorts. I'm sure I'm partly influenced by my feeling that in practice upper and lower indices (aka contravariant, covariant, and mixed tensors) would be a pain in the neck, but a more capable inner product operator might well be useful if we could come up with the correct syntax.

-tim |
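A toy version of the second spelling is sketched below: a wrapper records an index string, and * contracts over the letters shared by both operands via tensordot. Everything here (the Indexed class, the contraction rule) is an illustrative assumption, not a worked-out proposal; a full version would also have to track the index order of the result.

    import numpy

    class Indexed(object):
        def __init__(self, a, idx):
            self.a, self.idx = a, idx

        def __mul__(self, other):
            # Contract over the index letters common to both operands.
            shared = [c for c in self.idx if c in other.idx]
            ax1 = [self.idx.index(c) for c in shared]
            ax2 = [other.idx.index(c) for c in shared]
            return numpy.tensordot(self.a, other.a, axes=(ax1, ax2))

    A = numpy.ones((2, 3, 4))
    B = numpy.ones((4, 3, 5))

    C = Indexed(A, "ijk") * Indexed(B, "kjl")   # sum over j and k
    assert C.shape == (2, 5)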
From: Bill B. <wb...@gm...> - 2006-07-07 01:56:46
|
Tim Wrote: > That second argument is particularly uncompelling, but I think I agree > that in a vacuum swapaxes(-2,-1) would be a better choice for .T than > reversing the axes. However, we're not in a vacuum and there are several > reasons not to do this. > 1. A.T and A.transpose() should really have the same behavior. There may be a certain economy to that but I don't see why it should necessarily be so. Especially if it's agreed that the behavior .transpose() is not very useful. The use case for .T is primarily to make linear algebra stuff easier. If you're doing n-dim stuff and need something specific, you'll use the more general .transpose(). 2. Changing A.transpose would be one more backwards compatibility issue. Maybe it's a change worth making though, if we are right in saying that the current .transpose() for ndim>2 is hardly ever what you want. 3. Since, as far as I can tell there's not concise way of spelling > A.swapaxes(-2,-1) in terms of A.transpose it would make documenting and > explaining the default case harder. > Huh? A.swapaxes (-2,-1) is pretty concise. Why should it have to have an explanation in terms of A.transpose? Here's the explanation for the documentation: "A.T returns A with the last two axes transposed. It is equivalent to A.swapaxes (-2,-1). For a 2-d array, this is the usual matrix transpose." This just is a non-issue. Sasha wrote: > > more common to want to swap just two axes, and the last two seem a > logical > > choice since a) in the default C-ordering they're the closest together > in > > memory and b) they're the axes that are printed contiguously when you > say > > "print A". > > It all depends on how you want to interpret a rank-K tensor. You seem > to advocate a view that it is a (K-2)-rank array of matrices and .T is > an element-wise transpose operation. Alternatively I can expect that > it is a matrix of (K-2)-rank arrays and then .T should be > swapaxes(0,1). Do you have real-life applications of swapaxes(-2,-1) > for rank > 2? > Yep, like Tim said. The usage is say a N sets of basis vectors. Each set of basis vectors is a matrix. And say I have a different basis associated with each of N points in space. Usually I'll want to print it out organized by basis vector set. I.e. look at the matrix associated with each of the points. So it makes sense to organize it as shape=(N,a,b) so that if I print it I get something that's easy to interpret. If I set it up as shape=(a,b,N) then what's easiest to see in the print output is all N first basis vectors, all N second basis vectors, etc. Also again in a C memory layout, the last two axes are closest in memory, so it's more cache friendly to have the bits that will usually be used together in computations be on the trailing end. In matlab (which is fortran order), I do things the other way, with the N at the end of the shape. (And note that Matlab prints out the first two axes contiguously.) Either way swapaxes(-2,-1) is likely more likely to be what you want than .transpose(). > > and swapaxes(-2,-1) is > > > invalid for rank < 2. > > > > > At least in numpy 0.9.8, it's not invalid, it just doesn't do anything. > > > > > That's bad. What sense does it make to swap non-existing axes? Many > people would expect transpose of a vector to be a matrix. This is the > case in S+ and R. > Well, I would be really happy for .T to return an (N,1) column vector if handed an (N,) 1-d array. But I'm pretty sure that would raise more furuor among the readers of the list than leaving it 1-d. 
> > > My main objection is that a.T is fairly cryptic - is there any other
> > > language that uses an attribute for transpose?
> >
> > Does it matter what other languages do? It's not _that_ cryptic.
>
> If something is clear and natural, chances are it was done before.

The thing is, most other numerical computing languages were designed for
doing numerical computing. They weren't designed originally for writing
general-purpose software, like Python was. So in Matlab, for instance,
transpose is a simple single quote. But that doesn't help us decide what
it should be in numpy. For me prior art is always a useful guide when
making a design choice.

> For example, in R, the transpose operation is t(a) and works on rank
> <= 2 only, always returning rank-2.

I have serious reservations about a function called t(). x, y, z, and t
are probably all in the top 10 variable names in scientific computing.

> K (an APL-like language) overloads unary '+' to do swapaxes(0,1) for
> rank >= 2 and nothing for lower rank.

Hmm. That's kind of interesting, but it seems like an abuse of notation
to me. And precedence might be an issue too: the precedence of unary +
isn't as high as attribute access. Anyway, as far as the meaning of + in
K goes, I'm guessing K's arrays are in Fortran order, so the (0,1) axes
vary the fastest. I couldn't find any documentation for the K language
from a quick search, though.

> Both R and K solutions are implementable in Python, with R using 3
> characters and K using 1(!) compared to your two-character ".T"
> notation. I would suggest that when inventing something new, you
> should consider prior art and explain how your invention is better.
> That's why what other languages do matters. (After all, isn't 'T'
> chosen because "transpose" starts with "t" in the English language?)

Yes, you're right. My main thought was just what I said above: there
probably aren't too many other examples that can really apply in this
case, both because most numerical computing languages are custom-designed
for numerical computing, and also because Python's attributes are also
kind of uncommon among programming languages. So it's worth looking at
other examples, but in the end it has to be something that makes sense
for a numerical computing package written in Python, and there aren't too
many examples of that around.
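
Just so we're comparing concrete things, here's roughly what the K-style
spelling would look like in Python. (The "PlusT" subclass is a made-up
name, purely to illustrate the precedence and readability concerns):

    import numpy

    class PlusT(numpy.ndarray):
        def __pos__(self):
            # K-style: unary '+' transposes (here: swaps the last two axes).
            return self.swapaxes(-2, -1) if self.ndim > 1 else self

    A = numpy.arange(6).reshape(2, 3).view(PlusT)
    print((+A).shape)   # (3, 2)
    # The readability trap: "A + +B" is legal and means A plus B-transposed,
    # and unary + binds less tightly than attribute access does.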
> > You could write a * b.transpose(1,0) right now and still not know
> > whether it was matrix or element-wise multiplication.
>
> Why would anyone do that if b was a matrix?

Maybe because, like you, they think "that a.T is fairly cryptic".

> > But probably a better solution would be to have matrix versions of
> > these in the library as an optional module to import so people could,
> > say, import them as M and use M.ones(2,2).
>
> This is the solution used by ma, which is another argument for it.

Yeah, I'm starting to think that's better than slapping an M attribute on
arrays, too. Is it hard to write a module like that? (See the sketch
further down.)

> I only raised a mild objection against .T, but the slippery slope
> argument makes me dislike it much more. At the very least I would
> like to see a discussion of why a.T is better than t(a) or +a.

* A.T puts the T on the proper side of A, so in that sense it looks more
like the standard math notation.
* A.T has precedence that roughly matches the standard math notation.
* t(A) uses an impossibly short function name that's likely to conflict
with local variable names. To avoid the conflict, people will just end up
using it as numpy.t(A), at which point its value as a shortcut for
transpose is nullified. Or they'll have to do a mini-import within
specific functions ("from numpy import t") to localize the namespace
pollution. But at that point they might as well just say
"t = numpy.transpose".
* t(A) puts the transpose operator on the wrong side of A.
* +A puts the transpose operator on the wrong side of A also.
* +A implies addition. The general rule with operator overloading is
that the overload should have the same general meaning as the original
operator. So overloading * for matrix multiplication makes sense;
overloading & for it would be a bad idea. New users looking at something
like A + +B are pretty certain to be confused, because they think they
know what + means, but they're wrong. If you see A + B.T, you either know
what it means or you know immediately that you don't know what it means
and you go look it up.
* +A has different precedence than the usual transpose operator. (But I
can't think of a case where that would make a difference now.)

Tim Hochberg wrote:

> > Well, you could overload __rpow__ for a singleton T and spell it A**T
> > ... (I hope no one will take that proposal seriously). Visually, A.T
> > looks more like a subscript rather than a superscript.
>
> No, no, no. Overload __rxor__, then you can spell it A^t, A^h, etc. Much
> better ;-). [Sadly, I almost like that....]

Ouch! No way! It's got even worse precedence problems than the +A
proposal. How about A + B^t? And you still have to introduce 'h' and 't'
into the global namespace for it to work.

> Here's a half-baked thought: if the objection to t(A) is that it
> doesn't mirror the formulae where t appears as a subscript after A,
> conceivably __call__ could be defined so that A(x) returns x(A). That's
> kind of perverse, but it means that A(t), A(h), etc. could all work
> appropriately for suitably defined singletons. These singletons could
> either be assembled in some abbreviations namespace or brought in by
> the programmer using "import transpose as t", etc. The latter works for
> doing t(a) as well, of course.

Same problem with the need for a global t. And it is kind of perverse,
besides.

Robert Kern wrote:

> Like Sasha, I'm mildly opposed to .T (as a synonym for .transpose())
> and much more opposed to the rest (including .T being a synonym for
> .swapaxes(-2,-1)). It's not often that a proposal carries with it its
> own slippery-slope argument against itself.

The slippery slope argument only applies to the .M, not the .T or .H.
And I think if there's a matrixutils module with redefinitions of ones
and zeros etc., and if other functions are all truly fixed to preserve
matrix when matrix is passed in, then I agree, there's not so much need
for .M.
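
Writing that module really does look trivial -- something along these
lines ("matrixutils" here is hypothetical, not an existing numpy module):

    # matrixutils.py -- wrap the array constructors to hand back matrices.
    import numpy

    def ones(shape, dtype=float):
        return numpy.asmatrix(numpy.ones(shape, dtype))

    def zeros(shape, dtype=float):
        return numpy.asmatrix(numpy.zeros(shape, dtype))

    def eye(n, dtype=float):
        return numpy.asmatrix(numpy.eye(n, dtype=dtype))

Then "import matrixutils as M" gives you M.ones((2,2)) as a matrix, with
no new attributes on ndarray at all.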
> I don't think that just because arrays are often used for linear
> algebra that linear algebra assumptions should be built into the core
> array type.

It's not just that "arrays can be used for linear algebra". It's that
linear algebra is the single most popular kind of numerical computing in
the world! It's the foundation for countless fields. What you're saying
is like "grocery stores shouldn't devote so much shelf space to food,
because food is just one of the products people buy", or "this mailing
list shouldn't be conducted in English, because English is just one of
the languages people can speak here", or "I don't think my keyboard
should devote so much space to the A-Z keys, because there are so many
characters in the Unicode character set that could be there instead", or,
to quote from a particular comedy troupe:

"Ah, how about Cheddar?"
"Well, we don't get much call for it around here, sir."
"Not much ca- It's the single most popular cheese in the world!"
"Not round here, sir."

Linear algebra is pretty much the 'cheddar' of the numerical computing
world. But it's more than that: it's like the yeast of the beer world.
Pretty much everything starts with it as a base. It makes sense to make
it as convenient as possible to do with numpy, even if it is a "special
case". I wish I could think of some sort of statistic or google search I
could cite to back this claim up, but as far as my academic background
from high school through Ph.D. goes, linear algebra is a mighty big deal,
not merely an also-ran in the world of math or numerical computing.

Sasha wrote:

> In addition, transpose is a (rank-2) array or matrix operation and not
> a linear algebra operation. Transpose corresponds to the "adjoint"
> linear algebra operation if you represent vectors as single-column
> matrices and co-vectors as single-row matrices. This is a convenient
> representation followed by much of the relevant literature, but it
> does not allow generalization beyond rank-2.

I would be willing to accept a .T that just threw an exception if ndim
were > 2. That's what Matlab does with its transpose operator. I don't
like that behavior myself -- it seems wasteful when it could just have
some well-defined behavior that would let it be useful at least some of
the time on N-d arrays.

> I don't like it either, but I don't like .T even more. These days I
> hate functionality I cannot google for. Call me selfish, but I
> already know what unary '+' can do to a higher rank array, but with .T
> I will always have to look up which axes it swaps ...

I think '.T' is more likely to be searchable than '+'. And when you say
you already know what unary + can do, you mean because you've used K?
That's not much use to the typical user, who also thinks they know what a
unary + does, but they'd be wrong in this case.

So, in summary, I vote to:
- Keep the .T and the .H on array
- Get rid of .M
- Instead implement a matrix helper module that could be imported as M,
allowing M.ones(...) etc.

And also:
- Be diligent about fixing any errors from matrix users along the lines
of "numpy.foo returns an array when given a matrix" (Travis has been
good about this -- but we need to keep it up.)

Part of the motivation for the .M attribute was just as a band-aid on
the problem of matrices getting turned into arrays. Having .M means you
can just slap a .M on the end of any result you aren't sure about. It's
better (but harder) to fix the upstream problem of functions not
preserving subtypes.
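
To make the first summary point concrete, here's a rough sketch of what
the proposed properties would do. (This is just a Python subclass for
illustration -- the real change would live in the ndarray type itself,
and "Arr" is a made-up name):

    import numpy

    class Arr(numpy.ndarray):
        @property
        def T(self):
            # Swap the last two axes; leave 0-d and 1-d arrays alone.
            return self.swapaxes(-2, -1) if self.ndim > 1 else self
        @property
        def H(self):
            # Conjugate transpose: equivalent to conjugate().swapaxes(-2,-1).
            return self.T.conjugate()

    A = (numpy.arange(4).reshape(2, 2) + 1j).view(Arr)
    print(A.H.shape)   # (2, 2): each element conjugated, axes swapped
|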
From: Robert K. <rob...@gm...> - 2006-07-07 02:31:15
|
Bill Baxter wrote:

> Robert Kern wrote:
>
> > Like Sasha, I'm mildly opposed to .T (as a synonym for .transpose())
> > and much more opposed to the rest (including .T being a synonym for
> > .swapaxes(-2,-1)). It's not often that a proposal carries with it its
> > own slippery-slope argument against itself.
>
> The slippery slope argument only applies to the .M, not the .T or .H.

No, it was the "Let's have a .T attribute. And if we're going to do that,
then we should also do this. And this. And this."

> > I don't think that just because arrays are often used for linear
> > algebra that linear algebra assumptions should be built into the core
> > array type.
>
> It's not just that "arrays can be used for linear algebra". It's that
> linear algebra is the single most popular kind of numerical computing
> in the world! It's the foundation for countless fields. What you're
> saying is like "grocery stores shouldn't devote so much shelf space to
> food, because food is just one of the products people buy", or [etc.]

I'm sorry, but the argument-by-inappropriate-analogy is not convincing.
Just because linear algebra is "the base" for a lot of numerical
computing does not mean that everyone is using numpy arrays for linear
algebra all the time. Much less does it mean that all of those
conventions you've devised should be shoved into the core array type. I
hold a higher standard for the design of the core array type than I do
for the stuff around it. "It's convenient for what I do" just doesn't
rise to that level. There has to be more of an argument for it.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
|