From: Robert K. <rob...@gm...> - 2006-07-07 04:23:26
|
Bill Baxter wrote:
> On 7/7/06, *Robert Kern* <rob...@gm... <mailto:rob...@gm...>> wrote:
>
>     Bill Baxter wrote:
>     > Robert Kern wrote:
>     [snip]
>     > > I don't think that just because arrays are often used for linear
>     > > algebra that linear algebra assumptions should be built in to the
>     > > core array type.
>
>     > It's not just that "arrays can be used for linear algebra". It's that
>     > linear algebra is the single most popular kind of numerical computing
>     > in the world! It's the foundation for countless fields. What you're
>     > saying is like "grocery stores shouldn't devote so much shelf space
>     > to food, because food is just one of the products people buy", or
>     [etc.]
>
>     I'm sorry, but the argument-by-inappropriate-analogy is not convincing.
>     Just because linear algebra is "the base" for a lot of numerical
>     computing does not mean that everyone is using numpy arrays for linear
>     algebra all the time. Much less does it mean that all of those
>     conventions you've devised should be shoved into the core array type.
>     I hold a higher standard for the design of the core array type than I
>     do for the stuff around it. "It's convenient for what I do" just
>     doesn't rise to that level. There has to be more of an argument for it.
>
> My argument is not that "it's convenient for what I do", it's that "it's
> convenient for what 90% of users want to do". But unfortunately I can't
> think of a good way to back up that claim with any sort of numbers.

[snip]

> I am also curious, given the number of times I've heard this nebulous
> argument of "there are lots of kinds of numerical computing that don't
> involve linear algebra", that no one ever seems to name any of these
> "lots of kinds". Statistics, maybe? But you can find lots of linear
> algebra in statistics.

That's because I'm not waving my hands at general fields of application. I'm
talking about how people actually use array objects on a line-by-line basis.
If I represent a dataset as an array and fit a nonlinear function to that
dataset, am I using linear algebra at some level? Sure! Does having a .T
attribute on that array help me at all? No. Arguing about how fundamental
linear algebra is to numerical endeavors is entirely beside the point.

I'm not saying that people who do use arrays for linear algebra are rare or
unimportant. It's that syntactical convenience for one set of conventional
ways to use an array object, by itself, is not a good enough reason to add
stuff to the core array object.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
  -- Umberto Eco
|
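A concrete reading of the fit example Robert sketches, purely illustrative
(the model and data below are made up, and scipy's curve_fit is just one of
several ways to do such a fit): a dataset lives in plain arrays, a nonlinear
function is fit to it, and no .T appears anywhere.

    import numpy as np
    from scipy.optimize import curve_fit

    def model(x, a, b):
        # Made-up two-parameter model, for illustration only
        return a * np.exp(-b * x)

    x = np.linspace(0.0, 4.0, 50)
    y = model(x, 2.5, 1.3) + 0.05 * np.random.randn(x.size)  # noisy fake data

    params, cov = curve_fit(model, x, y)  # nonlinear least squares on arrays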
From: Tim H. <tim...@co...> - 2006-07-07 07:25:09
|
Bill Baxter wrote:
> On 7/7/06, *Robert Kern* <rob...@gm... <mailto:rob...@gm...>> wrote:
>
>     Bill Baxter wrote:
>     > I am also curious, given the number of times I've heard this nebulous
>     > argument of "there are lots of kinds of numerical computing that
>     > don't involve linear algebra", that no one ever seems to name any of
>     > these "lots of kinds". Statistics, maybe? But you can find lots of
>     > linear algebra in statistics.
>
>     That's because I'm not waving my hands at general fields of
>     application. I'm talking about how people actually use array objects
>     on a line-by-line basis. If I represent a dataset as an array and fit
>     a nonlinear function to that dataset, am I using linear algebra at
>     some level? Sure! Does having a .T attribute on that array help me at
>     all? No. Arguing about how fundamental linear algebra is to numerical
>     endeavors is entirely beside the point.
>
> Ok. If line-by-line usage is what everyone really means, then I'll get off
> the linear algebra soap box, but that's not what it sounded like to me.
>
> So, if you want to talk line-by-line, I really can't talk about much
> besides my own code. But I just grepped through it, and out of 2445
> non-empty lines of code:
>
>      927 lines contain '='
>      390 lines contain a '['
>       75 lines contain matrix, asmatrix, or mat
>  ==>  47 lines contain a '.T' or '.transpose' of some sort. <==
>       33 lines contain array, asarray, or asanyarray
>       24 lines contain 'rand(' --- I use it for generating bogus test data a lot
>       17 lines contain 'newaxis' or 'NewAxis'
>       16 lines contain 'zeros('
>       13 lines contain 'dot('
>       12 lines contain 'empty('
>        8 lines contain 'ones('
>        7 lines contain 'inv('

In my main project there are about 26 KLOC (including blank lines), 700 or so
of which use numeric (I prefix everything with np. so it's easy to count). Of
those lines, 29 use transpose, and of those 29 lines at most 9 could use a T
attribute. It's probably far less than that, since I didn't check the
dimensionality of the arrays involved. Somewhere between 0 and 5 seems
likely.

> I'm pretty new to numpy, so that's all the code I've got right now. I'm
> sure I've written many more lines of emails about numpy than I have lines
> of actual numpy code. :-/
>
> But from that, I can say that -- at least in my code -- transpose is
> pretty common. If someone can point me to some larger codebases written in
> numpy or numeric, I'd be happy to do a similar analysis of those.
>
>     I'm not saying that people who do use arrays for linear algebra are
>     rare or unimportant. It's that syntactical convenience for one set of
>     conventional ways to use an array object, by itself, is not a good
>     enough reason to add stuff to the core array object.
>
> I wish I had a way to magically find out the distribution of array
> dimensions used by all numpy and numeric code out there. My guess is it
> would be something like 1-d: 50%, 2-d: 30%, 3-d: 10%, everything else:
> 10%. I can't think of a good way to even get an estimate on that. But in
> any event, I'm positive ndims==2 is a significant percentage of all
> usages. It seems like the opponents to this idea are suggesting the
> distribution is more flat than that. But whatever the distribution is, it
> has to have a fairly light tail, since memory usage is exponential in
> ndim. If ndim == 20, then it takes 8 megabytes just to store the smallest
> possible non-degenerate array of float64s (i.e. a 2x2x2x2x...).

I would guess that it falls off fast after n=3, but that's just a guess.
Personally, the majority of my code deals in 3D arrays (2x2xN and 4x4xN for
the most part). These are arrays of vectors holding scattering data at N
different frequency or time points. The 2D arrays that I do use are for
rendering images (the actual rendering is done in C, since Python wasn't
fast enough and numpy wasn't really suitable for it). So, you see that for
me at least, a T attribute is complete cruft: useless for the 3D arrays, not
needed for the 2D arrays, and again useless for the 1D arrays. I suspect
that in general the image processing types, who use a lot of 2D arrays, are
probably not heavy users of transpose, but I'm not certain of that.

> It seems crazy to even be arguing this. Transposing is not some
> specialized esoteric operation. It's important enough that R and S give it
> a one-letter function, and Matlab, Scilab, and K all give it a
> single-character operator. [*] Whoever designed the numpy.matrix class
> also thought it was worthy of a shortcut, and I think came up with a
> pretty good syntax for it. And the people who invented math itself decided
> it was worth assigning a one-character exponent to it.
>
> So I think there's a clear argument for having a .T attribute. But ok,
> let's say you're right, and a lot of people won't use it. Fine. IT WILL DO
> THEM ABSOLUTELY NO HARM. They don't have to use it if they don't like it!
> Just ignore it. Unlike a t() function, .T doesn't pollute any namespace
> users can define symbols in, so you really can just ignore it if you're
> not interested in using it. It won't get in your way.

This is a completely bogus argument. All features cost -- good and ill
alike. There's implementation cost and maintenance cost, both likely small
in this case, but not zero. There are cognitive costs associated with trying
to hold all of the various numpy methods, attributes, and functions in one's
head at once. There are pedagogical costs in trying to explain how things
fit together. There are community costs in that people who are allegedly
coding with core numpy end up using mutually incomprehensible dialects.
TANSTAAFL. The ndarray object has far too many methods and attributes
already, IMO, and you have not made a case that I find convincing that this
is important enough to further cruftify it.

> For the argument that ndarray should be pure like the driven snow, just a
> raw container for n-dimensional data,

Did anyone make that argument? No? I didn't think so.

> I think that's what the basearray thing that goes into Python itself
> should be. ndarray is part of numpy and numpy is for numerical computing.

And?

Regards,

-tim

> Regards,
> --Bill
>
> [*] Full disclosure: I did find two counter-examples -- Maple and
> Mathematica. Maple has only a transpose() function and Mathematica has
> only Transpose[] (but you can use [esc]tr[esc] as a shortcut). However,
> both of those packages are primarily known for their _symbolic_ math
> capabilities, not their number crunching, so they are less similar to
> numpy than R, S, K, Matlab and Scilab in that regard.
|
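As a quick check on the arithmetic in Bill's point above: the smallest
non-degenerate 20-dimensional array has 2**20 elements, and at 8 bytes per
float64 that is indeed 8 MiB. A minimal sketch, assuming numpy:

    import numpy as np

    a = np.zeros((2,) * 20)  # shape (2, 2, ..., 2): 2**20 = 1,048,576 elements
    print(a.nbytes)          # 8388608 bytes, i.e. 8 MiB of float64s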
From: Bill B. <wb...@gm...> - 2006-07-07 12:31:08
|
I think the thread to this point can be pretty much summarized by:

    while True:
        Bill: "2D transpose is common, so it should have a nice syntax"
        Tim, Robert, Sasha, and Ed: "No, it's not."

Very well. I think it may be a self-fulfilling prophecy, though. I.e. if
matrix operations are cumbersome to use, then -- surprise, surprise -- the
large user base for matrix-like operations never materializes. Potential
converts just give numpy a pass and go to Octave or Scilab, or stick with
Matlab, R or S instead.

Why all the fuss about the .T? Because any changes to functions (like making
ones() return a matrix) can easily be worked around on the user side, as has
been pointed out. But as far as I know -- do correct me if I'm wrong --
there's no good way for a user to add an attribute to an existing class.
After switching from matrices back to arrays, .T was the only thing I really
missed from numpy.matrix.

I would be all for a matrix class that was on equal footing with array and
as easy to use as matrices in Matlab. But my experience using numpy.matrix
was far from that, and, given the lack of enthusiasm for matrices around
here, that seems unlikely to change. However, I'm anxious to see what Ed has
up his sleeve in the other thread.
|
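Bill's parenthetical is essentially right for numpy: ndarray is implemented
in C, so an attribute cannot be patched onto the class from Python. The
nearest user-side workaround is a subclass. The sketch below is hypothetical
(TArray is a made-up name) and shows both the trick and its weakness:

    import numpy as np

    class TArray(np.ndarray):
        """Hypothetical subclass adding a .T shortcut via a property."""
        @property
        def T(self):
            return self.transpose()

    a = np.arange(6).reshape(2, 3).view(TArray)  # view() re-types, no copy
    print(a.T.shape)                             # (3, 2)

The catch is that arrays created by the ordinary constructors (zeros, ones,
etc.) come back as plain ndarrays, so each one has to be re-wrapped with
view() -- which is exactly why there is no *good* way.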
From: Robert H. <rhe...@ma...> - 2006-07-07 13:57:20
|
On Jul 6, 2006, at 2:54 PM, Robert Kern wrote:
> I don't think that just because arrays are often used for linear algebra
> that linear algebra assumptions should be built in to the core array type.

True. This argues against the .M, .A, and .H attributes. However, I use
transpose often when not dealing with linear algebra, in particular when
reading in data and putting various columns into variables. Also,
occasionally in plotting (which expects things in 'backward' order relative
to x-y space), and in communicating between Fortran programs (which
typically use 'forward' order (x, y, z)) and numpy (backward -- (z, x, y)).

I am very much in favor of .T, but it should be a full .transpose(), not
just a swap of the last two axes. I don't care so much for the others.

+1 for .T == .transpose()

-Rob
|
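A small sketch of the distinction Rob is drawing, assuming numpy:
transpose() with no arguments reverses the whole axis order, which is not
the same thing as swapping only the last two axes once ndim > 2.

    import numpy as np

    a = np.empty((2, 3, 4))
    print(a.transpose().shape)       # (4, 3, 2): full transpose, all axes reversed
    print(a.swapaxes(-2, -1).shape)  # (2, 4, 3): only the last two axes swap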
From: George N. <gn...@go...> - 2006-07-07 14:26:43
|
On 07/07/06, Robert Hetland <rhe...@ma...> wrote:
[snip]
> However, I use transpose often when not dealing with linear algebra, in
> particular when reading in data and putting various columns into
> variables. Also, occasionally in plotting (which expects things in
> 'backward' order relative to x-y space), and in communicating between
> Fortran programs (which typically use 'forward' order (x, y, z)) and
> numpy (backward -- (z, x, y)).

This is my usage as well. Also, my primitive knowledge of numpy requires use
of the transpose when iterating over the indexes returned by where().
Moreover, I think the notation .T is perfectly reasonable. So I agree with:

> I am very much in favor of .T, but it should be a full .transpose(), not
> just a swap of the last two axes. I don't care so much for the others.
>
> +1 for .T == .transpose()

George Nurser.
|
From: David M. C. <co...@ph...> - 2006-07-07 20:22:26
|
On Fri, 7 Jul 2006 15:26:41 +0100
"George Nurser" <gn...@go...> wrote:

> On 07/07/06, Robert Hetland <rhe...@ma...> wrote:
> [snip]
> > However, I use transpose often when not dealing with linear algebra, in
> > particular when reading in data and putting various columns into
> > variables. Also, occasionally in plotting (which expects things in
> > 'backward' order relative to x-y space), and in communicating between
> > Fortran programs (which typically use 'forward' order (x, y, z)) and
> > numpy (backward -- (z, x, y)).
>
> This is my usage as well. Also, my primitive knowledge of numpy requires
> use of the transpose when iterating over the indexes returned by where().
> Moreover, I think the notation .T is perfectly reasonable. So I agree
> with:

Same.

> > I am very much in favor of .T, but it should be a full .transpose(),
> > not just a swap of the last two axes. I don't care so much for the
> > others.
>
> > +1 for .T == .transpose()

Another +1 from me. If transpose were a shorter word I wouldn't care :-)

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke              http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...
|
From: Christopher B. <Chr...@no...> - 2006-07-07 18:11:32
|
Robert Kern wrote:
> Just because linear algebra is "the base" for a lot of numerical computing
> does not mean that everyone is using numpy arrays for linear algebra all
> the time. Much less does it mean that all of those conventions you've
> devised should be shoved into the core array type.

I totally agree here. What bugged me most about MATLAB was that it was so
darn matrix/linear-algebra centric. Yes, much of the code I wrote used
linear algebra, but mostly it was a tiny (though critical) part of the
actual code: lots of code to set up a matrix equation, then solve it. The
"solve it" was one line of code. For the rest, I prefer an array approach.

A matrix/linear-algebra centric approach is good for some things, but I
think it should be all or nothing. If you want it, then there should be a
Matrix package that includes the Matrix object AND a matrix version of all
the utility functions, like ones, zeros, etc. So all you would have to do
is:

    from numpy.matrix import *

instead of

    from numpy import *

and you'd get all the same stuff. Most of what would need to be added to the
matrix package would be pretty easy boilerplate code. Then we'd need a bunch
more testing to root out all the operations that returned arrays where they
should return matrices.

If there is no one who wants to do all that work, then we have our answer.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959  voice
7600 Sand Point Way NE   (206) 526-6329  fax
Seattle, WA 98115        (206) 526-6317  main reception

Chr...@no...
|
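A sketch of the boilerplate Chris describes. The wrappers below are
hypothetical (no numpy.matrix package of this form exists); they just show
how little each shadowed constructor would take, reusing the existing
numpy matrix class via asmatrix:

    import numpy as np

    # Hypothetical matrix-returning versions of the array constructors.
    def ones(shape, dtype=float):
        return np.asmatrix(np.ones(shape, dtype))

    def zeros(shape, dtype=float):
        return np.asmatrix(np.zeros(shape, dtype))

    m = ones((2, 3))
    print(type(m))  # <class 'numpy.matrix'>

The tedious part Chris points to is not writing these, but auditing every
operation that silently hands back an array where a matrix is expected.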