From: Robert K. <rob...@gm...> - 2006-07-07 04:23:26
|
Bill Baxter wrote:
> On 7/7/06, *Robert Kern* <rob...@gm... <mailto:rob...@gm...>> wrote:
>
>     Bill Baxter wrote:
>     > Robert Kern wrote:
>     [snip]
>     > > I don't think that just because arrays are often used for linear
>     > > algebra that linear algebra assumptions should be built in to the
>     > > core array type.
>
>     > It's not just that "arrays can be used for linear algebra". It's that
>     > linear algebra is the single most popular kind of numerical computing
>     > in the world! It's the foundation for countless fields. What you're
>     > saying is like "grocery stores shouldn't devote so much shelf space
>     > to food, because food is just one of the products people buy", or
>     [etc.]
>
>     I'm sorry, but the argument-by-inappropriate-analogy is not convincing.
>     Just because linear algebra is "the base" for a lot of numerical
>     computing does not mean that everyone is using numpy arrays for linear
>     algebra all the time. Much less does it mean that all of those
>     conventions you've devised should be shoved into the core array type.
>     I hold a higher standard for the design of the core array type than I
>     do for the stuff around it. "It's convenient for what I do" just
>     doesn't rise to that level. There has to be more of an argument for it.
>
> My argument is not that "it's convenient for what I do", it's that "it's
> convenient for what 90% of users want to do". But unfortunately I can't
> think of a good way to back up that claim with any sort of numbers.

[snip]

> I am also curious, given the number of times I've heard this nebulous
> argument of "there are lots of kinds of numerical computing that don't
> involve linear algebra", that no one ever seems to name any of these
> "lots of kinds". Statistics, maybe? But you can find lots of linear
> algebra in statistics.

That's because I'm not waving my hands at general fields of application. I'm
talking about how people actually use array objects on a line-by-line basis.
If I represent a dataset as an array and fit a nonlinear function to that
dataset, am I using linear algebra at some level? Sure! Does having a .T
attribute on that array help me at all? No. Arguing about how fundamental
linear algebra is to numerical endeavors is entirely beside the point.

I'm not saying that people who do use arrays for linear algebra are rare or
unimportant. It's that syntactical convenience for one set of conventional
ways to use an array object, by itself, is not a good enough reason to add
stuff to the core array object.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
  -- Umberto Eco
|
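A concrete reading of the fit example Robert sketches, purely illustrative
(the model and data below are made up, and scipy's curve_fit is just one of
several ways to do such a fit): a dataset lives in plain arrays, a nonlinear
function is fit to it, and no .T appears anywhere.

    import numpy as np
    from scipy.optimize import curve_fit

    def model(x, a, b):
        # Made-up two-parameter model, for illustration only
        return a * np.exp(-b * x)

    x = np.linspace(0.0, 4.0, 50)
    y = model(x, 2.5, 1.3) + 0.05 * np.random.randn(x.size)  # noisy fake data

    params, cov = curve_fit(model, x, y)  # nonlinear least squares on arrays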
From: Tim H. <tim...@co...> - 2006-07-07 07:25:09
|
Bill Baxter wrote:
> On 7/7/06, *Robert Kern* <rob...@gm... <mailto:rob...@gm...>> wrote:
>
>     Bill Baxter wrote:
>     > I am also curious, given the number of times I've heard this nebulous
>     > argument of "there are lots of kinds of numerical computing that
>     > don't involve linear algebra", that no one ever seems to name any of
>     > these "lots of kinds". Statistics, maybe? But you can find lots of
>     > linear algebra in statistics.
>
>     That's because I'm not waving my hands at general fields of
>     application. I'm talking about how people actually use array objects
>     on a line-by-line basis. If I represent a dataset as an array and fit
>     a nonlinear function to that dataset, am I using linear algebra at
>     some level? Sure! Does having a .T attribute on that array help me at
>     all? No. Arguing about how fundamental linear algebra is to numerical
>     endeavors is entirely beside the point.
>
> Ok. If line-by-line usage is what everyone really means, then I'll get off
> the linear algebra soap box, but that's not what it sounded like to me.
>
> So, if you want to talk line-by-line, I really can't talk about much
> besides my own code. But I just grepped through it, and out of 2445
> non-empty lines of code:
>
>      927 lines contain '='
>      390 lines contain a '['
>       75 lines contain matrix, asmatrix, or mat
>  ==>  47 lines contain a '.T' or '.transpose' of some sort. <==
>       33 lines contain array, asarray, or asanyarray
>       24 lines contain 'rand(' --- I use it for generating bogus test data a lot
>       17 lines contain 'newaxis' or 'NewAxis'
>       16 lines contain 'zeros('
>       13 lines contain 'dot('
>       12 lines contain 'empty('
>        8 lines contain 'ones('
>        7 lines contain 'inv('

In my main project there are about 26 KLOC (including blank lines), 700 or so
of which use numeric (I prefix everything with np. so it's easy to count). Of
those lines, 29 use transpose, and of those 29 lines at most 9 could use a T
attribute. It's probably far less than that, since I didn't check the
dimensionality of the arrays involved. Somewhere between 0 and 5 seems
likely.

> I'm pretty new to numpy, so that's all the code I've got right now. I'm
> sure I've written many more lines of emails about numpy than I have lines
> of actual numpy code. :-/
>
> But from that, I can say that -- at least in my code -- transpose is
> pretty common. If someone can point me to some larger codebases written in
> numpy or numeric, I'd be happy to do a similar analysis of those.
>
>     I'm not saying that people who do use arrays for linear algebra are
>     rare or unimportant. It's that syntactical convenience for one set of
>     conventional ways to use an array object, by itself, is not a good
>     enough reason to add stuff to the core array object.
>
> I wish I had a way to magically find out the distribution of array
> dimensions used by all numpy and numeric code out there. My guess is it
> would be something like 1-d: 50%, 2-d: 30%, 3-d: 10%, everything else:
> 10%. I can't think of a good way to even get an estimate on that. But in
> any event, I'm positive ndims==2 is a significant percentage of all
> usages. It seems like the opponents to this idea are suggesting the
> distribution is more flat than that. But whatever the distribution is, it
> has to have a fairly light tail, since memory usage is exponential in
> ndim. If ndim == 20, then it takes 8 megabytes just to store the smallest
> possible non-degenerate array of float64s (i.e. a 2x2x2x2x...).

I would guess that it falls off fast after n=3, but that's just a guess.
Personally, the majority of my code deals in 3D arrays (2x2xN and 4x4xN for
the most part). These are arrays of vectors holding scattering data at N
different frequency or time points. The 2D arrays that I do use are for
rendering images (the actual rendering is done in C, since Python wasn't
fast enough and numpy wasn't really suitable for it). So, you see that for
me at least, a T attribute is complete cruft: useless for the 3D arrays, not
needed for the 2D arrays, and again useless for the 1D arrays. I suspect
that in general the image processing types, who use a lot of 2D arrays, are
probably not heavy users of transpose, but I'm not certain of that.

> It seems crazy to even be arguing this. Transposing is not some
> specialized esoteric operation. It's important enough that R and S give it
> a one-letter function, and Matlab, Scilab, and K all give it a
> single-character operator. [*] Whoever designed the numpy.matrix class
> also thought it was worthy of a shortcut, and I think came up with a
> pretty good syntax for it. And the people who invented math itself decided
> it was worth assigning a one-character exponent to it.
>
> So I think there's a clear argument for having a .T attribute. But ok,
> let's say you're right, and a lot of people won't use it. Fine. IT WILL DO
> THEM ABSOLUTELY NO HARM. They don't have to use it if they don't like it!
> Just ignore it. Unlike a t() function, .T doesn't pollute any namespace
> users can define symbols in, so you really can just ignore it if you're
> not interested in using it. It won't get in your way.

This is a completely bogus argument. All features cost -- good and ill
alike. There's implementation cost and maintenance cost, both likely small
in this case, but not zero. There are cognitive costs associated with trying
to hold all of the various numpy methods, attributes, and functions in one's
head at once. There are pedagogical costs in trying to explain how things
fit together. There are community costs in that people who are allegedly
coding with core numpy end up using mutually incomprehensible dialects.
TANSTAAFL. The ndarray object has far too many methods and attributes
already, IMO, and you have not made a case that I find convincing that this
is important enough to further cruftify it.

> For the argument that ndarray should be pure like the driven snow, just a
> raw container for n-dimensional data,

Did anyone make that argument? No? I didn't think so.

> I think that's what the basearray thing that goes into Python itself
> should be. ndarray is part of numpy and numpy is for numerical computing.

And?

Regards,

-tim

> Regards,
> --Bill
>
> [*] Full disclosure: I did find two counter-examples -- Maple and
> Mathematica. Maple has only a transpose() function and Mathematica has
> only Transpose[] (but you can use [esc]tr[esc] as a shortcut). However,
> both of those packages are primarily known for their _symbolic_ math
> capabilities, not their number crunching, so they are less similar to
> numpy than R, S, K, Matlab and Scilab in that regard.
|
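As a quick check on the arithmetic in Bill's point above: the smallest
non-degenerate 20-dimensional array has 2**20 elements, and at 8 bytes per
float64 that is indeed 8 MiB. A minimal sketch, assuming numpy:

    import numpy as np

    a = np.zeros((2,) * 20)  # shape (2, 2, ..., 2): 2**20 = 1,048,576 elements
    print(a.nbytes)          # 8388608 bytes, i.e. 8 MiB of float64s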
From: Bill B. <wb...@gm...> - 2006-07-07 12:31:08
|
I think the thread to this point can be pretty much summarized by:

    while True:
        Bill: "2D transpose is common, so it should have a nice syntax"
        Tim, Robert, Sasha, and Ed: "No, it's not."

Very well. I think it may be a self-fulfilling prophecy, though. I.e. if
matrix operations are cumbersome to use, then -- surprise, surprise -- the
large user base for matrix-like operations never materializes. Potential
converts just give numpy a pass and go to Octave or Scilab, or stick with
Matlab, R or S instead.

Why all the fuss about the .T? Because any changes to functions (like making
ones() return a matrix) can easily be worked around on the user side, as has
been pointed out. But as far as I know -- do correct me if I'm wrong --
there's no good way for a user to add an attribute to an existing class.
After switching from matrices back to arrays, .T was the only thing I really
missed from numpy.matrix.

I would be all for a matrix class that was on equal footing with array and
as easy to use as matrices in Matlab. But my experience using numpy.matrix
was far from that, and, given the lack of enthusiasm for matrices around
here, that seems unlikely to change. However, I'm anxious to see what Ed has
up his sleeve in the other thread.
|
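Bill's parenthetical is essentially right for numpy: ndarray is implemented
in C, so an attribute cannot be patched onto the class from Python. The
nearest user-side workaround is a subclass. The sketch below is hypothetical
(TArray is a made-up name) and shows both the trick and its weakness:

    import numpy as np

    class TArray(np.ndarray):
        """Hypothetical subclass adding a .T shortcut via a property."""
        @property
        def T(self):
            return self.transpose()

    a = np.arange(6).reshape(2, 3).view(TArray)  # view() re-types, no copy
    print(a.T.shape)                             # (3, 2)

The catch is that arrays created by the ordinary constructors (zeros, ones,
etc.) come back as plain ndarrays, so each one has to be re-wrapped with
view() -- which is exactly why there is no *good* way.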
From: Robert H. <rhe...@ma...> - 2006-07-07 13:57:20
|
On Jul 6, 2006, at 2:54 PM, Robert Kern wrote:
> I don't think that just because arrays are often used for linear algebra
> that linear algebra assumptions should be built in to the core array type.

True. This argues against the .M, .A, and .H attributes. However, I use
transpose often when not dealing with linear algebra, in particular when
reading in data and putting various columns into variables. Also,
occasionally in plotting (which expects things in 'backward' order relative
to x-y space), and in communicating between Fortran programs (which
typically use 'forward' order (x, y, z)) and numpy (backward -- (z, x, y)).

I am very much in favor of .T, but it should be a full .transpose(), not
just a swap of the last two axes. I don't care so much for the others.

+1 for .T == .transpose()

-Rob
|
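A small sketch of the distinction Rob is drawing, assuming numpy:
transpose() with no arguments reverses the whole axis order, which is not
the same thing as swapping only the last two axes once ndim > 2.

    import numpy as np

    a = np.empty((2, 3, 4))
    print(a.transpose().shape)       # (4, 3, 2): full transpose, all axes reversed
    print(a.swapaxes(-2, -1).shape)  # (2, 4, 3): only the last two axes swap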
From: George N. <gn...@go...> - 2006-07-07 14:26:43
|
On 07/07/06, Robert Hetland <rhe...@ma...> wrote:
[snip]
> However, I use transpose often when not dealing with linear algebra, in
> particular when reading in data and putting various columns into
> variables. Also, occasionally in plotting (which expects things in
> 'backward' order relative to x-y space), and in communicating between
> Fortran programs (which typically use 'forward' order (x, y, z)) and
> numpy (backward -- (z, x, y)).

This is my usage as well. Also, my primitive knowledge of numpy requires use
of the transpose when iterating over the indexes returned by where().
Moreover, I think the notation .T is perfectly reasonable. So I agree with:

> I am very much in favor of .T, but it should be a full .transpose(), not
> just a swap of the last two axes. I don't care so much for the others.
>
> +1 for .T == .transpose()

George Nurser.
|
From: David M. C. <co...@ph...> - 2006-07-07 20:22:26
|
On Fri, 7 Jul 2006 15:26:41 +0100
"George Nurser" <gn...@go...> wrote:

> On 07/07/06, Robert Hetland <rhe...@ma...> wrote:
> [snip]
> > However, I use transpose often when not dealing with linear algebra, in
> > particular when reading in data and putting various columns into
> > variables. Also, occasionally in plotting (which expects things in
> > 'backward' order relative to x-y space), and in communicating between
> > Fortran programs (which typically use 'forward' order (x, y, z)) and
> > numpy (backward -- (z, x, y)).
>
> This is my usage as well. Also, my primitive knowledge of numpy requires
> use of the transpose when iterating over the indexes returned by where().
> Moreover, I think the notation .T is perfectly reasonable. So I agree
> with:

Same.

> > I am very much in favor of .T, but it should be a full .transpose(),
> > not just a swap of the last two axes. I don't care so much for the
> > others.
>
> > +1 for .T == .transpose()

Another +1 from me. If transpose were a shorter word I wouldn't care :-)

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke              http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...
|
From: Christopher B. <Chr...@no...> - 2006-07-07 18:11:32
|
Robert Kern wrote:
> Just because linear algebra is "the base" for a lot of numerical computing
> does not mean that everyone is using numpy arrays for linear algebra all
> the time. Much less does it mean that all of those conventions you've
> devised should be shoved into the core array type.

I totally agree here. What bugged me most about MATLAB was that it was so
darn matrix/linear-algebra centric. Yes, much of the code I wrote used
linear algebra, but mostly it was a tiny (though critical) part of the
actual code: lots of code to set up a matrix equation, then solve it. The
"solve it" was one line of code. For the rest, I prefer an array approach.

A matrix/linear-algebra centric approach is good for some things, but I
think it should be all or nothing. If you want it, then there should be a
Matrix package that includes the Matrix object AND a matrix version of all
the utility functions, like ones, zeros, etc. So all you would have to do
is:

    from numpy.matrix import *

instead of

    from numpy import *

and you'd get all the same stuff. Most of what would need to be added to the
matrix package would be pretty easy boilerplate code. Then we'd need a bunch
more testing to root out all the operations that returned arrays where they
should return matrices.

If there is no one who wants to do all that work, then we have our answer.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959  voice
7600 Sand Point Way NE   (206) 526-6329  fax
Seattle, WA 98115        (206) 526-6317  main reception

Chr...@no...
|
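A sketch of the boilerplate Chris describes. The wrappers below are
hypothetical (no numpy.matrix package of this form exists); they just show
how little each shadowed constructor would take, reusing the existing
numpy matrix class via asmatrix:

    import numpy as np

    # Hypothetical matrix-returning versions of the array constructors.
    def ones(shape, dtype=float):
        return np.asmatrix(np.ones(shape, dtype))

    def zeros(shape, dtype=float):
        return np.asmatrix(np.zeros(shape, dtype))

    m = ones((2, 3))
    print(type(m))  # <class 'numpy.matrix'>

The tedious part Chris points to is not writing these, but auditing every
operation that silently hands back an array where a matrix is expected.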