From: Keith G. <kwg...@gm...> - 2006-06-21 03:04:27
|
I have a matrix M and a vector (n by 1 matrix) V. I want to form a new matrix that contains the columns of M for which V > 0. One way to do that in Octave is M(:, find(V > 0)). How is it done in numpy? |
From: Bill B. <wb...@gm...> - 2006-06-21 03:33:46
|
I think that one's on the NumPy for Matlab users, no?

http://www.scipy.org/NumPy_for_Matlab_Users

>>> import numpy as num
>>> a = num.arange(10).reshape(2,5)
>>> a
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> v = num.rand(5)
>>> v
array([ 0.10934855, 0.55719644, 0.7044047 , 0.19250088, 0.94636972])
>>> num.where(v>0.5)
(array([1, 2, 4]),)
>>> a[:,num.where(v>0.5)]
array([[[1, 2, 4]],

       [[6, 7, 9]]])

Seems it grows an extra set of brackets for some reason. Squeeze will get rid of them.

>>> a[:,num.where(v>0.5)].squeeze()
array([[1, 2, 4],
       [6, 7, 9]])

Not sure why the squeeze is needed. Maybe there's a better way.

--bb

On 6/21/06, Keith Goodman <kwg...@gm...> wrote:
> I have a matrix M and a vector (n by 1 matrix) V. I want to form a new
> matrix that contains the columns of M for which V > 0.
>
> One way to do that in Octave is M(:, find(V > 0)). How is it done in
> numpy?
|
From: Keith G. <kwg...@gm...> - 2006-06-21 03:49:30
|
On 6/20/06, Bill Baxter <wb...@gm...> wrote:
> I think that one's on the NumPy for Matlab users, no?
>
> http://www.scipy.org/NumPy_for_Matlab_Users
>
> >>> import numpy as num
> >>> a = num.arange(10).reshape(2,5)
> >>> a
> array([[0, 1, 2, 3, 4],
>        [5, 6, 7, 8, 9]])
> >>> v = num.rand(5)
> >>> v
> array([ 0.10934855, 0.55719644, 0.7044047 , 0.19250088, 0.94636972])
> >>> num.where(v>0.5)
> (array([1, 2, 4]),)
> >>> a[:,num.where(v>0.5)]
> array([[[1, 2, 4]],
>
>        [[6, 7, 9]]])
>
> Seems it grows an extra set of brackets for some reason. Squeeze will get
> rid of them.
>
> >>> a[:,num.where(v>0.5)].squeeze()
> array([[1, 2, 4],
>        [6, 7, 9]])
>
> Not sure why the squeeze is needed. Maybe there's a better way.

Thank you. That works for arrays, but not matrices. So do I need to do

asarray(a)[:, where(asarray(v)>0.5)].squeeze()

?
|
From: Erin S. <eri...@gm...> - 2006-06-21 04:10:11
|
On 6/20/06, Bill Baxter <wb...@gm...> wrote:
> I think that one's on the NumPy for Matlab users, no?
>
> http://www.scipy.org/NumPy_for_Matlab_Users
>
> >>> import numpy as num
> >>> a = num.arange(10).reshape(2,5)
> >>> a
> array([[0, 1, 2, 3, 4],
>        [5, 6, 7, 8, 9]])
> >>> v = num.rand(5)
> >>> v
> array([ 0.10934855, 0.55719644, 0.7044047 , 0.19250088, 0.94636972])
> >>> num.where(v>0.5)
> (array([1, 2, 4]),)
> >>> a[:,num.where(v>0.5)]
> array([[[1, 2, 4]],
>
>        [[6, 7, 9]]])
>
> Seems it grows an extra set of brackets for some reason. Squeeze will get
> rid of them.
>
> >>> a[:,num.where(v>0.5)].squeeze()
> array([[1, 2, 4],
>        [6, 7, 9]])
>
> Not sure why the squeeze is needed. Maybe there's a better way.

where returns a tuple of arrays. This can have unexpected results so you need to grab what you want explicitly:

>>> (w,) = num.where(v>0.5)
>>> a[:,w]
array([[1, 2, 4],
       [6, 7, 9]])
|
From: Bill B. <wb...@gm...> - 2006-06-21 04:48:51
|
On 6/21/06, Erin Sheldon <eri...@gm...> wrote:
> On 6/20/06, Bill Baxter <wb...@gm...> wrote:
> > I think that one's on the NumPy for Matlab users, no?
> >
> > http://www.scipy.org/NumPy_for_Matlab_Users
> >
> > >>> import numpy as num
> > >>> a = num.arange(10).reshape(2,5)
> > >>> a
> > array([[0, 1, 2, 3, 4],
> >        [5, 6, 7, 8, 9]])
> > >>> v = num.rand(5)
> > >>> v
> > array([ 0.10934855, 0.55719644, 0.7044047 , 0.19250088, 0.94636972])
> > >>> num.where(v>0.5)
> > (array([1, 2, 4]),)
> > >>> a[:,num.where(v>0.5)]
> > array([[[1, 2, 4]],
> >
> >        [[6, 7, 9]]])
> >
> > Seems it grows an extra set of brackets for some reason. Squeeze will get
> > rid of them.
> >
> > >>> a[:,num.where(v>0.5)].squeeze()
> > array([[1, 2, 4],
> >        [6, 7, 9]])
> >
> > Not sure why the squeeze is needed. Maybe there's a better way.
>
> where returns a tuple of arrays. This can have unexpected results
> so you need to grab what you want explicitly:
>
> >>> (w,) = num.where(v>0.5)
> >>> a[:,w]
> array([[1, 2, 4],
>        [6, 7, 9]])

Ah, yeh, that makes sense. Thanks for the explanation.

So to turn it back into a one-liner you just need:

>>> a[:,num.where(v>0.5)[0]]
array([[1, 2, 4],
       [6, 7, 9]])

I'll put that up on the Matlab->Numpy page.

--bb
|
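A minimal sketch of the point made above, written against current NumPy: `where` called with only a condition returns a tuple holding one index array per dimension, so the tuple has to be unpacked (or indexed with `[0]`) before it is used as a column selector. The values in `v` are copied from the session earlier in the thread so the result is reproducible; the names `idx_tuple`, `cols`, `nested`, and `selected` are illustrative only.

```python
import numpy as np

a = np.arange(10).reshape(2, 5)
# Fixed copy of the random vector printed earlier in the thread.
v = np.array([0.10934855, 0.55719644, 0.7044047, 0.19250088, 0.94636972])

idx_tuple = np.where(v > 0.5)   # tuple with one index array, since v is 1-D
cols = idx_tuple[0]             # the index array itself

# Passing the whole tuple as the column index adds an axis of length 1,
# which is the "extra set of brackets" that squeeze() removed.
nested = a[:, idx_tuple]

# Passing the index array gives the plain 2-D column selection.
selected = a[:, cols]

print(nested.shape)
print(selected)
```

Unpacking with `(w,) = np.where(...)` and indexing with `[0]` are two spellings of the same step.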
From: Simon B. <si...@ar...> - 2006-06-21 05:23:55
|
On Wed, 21 Jun 2006 13:48:48 +0900
"Bill Baxter" <wb...@gm...> wrote:

> >>> a[:,num.where(v>0.5)[0]]
> array([[1, 2, 4],
>        [6, 7, 9]])
>
> I'll put that up on the Matlab->Numpy page.

oh, yuck. What about this:

>>> a[:,num.nonzero(v>0.5)]
array([[0, 1, 3],
       [5, 6, 8]])
>>>

Simon.

--
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com
|
From: Keith G. <kwg...@gm...> - 2006-06-21 14:14:24
|
On 6/20/06, Bill Baxter <wb...@gm...> wrote:
> >>> a[:,num.where(v>0.5)[0]]
> array([[1, 2, 4],
>        [6, 7, 9]])
>
> I'll put that up on the Matlab->Numpy page.

That's a great addition to the Matlab to Numpy page. But it only works if v is a column vector. If v is a row vector, then where(v.A > 0.5)[0] will return all zeros. So for row vectors it should be where(v.A > 0.5)[1]. Or, in general,

where(v.flatten(1).A > 0.5)[1]
|
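The row-versus-column issue can be sketched with plain 2-D arrays standing in for the matrix objects discussed above. This is only one possible workaround, not the approach from the thread: it flattens the vector before testing it, so the orientation no longer matters. The helper name `select_columns` and the sample values are hypothetical.

```python
import numpy as np

M = np.arange(10).reshape(2, 5)
v_col = np.array([0.1, 0.6, 0.7, 0.2, 0.9]).reshape(5, 1)  # n-by-1
v_row = v_col.T                                             # 1-by-n

# For a 1-by-n input, the first array returned by where() holds row
# indices, and every match sits in row 0.
row_part = np.where(v_row > 0.5)[0]

def select_columns(m, v, threshold=0.5):
    """Hypothetical helper: keep the columns of m where v exceeds threshold.

    v may be 1-D, n-by-1, or 1-by-n; it is flattened first.
    """
    flat = np.asarray(v).ravel()
    return m[:, np.flatnonzero(flat > threshold)]

from_col = select_columns(M, v_col)
from_row = select_columns(M, v_row)
print(row_part)
print(from_col)
print(from_row)
```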
From: Bill B. <wb...@gm...> - 2006-06-21 07:17:24
|
On 6/21/06, Simon Burton <si...@ar...> wrote:
> On Wed, 21 Jun 2006 13:48:48 +0900
> "Bill Baxter" <wb...@gm...> wrote:
>
> > >>> a[:,num.where(v>0.5)[0]]
> > array([[1, 2, 4],
> >        [6, 7, 9]])
> >
> > I'll put that up on the Matlab->Numpy page.
>
> oh, yuck. What about this:
>
> >>> a[:,num.nonzero(v>0.5)]
> array([[0, 1, 3],
>        [5, 6, 8]])
> >>>

The nonzero() function seems like kind of an anomaly in and of itself. It doesn't behave like other index-returning numpy functions, or even like the method version, v.nonzero(), which returns the typical tuple of array. So my feeling is ... ew to numpy.nonzero.

--Bill
|
From: Alan G I. <ai...@am...> - 2006-06-21 09:13:59
|
On Wed, 21 Jun 2006, Bill Baxter apparently wrote:
> ew to numpy.nonzero

I agree that having the method and function behave so differently is awkward; this was discussed before on this list. It does allow Simon's nicer solution, however.

I'm not sure why bool arrays cannot be used as indices. The "natural" solution to the original problem seemed to be:

M[:,V>0]

but this is not allowed.

Cheers,
Alan Isaac
|
From: Johannes L. <a.u...@gm...> - 2006-06-21 13:36:25
|
Hi,

> I'm not sure why bool arrays cannot be used as indices.
> The "natural" solution to the original problem seemed to be:
> M[:,V>0]
> but this is not allowed.

I started a thread on this earlier this year. Try searching the archive for "boolean indexing" (if it comes back online somewhen).

Travis had some reason for not implementing this, but unfortunately I do not remember what it was. The corresponding message might still linger on my home PC, which I can access this evening....

Johannes
|
From: Travis O. <oli...@ie...> - 2006-06-21 16:50:34
|
Johannes Loehnert wrote:
> Hi,
>
>> I'm not sure why bool arrays cannot be used as indices.
>> The "natural" solution to the original problem seemed to be:
>> M[:,V>0]
>> but this is not allowed.
>
> I started a thread on this earlier this year. Try searching the archive for
> "boolean indexing" (if it comes back online somewhen).
>
> Travis had some reason for not implementing this, but unfortunately I do not
> remember what it was. The corresponding message might still linger on my home
> PC, which I can access this evening....

I suspect my reason was just not being sure if it could be explained consistently. But after seeing this come up again, I decided it was easy enough to implement.

So, in SVN NumPy, you will be able to do

a[:,V>0]
a[V>0,:]

The V>0 will be replaced with integer arrays as if nonzero(V>0) had been called.

-Travis
|
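The equivalence described above can be checked in current NumPy with a short sketch. It assumes the boolean array is 1-D and has one entry per column (or per row) of the array being indexed; the sample arrays `V` and `W` are made up for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
V = np.array([-1.0, 2.0, 0.0, 3.0])   # one entry per column of a
W = np.array([1.0, -1.0, 5.0])        # one entry per row of a

# Boolean mask on the column axis, and the same selection spelled with
# the integer indices that nonzero() produces.
cols_by_mask = a[:, V > 0]
cols_by_index = a[:, np.nonzero(V > 0)[0]]

# The same comparison on the row axis.
rows_by_mask = a[W > 0, :]
rows_by_index = a[np.nonzero(W > 0)[0], :]

print(cols_by_mask)
print(rows_by_mask)
```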
From: Pau G. <pau...@gm...> - 2006-06-21 17:09:54
|
On 6/21/06, Travis Oliphant <oli...@ie...> wrote:
> Johannes Loehnert wrote:
> > Hi,
> >
> >> I'm not sure why bool arrays cannot be used as indices.
> >> The "natural" solution to the original problem seemed to be:
> >> M[:,V>0]
> >> but this is not allowed.
> >
> > I started a thread on this earlier this year. Try searching the archive for
> > "boolean indexing" (if it comes back online somewhen).
> >
> > Travis had some reason for not implementing this, but unfortunately I do not
> > remember what it was. The corresponding message might still linger on my home
> > PC, which I can access this evening....
>
> I suspect my reason was just not being sure if it could be explained
> consistently. But, after seeing this come up again. I decided it was
> easy enough to implement.
>
> So, in SVN NumPy, you will be able to do
>
> a[:,V>0]
> a[V>0,:]
>
> The V>0 will be replaced with integer arrays as if nonzero(V>0) had been
> called.

does it work for a[<boolean>,<boolean>] ?

what about a[ix_( nonzero(<boolean>), nonzero(<boolean>) )] ?

maybe the <boolean> to nonzero(<boolean>) conversion would be more coherently done by the ix_ function than by the []

pau
|
From: Simon B. <si...@ar...> - 2006-06-22 02:20:28
|
On Wed, 21 Jun 2006 10:50:26 -0600
Travis Oliphant <oli...@ie...> wrote:

> So, in SVN NumPy, you will be able to do
>
> a[:,V>0]
> a[V>0,:]
>
> The V>0 will be replaced with integer arrays as if nonzero(V>0) had been
> called.

OK.
But just for the record, we should note how to do the operation that this used to do, e.g.

>>> a=numpy.array([1,2])
>>> a[[numpy.bool_(1)]]
array([2])
>>>

This could be a way of, say, mapping a large boolean array onto some other values (1 or 2 in the above case).

So, with the new implementation, is it possible to cast the bool array to an integer type without incurring a copy overhead?

And finally, is someone keeping track of the performance of array getitem? It seems that as Travis overloads it more and more it might then slow down in some cases.

I must admit my vision is blurring and head is spinning as numpy goes through these growing pains. I hope it's over soon. Not because I have trouble keeping up (although I do) but it's my matlab/R/numarray entrenched co-workers who cannot be exposed to this unstable development (they will run screaming to the woods).

cheers,

Simon.

--
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com
|
From: Travis O. <oli...@ie...> - 2006-06-22 05:58:55
|
Simon Burton wrote:
> On Wed, 21 Jun 2006 10:50:26 -0600
> Travis Oliphant <oli...@ie...> wrote:
>
>> So, in SVN NumPy, you will be able to do
>>
>> a[:,V>0]
>> a[V>0,:]
>>
>> The V>0 will be replaced with integer arrays as if nonzero(V>0) had been
>> called.
>
> OK.
> But just for the record, we should note how to
> do the operation that this used to do, eg.
>
> >>> a=numpy.array([1,2])
> >>> a[[numpy.bool_(1)]]
> array([2])

This behavior hasn't changed... All that's changed is that what used to raise an error (boolean arrays in a tuple) now works in the same way that boolean arrays worked before.

> So, with the new implementation, is it possible to cast
> the bool array to an integer type without incurring a copy overhead ?

I'm not sure what you mean. What copy overhead? There is still copying going on. The way it's been implemented, the boolean arrays get replaced with integer index arrays under the hood so it is really nearly identical to replacing the boolean array with nonzero(<boolean>).

> And finally, is someone keeping track of the performance
> of array getitem ? It seems that as travis overloads it more and
> more it might then slow down in some cases.

Actually, I'm very conscientious of the overhead of getitem in code that I add. I just today found a memory leak in code that was added that I did not review carefully that was also slowing down all accesses of arrays > 1d that resulted in array scalars. I added an optimization that should speed that up.

But, it would be great if others could watch the speed changes for basic operations.

> I must admit my vision is blurring and head is spining as numpy
> goes through these growing pains

The 1.0 beta release is coming shortly. I would like to see the first beta by the first of July. The final 1.0 release won't occur, though, until after SciPy 2006.

Thanks for your patience. We've been doing a lot of house-cleaning lately to separate the "old but compatible" interface from the "new." This has resulted in some confusion, to be sure.

Please don't hesitate to voice your concerns.

-Travis
|
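On the copying mentioned above, a small sketch of what can be observed in current NumPy: selecting with a boolean array produces a new array, while a basic slice produces a view onto the original data. This only illustrates the visible behaviour, not the internal implementation; the array values are arbitrary.

```python
import numpy as np

a = np.arange(6)
mask = a % 2 == 0       # True at positions 0, 2, 4

picked = a[mask]        # boolean selection: a new array with its own data
picked[0] = 99          # does not change a

view = a[0:3]           # basic slice: shares memory with a
view[0] = -1            # writes through to a

print(a)
print(picked)
print(np.shares_memory(a, picked), np.shares_memory(a, view))
```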
From: Travis O. <oli...@ie...> - 2006-06-21 16:09:57
|
Bill Baxter wrote:
> On 6/21/06, Simon Burton <si...@ar...> wrote:
> > On Wed, 21 Jun 2006 13:48:48 +0900
> > "Bill Baxter" <wb...@gm...> wrote:
> >
> > > >>> a[:,num.where(v>0.5)[0]]
> > > array([[1, 2, 4],
> > >        [6, 7, 9]])
> > >
> > > I'll put that up on the Matlab->Numpy page.
> >
> > oh, yuck. What about this:
> >
> > >>> a[:,num.nonzero(v>0.5)]
> > array([[0, 1, 3],
> >        [5, 6, 8]])
> > >>>
>
> The nonzero() function seems like kind of an anomaly in and of
> itself. It doesn't behave like other index-returning numpy
> functions, or even like the method version, v.nonzero(), which returns
> the typical tuple of array. So my feeling is ... ew to numpy.nonzero.

How about we add the ability so that a[:, <boolean>] gets translated to a[:, nonzero(<boolean>)] ?

-Travis
|
From: Alan G I. <ai...@am...> - 2006-06-21 08:40:55
|
On Tue, 20 Jun 2006, Keith Goodman apparently wrote:
> I have a matrix M and a vector (n by 1 matrix) V. I want to form a new
> matrix that contains the columns of M for which V > 0.
> One way to do that in Octave is M(:, find(V > 0)). How is it done in numpy?

M.transpose()[V>0]

If you want the columns as columns, you can transpose again.

hth,
Alan Isaac
|
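The transpose approach can be sketched as follows with current NumPy. The sketch assumes `V` is a 1-D array with one entry per column of `M`; an n-by-1 `V` would have to be flattened first. The sample values are made up.

```python
import numpy as np

M = np.arange(10).reshape(2, 5)
V = np.array([-1.0, 2.0, 0.5, -3.0, 4.0])   # 1-D, one entry per column of M

# After transposing, the columns of M become rows, so a boolean mask on
# the first axis picks them out.
picked_as_rows = M.transpose()[V > 0]

# Transposing again restores the column orientation.
picked_as_cols = picked_as_rows.transpose()

print(picked_as_rows)
print(picked_as_cols)
```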
From: Travis O. <oli...@ie...> - 2006-06-21 17:27:21
|
Pau Gargallo wrote:
> On 6/21/06, Travis Oliphant <oli...@ie...> wrote:
>
>> Johannes Loehnert wrote:
>>
>>> Hi,
>>>
>>>> I'm not sure why bool arrays cannot be used as indices.
>>>> The "natural" solution to the original problem seemed to be:
>>>> M[:,V>0]
>>>> but this is not allowed.
>>>
>>> I started a thread on this earlier this year. Try searching the archive for
>>> "boolean indexing" (if it comes back online somewhen).
>>>
>>> Travis had some reason for not implementing this, but unfortunately I do not
>>> remember what it was. The corresponding message might still linger on my home
>>> PC, which I can access this evening....
>>
>> I suspect my reason was just not being sure if it could be explained
>> consistently. But, after seeing this come up again. I decided it was
>> easy enough to implement.
>>
>> So, in SVN NumPy, you will be able to do
>>
>> a[:,V>0]
>> a[V>0,:]
>>
>> The V>0 will be replaced with integer arrays as if nonzero(V>0) had been
>> called.
>
> does it work for a[<boolean>,<boolean>] ?

Sure, it will work. Basically all boolean arrays will be interpreted as nonzero(V>0), everywhere.

> what about a[ix_( nonzero(<boolean>), nonzero(<boolean>) )] ?
>
> maybe the <boolean> to nonzero(<boolean>) conversion would be more
> coherently done by the ix_ function than by the []

I've just added support for <boolean> inside ix_ so that the nonzero will be done automatically as well.

So

a[ix_(<boolean>,<boolean>)] will give the cross-product selection.

-Travis
|
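A brief sketch of the cross-product selection with boolean arguments to `ix_`, as it behaves in current NumPy. The arrays `b1` and `b2` are illustrative masks, one per axis of `a`.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])        # selects rows 1 and 2
b2 = np.array([True, True, True, False])  # selects columns 0, 1 and 2

# ix_ accepts boolean sequences and treats them like their nonzero() indices.
block = a[np.ix_(b1, b2)]
same_block = a[np.ix_(np.nonzero(b1)[0], np.nonzero(b2)[0])]

print(block)
```

The result contains every combination of a selected row with a selected column, which is what distinguishes it from passing the two masks directly as `a[b1, b2]`.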
From: Pau G. <pau...@gm...> - 2006-06-21 17:31:50
|
On 6/21/06, Travis Oliphant <oli...@ie...> wrote:
> Pau Gargallo wrote:
> > On 6/21/06, Travis Oliphant <oli...@ie...> wrote:
> >
> >> Johannes Loehnert wrote:
> >>
> >>> Hi,
> >>>
> >>>> I'm not sure why bool arrays cannot be used as indices.
> >>>> The "natural" solution to the original problem seemed to be:
> >>>> M[:,V>0]
> >>>> but this is not allowed.
> >>>
> >>> I started a thread on this earlier this year. Try searching the archive for
> >>> "boolean indexing" (if it comes back online somewhen).
> >>>
> >>> Travis had some reason for not implementing this, but unfortunately I do not
> >>> remember what it was. The corresponding message might still linger on my home
> >>> PC, which I can access this evening....
> >>
> >> I suspect my reason was just not being sure if it could be explained
> >> consistently. But, after seeing this come up again. I decided it was
> >> easy enough to implement.
> >>
> >> So, in SVN NumPy, you will be able to do
> >>
> >> a[:,V>0]
> >> a[V>0,:]
> >>
> >> The V>0 will be replaced with integer arrays as if nonzero(V>0) had been
> >> called.
> >
> > does it work for a[<boolean>,<boolean>] ?
>
> Sure, it will work. Basically all boolean arrays will be interpreted as
> nonzero(V>0), everywhere.
>
> > what about a[ix_( nonzero(<boolean>), nonzero(<boolean>) )] ?
> >
> > maybe the <boolean> to nonzero(<boolean>) conversion would be more
> > coherently done by the ix_ function than by the []
>
> I've just added support for <boolean> inside ix_ so that the nonzero
> will be done automatically as well.
>
> So
>
> a[ix_(<boolean>,<boolean>)] will give the cross-product selection.

ok so:

a[ b1, b2 ] will be different than a[ ix_(b1,b2) ]

just like with integer indices. Makes sense to me.

also, a[b] will be as before (a[where(b)]) ?

maybe a trailing comma could launch the new behaviour?

a[b] -> a[where(b)]
a[b,] -> a[b,...] -> a[nonzero(b)]

Thanks,
pau
|
From: Pau G. <pau...@gm...> - 2006-06-22 10:26:20
|
'''
The following mail is a bit long and tedious to read, sorry about that.
Here is the abstract:

"I would like boolean indexing to work like slices and not like arrays of indices"
'''

hi,

I'm _really_ sorry to insist, but I have been thinking about it and I don't feel like replacing <boolean> with nonzero(<boolean>) is what we want. For me this is a bad trick, equivalent to replacing slices with arrays of indices using r_[<slice>]:

- it works only if you do that for a single axis.

Let me explain: if I have an array,

>>> from numpy import *
>>> a = arange(12).reshape(3,4)

I can slice it:

>>> a[1:3,0:3]
array([[ 4,  5,  6],
       [ 8,  9, 10]])

I can define boolean arrays 'equivalent' to these slices:

>>> b1 = array([False,True,True]) # equivalent to 1:3
>>> b2 = array([True,True,True,False]) # equivalent to 0:3

Now if I use one of these boolean arrays for indexing, everything works like with slices:

>>> a[b1,:] # same as a[1:3,:]
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> a[:,b2] # same as a[:,0:3]
array([[ 0,  1,  2],
       [ 4,  5,  6],
       [ 8,  9, 10]])

but if I use both at the same time:

>>> a[b1,b2] # not equivalent to a[1:3,0:3] but to a[r_[1:3],r_[0:3]]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: shape mismatch: objects cannot be broadcast to a single shape

it doesn't work, because nonzero(b1) and nonzero(b2) have different shapes.

If I want the equivalent of a[1:3,0:3], I can do

>>> a[ix_(b1,b2)]
array([[ 4,  5,  6],
       [ 8,  9, 10]])

I cannot see when the current behaviour of a[b1,b2] would be used.

From my (probably naive) point of view, <boolean> should not be converted to nonzero(<boolean>), but to some kind of slicing object. In that way boolean indexing could work like slices and not like arrays of integers, which would be more intuitive for me.

Converting slices to arrays of indices is a trick that only works for one axis:

>>> a[r_[1:3],0:3] # same as a[1:3,0:3]
array([[ 4,  5,  6],
       [ 8,  9, 10]])
>>> a[1:3,r_[0:3]] # same as a[1:3,0:3]
array([[ 4,  5,  6],
       [ 8,  9, 10]])
>>> a[r_[1:3],r_[0:3]] # NOT same as a[1:3,0:3]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: shape mismatch: objects cannot be broadcast to a single shape

Am I completely wrong?? Maybe the current behaviour (only useful for one axis) is enough??

Sorry for asking things and not giving solutions, and thanks for everything.

pau

PS: I noticed that the following code works

>>> a[a>4,:,:,:,:,1:2:3,...,4:5:6]
array([ 5, 6, 7, 8, 9, 10, 11])
|
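The difference between pairing and cross-product selection can be sketched in current NumPy as follows. Two details are assumptions rather than something shown in the thread: recent releases raise `IndexError` for the mismatched case where the session above shows `ValueError`, so the sketch catches both, and the extra mask `b3` is added only to show a case where two boolean masks select the same number of elements.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])          # rows 1, 2
b2 = np.array([True, True, True, False])    # columns 0, 1, 2
b3 = np.array([True, False, True, False])   # columns 0, 2 (illustrative)

# Two masks with different numbers of True entries cannot be paired up.
try:
    a[b1, b2]
    mismatch_raised = False
except (IndexError, ValueError):
    mismatch_raised = True

# When the counts match, the indices are paired element by element:
# (row 1, column 0) and (row 2, column 2).
paired = a[b1, b3]

# The slice-like block needs ix_ (or real slices).
block = a[np.ix_(b1, b2)]

print(mismatch_raised)
print(paired)
print(np.array_equal(block, a[1:3, 0:3]))
```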