From: T. C. <al...@an...> - 2004-05-12 19:44:59
|
Hello,

I am new to numarray. I have been searching the documentation and see no obvious vectorized, concise way to get the indices of the minimum of an array, times.min().

Currently my two alternatives are (times is an NxN array):

    husband, wife = nonzero(equal(times.min(), times))
    # gives tuple of 1x1 arrays, each containing one index

and (even uglier)

    husband = compress(times.min() == times, indices([N,N])[0])
    wife = compress(times.min() == times, indices([N,N])[1])

These are weird ways to get something so simple. I am surely missing something, but I have tried several slicing strategies without success.

For getting the minimum times in each row I use:

    choose(argmin(times), transpose(times))

What are the idioms in numpy for these tasks?

Thank you very much in advance,
á.

I would

--
Álvaro Tejero Cantero
http://alqua.org -- documentos libres
free documents
|
From: Russell E O. <ow...@as...> - 2004-05-12 19:55:32
|
At 9:51 PM +0200 5/12/04, Álvaro Tejero Cantero wrote:
>I am new to numarray. I have been searching the documentation, and see
>no evident vectorial and concise way to get the indexes for the minimum
>of an array times.min().

To find the index of ONE minimum value is easy, though it is buried in the nd_image sub-package (where many users might miss it):

    numarray.nd_image.minimum_position

I do not know a clean way to find all locations of the minimum value. I hope somebody else does.

-- Russell
|
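For readers on NumPy (the successor to numarray), both cases raised here, one minimum and all minima, can be sketched roughly as follows. This assumes NumPy rather than numarray, and the small `times` matrix is made up purely for illustration.

```python
import numpy as np

# Hypothetical symmetric "times" matrix, for illustration only.
times = np.array([[9.0, 2.0, 5.0],
                  [2.0, 9.0, 7.0],
                  [5.0, 7.0, 9.0]])

# Position of ONE minimum: argmin works on the flattened array,
# unravel_index converts the flat index back to (row, column).
husband, wife = np.unravel_index(np.argmin(times), times.shape)

# Positions of ALL minima: one pass to find the value, one to build the mask.
all_minima = np.argwhere(times == times.min())
```

Note that the second form still makes two passes over the data, which is the cost discussed later in the thread.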
From: T. C. <al...@an...> - 2004-05-12 21:22:09
|
Hello,

> To find the index of ONE minimum value is easy,
> though it is buried in the nd_image sub-package
> (where many users might miss it):
> numarray.nd_image.minimum_position

It works great... but what about efficiency? If I do times.min() and then numarray.nd_image.minimum_position(times), I am running essentially the same extremum-finding routine twice, which is prohibitive for large N. Am I right?

Which makes me think of a more general question: I know that some of the array functions are coded in C for speed, but what about the classical Python for loop, as in (r is an Nx3 array of particle positions)

    [[r[i]-r[j] for i in arange(N)] for j in arange(N)]

Is this handed off to C code?

> I do not know a clean way to find all locations
> of the minimum value. I hope somebody else does.

Yes... although for the problem at hand that motivated my query, my times matrix is symmetric. I don't really need all the minima, but does numarray have any special datatype for symmetric matrices that avoids storing the unneeded (e.g. supradiagonal) elements?

Thank you very much, I'm on my way to getting some beautiful code out of old fortranisms

á.

--
Álvaro Tejero Cantero
http://alqua.org -- documentos libres
free documents
|
From: Perry G. <pe...@st...> - 2004-05-12 22:10:36
|
Álvaro Tejero Cantero wrote:
>
> It works great... but what about efficiency? If I do times.min() and
> then numarray.nd_image.minimum_position(times) I am running twice
> essentially the same extremum-finding routine, which is prohibitive for
> large N..., am I right?
>
Well, yes. But when you ask to find all the things that equal the minimum, you pretty much must look twice (if you want to know where they all are if more than one). Once to determine the minimum, the next time to locate all of them. You didn't really say whether you needed just the first minimum or all minima (if I recall correctly).

> Which makes me think of a more general question: I know that some of the
> array functions are coded in C for speed, but what about the classical
> python-for loop, as in (r Nx3 array of particle positions)
>
> [ [r[i]-r[j] for i in arange(N)] for j in arange(N)]
>
> is this handled to C code?
>
As Tim mentions, yes, this can be done efficiently. But there is no general answer to this sort of open question. It depends on what you are doing. Sometimes there are functions or tricks to avoid loops in Python, sometimes not.

>
> Yes... although for the problem at hand that motivated my query, my
> times matrix is symmetric... I don't really need all the minima, but
> does numarray have any special datatype for symmetric matrices, that
> prevents storage of unneeded (e.g. supradiagonal) elements?.
>
Not for special cases like this. One could probably write a special subclass to do this, but for a savings of a factor of 2 in memory, it usually would not be worth the trouble (unlike sparse matrices).

Perry
|
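To make the "special subclass" idea concrete, here is a minimal pure-Python sketch of packed symmetric storage. The class name `PackedSymmetric` and its layout (lower triangle, row by row) are assumptions chosen for illustration; this is neither a numarray feature nor LAPACK's exact packed format.

```python
class PackedSymmetric:
    """Hypothetical sketch: keep only the lower triangle of an n x n symmetric matrix."""

    def __init__(self, n, fill=0.0):
        self.n = n
        # n*(n+1)/2 stored entries instead of n*n.
        self.data = [fill] * (n * (n + 1) // 2)

    def _offset(self, i, j):
        # (i, j) and (j, i) share one slot; always address it with row >= column.
        if i < j:
            i, j = j, i
        return i * (i + 1) // 2 + j

    def __getitem__(self, ij):
        return self.data[self._offset(*ij)]

    def __setitem__(self, ij, value):
        self.data[self._offset(*ij)] = value


m = PackedSymmetric(4)
m[1, 3] = 2.5          # also readable as m[3, 1]
```

The saving is roughly the factor of 2 mentioned above, at the price of an index computation on every access.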
From: Robert K. <rk...@uc...> - 2004-05-13 00:27:44
|
Perry Greenfield wrote:
> Álvaro Tejero Cantero wrote:
>
>>It works great... but what about efficiency? If I do times.min() and
>>then numarray.nd_image.minimum_position(times) I am running twice
>>essentially the same extremum-finding routine, which is prohibitive for
>>large N..., am I right?
>>
>
> Well, yes. But when you ask to find all the things that equal
> the minimum, you pretty much must look twice (if you want to know
> where they all are if more than one). Once to determine the
> minimum, the next time to locate all of them.

Nah, you can accumulate indices corresponding to the current minimum value as you go. Discard the list of indices and start again if you get a new minimum value.

    minval = data[0]  # let's suppose data is a vector for demonstration
    minindices = [0]
    for i in xrange(1, len(data)):
        x = data[i]
        if x < minval:
            minindices = [i]
            minval = x
        elif x == minval:
            minindices.append(i)

Whether or not this is faster (when implemented in C) than going over it twice using numarray functions is another question. My guess: not enough.

[snip]

>>Yes... although for the problem at hand that motivated my query, my
>>times matrix is symmetric... I don't really need all the minima, but
>>does numarray have any special datatype for symmetric matrices, that
>>prevents storage of unneeded (e.g. supradiagonal) elements?.
>>
>
> Not for special cases like this. One could probably write a special
> subclass to do this, but for a savings of a factor of 2 in memory,
> it usually would not be worth the trouble (unlike sparse matrices)

OTOH, having subclasses that follow LAPACK's symmetric packed storage scheme would be very useful not because of the space factor but the time saved by being able to use the symmetric algorithms in LAPACK. I think.

> Perry

--
Robert Kern
rk...@uc...

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
|
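The accumulate-as-you-go idea carries over to a 2-D matrix such as the times array from the original question. The following is a sketch in current Python, written for plain nested lists; the function name `min_positions` is made up for illustration.

```python
def min_positions(matrix):
    """Return (minimum value, list of (row, col) positions) in a single pass."""
    minval = None
    positions = []
    for i, row in enumerate(matrix):
        for j, x in enumerate(row):
            if minval is None or x < minval:
                # New minimum: discard the positions collected so far.
                minval = x
                positions = [(i, j)]
            elif x == minval:
                positions.append((i, j))
    return minval, positions


value, where = min_positions([[4, 1, 3],
                              [1, 5, 1]])
```

As a pure-Python loop this is unlikely to beat two vectorized passes; it only shows that one traversal is enough in principle.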
From: Perry G. <pe...@st...> - 2004-05-13 13:28:13
|
Robert Kern wrote: > > Well, yes. But when you ask to find all the things that equal > > the minimum, you pretty much must look twice (if you want to know > > where they all are if more than one). Once to determine the > > minimum, the next time to locate all of them. > > Nah, you can accumulate indices corresponding to the current minimum > value as you go. Discard the list of indices and start again if you get > a new minimum value. > True enough (though I suspect some special cases may not work any faster, e.g., an array of all zeros with the last element equal to -1; you spend all the time copying indices for nearly the whole damn thing for no purpose). There are a bunch of things that can be done to eliminate multiple passes but they tend to lead to many different specialized functions. One has to trade off the number of such functions against the speed savings. Another example is getting max and min values for an array. I've long thought that this is so often done they could be done in one pass. There isn't a function that does this yet though. > >>Yes... although for the problem at hand that motivated my query, my > >>times matrix is symmetric... I don't really need all the minima, but > >>does numarray have any special datatype for symmetric matrixes, that > >>prevents storage of unneded (e.g. supradiagonal) elements?. > >> > > > > Not for special cases like this. One could probably write a special > > subclass to do this, but for a savings of a factor of 2 in memory, > > it usually would not be worth the trouble (unlike sparse matrices) > > OTOH, having subclasses that follow LAPACK's symmetric packed storage > scheme would be very useful not because of the space factor but the time > saved by being able to use the symmetric algorithms in LAPACK. I think. > I'd agree that that would be a much stronger motivating factor. Perry |
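The single-pass max/min idea mentioned here can be sketched in plain Python as below. The name `minmax` is hypothetical; it is not an existing numarray function.

```python
def minmax(values):
    """Return (smallest, largest) after one traversal of a non-empty iterable."""
    it = iter(values)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("minmax() of an empty sequence")
    for x in it:
        if x < lo:
            lo = x
        elif x > hi:   # an element cannot be both a new minimum and a new maximum
            hi = x
    return lo, hi


lo, hi = minmax([3, -1, 7, 0])
```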
From: Russell E O. <rowen@u.washington.edu> - 2004-05-13 19:23:00
|
At 9:27 AM -0400 2004-05-13, Perry Greenfield wrote: >... One has to trade off the number of such functions >against the speed savings. Another example is getting max and min values >for an array. I've long thought that this is so often done they could >be done in one pass. There isn't a function that does this yet though. Statistics is another area where multiple return values could be of interest -- one may want the mean and std dev, and making two passes is wasteful (since some of the same info needs to be computed both times). A do-all function that computes min, min location, max, max location, mean and std dev all at once would be nice (especially if the returned values were accessed by name, rather than just being a tuple of values, so they could be referenced safely and readably). -- Russell |
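A rough sketch of such a do-all function with results accessed by name, in plain Python. The names `ArrayStats` and `describe` are invented for illustration, and the running mean/variance update is one possible single-pass choice, not something proposed in the thread.

```python
import math
from collections import namedtuple

ArrayStats = namedtuple("ArrayStats", "min min_index max max_index mean std")


def describe(values):
    """One pass over a non-empty sequence; the first occurrence wins for locations."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for i, x in enumerate(values):
        if n == 0:
            lo = hi = x
            lo_i = hi_i = i
        elif x < lo:
            lo, lo_i = x, i
        elif x > hi:
            hi, hi_i = x, i
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n == 0:
        raise ValueError("describe() of an empty sequence")
    std = math.sqrt(m2 / n)  # population standard deviation
    return ArrayStats(lo, lo_i, hi, hi_i, mean, std)


stats = describe([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Fields are then read as `stats.min`, `stats.max_index`, `stats.std` and so on, rather than by position in a tuple.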
From: Perry G. <pe...@st...> - 2004-05-13 19:42:44
|
> Russell E Owen wrote: > > At 9:27 AM -0400 2004-05-13, Perry Greenfield wrote: > >... One has to trade off the number of such functions > >against the speed savings. Another example is getting max and min values > >for an array. I've long thought that this is so often done they could > >be done in one pass. There isn't a function that does this yet though. > > Statistics is another area where multiple return values could be of > interest -- one may want the mean and std dev, and making two passes > is wasteful (since some of the same info needs to be computed both > times). > > A do-all function that computes min, min location, max, max location, > mean and std dev all at once would be nice (especially if the > returned values were accessed by name, rather than just being a tuple > of values, so they could be referenced safely and readably). > > -- Russell > We will definitely add something like this for 1.0 or 1.1. (but probably for min and max location, it will just be for the first encountered). Perry |
From: Perry G. <pe...@st...> - 2004-05-13 21:40:11
|
> > A do-all function that computes min, min location, max, max location, > > mean and std dev all at once would be nice (especially if the > > returned values were accessed by name, rather than just being a tuple > > of values, so they could be referenced safely and readably). > > > > -- Russell > > > We will definitely add something like this for 1.0 or 1.1. > (but probably for min and max location, it will just be > for the first encountered). > > Perry > To elaborate on this a bit, I'm thinking that the minmax capability probably should be separated from statistics (it's very common to want to do min max without wanting to do mean or stdev, plus as others have noted, the statistics can be a bit more involved). There is one other aspect that needs user input. How to handle ieee special values for things like minmax. Right now the ufunc reductions have some odd behavior that is a result of underlying C library behavior. A NaN isn't consistently handled in such comparisons (it appears to depend on the order when comparing a NaN to a regular float, whichever appears second appears to be accepted in pairwise comparisons). I was thinking that one could add an exclude keyword to the functions to indicate that ieee special values should not be included (well, I suppose Inf should be for min max) otherwise they would be. Any thoughts on this? Perry |
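The order dependence is easy to reproduce with Python's built-in `min`, which happens to keep the first argument rather than the second; this illustrates the general problem, not numarray's ufunc behavior. The helper `minmax_excluding_nan` below is a hypothetical sketch of what an exclude option might do.

```python
import math

nan = float("nan")

# Every ordering comparison with NaN is False, so the result of a
# pairwise min depends on which operand comes first.
nan_first = min(nan, 1.0)    # NaN is kept
nan_second = min(1.0, nan)   # 1.0 is kept


def minmax_excluding_nan(values):
    """Hypothetical helper: min and max of the non-NaN entries (infinities are kept)."""
    kept = [x for x in values if not math.isnan(x)]
    if not kept:
        raise ValueError("no non-NaN values")
    return min(kept), max(kept)


lo, hi = minmax_excluding_nan([nan, 3.0, float("inf"), -2.0])
```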
From: Warren F. <fo...@sl...> - 2004-05-13 21:09:01
|
On Thu, 13 May 2004, Russell E Owen wrote:

> Statistics is another area where multiple return values could be of
> interest -- one may want the mean and std dev, and making two passes
> is wasteful (since some of the same info needs to be computed both
> times).

Single pass std deviations don't work very well if you've got a lot of data points and the std deviation is small compared to the average. I'm not arguing against including them, but maybe the documentation for such a function should include a caveat.

Warren Focke
|
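The caveat can be seen with a small experiment. The sketch below compares the textbook single-pass formula (sum and sum of squares) with a plain two-pass computation; the function names and the data are made up for illustration. Running-update schemes such as Welford's algorithm are a known single-pass alternative that avoids this cancellation.

```python
def naive_variance(values):
    """Single pass: accumulate the sum and the sum of squares, then subtract."""
    n = 0
    s = 0.0
    ss = 0.0
    for x in values:
        n += 1
        s += x
        ss += x * x
    return (ss - s * s / n) / (n - 1)


def two_pass_variance(values):
    """Two passes: the mean first, then the squared deviations from it."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


small = [4.0, 7.0, 13.0, 16.0]       # sample variance is 30
shifted = [1e9 + x for x in small]   # same spread, huge average

ok = naive_variance(small)           # fine when the mean is small
bad = naive_variance(shifted)        # loses the answer to rounding
good = two_pass_variance(shifted)    # still 30
```

With the shifted data the squares are near 1e18, where neighboring doubles are 128 apart, so the subtraction in the naive formula cannot recover a difference of 90.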
From: Jon S. <js...@wm...> - 2004-05-14 10:03:54
|
What about an object (TimeSeries) which can be "explored" (optionally) during its init method and, if so, it creates the dictionary with those values? If it is not explored, the dictionary would be assigned to None and, if requested, the "exploratory" statistics would be computed then. This could be the basis for other computations on time series. Just my two cents. Jon Saenz. | Tfno: +34 946012445 Depto. Fisica Aplicada II | Fax: +34 946013500 Facultad de Ciencias. \\ Universidad del Pais Vasco \\ Apdo. 644 \\ 48080 - Bilbao \\ SPAIN On Thu, 13 May 2004, Russell E Owen wrote: > At 9:27 AM -0400 2004-05-13, Perry Greenfield wrote: > >... One has to trade off the number of such functions > >against the speed savings. Another example is getting max and min values > >for an array. I've long thought that this is so often done they could > >be done in one pass. There isn't a function that does this yet though. > > Statistics is another area where multiple return values could be of > interest -- one may want the mean and std dev, and making two passes > is wasteful (since some of the same info needs to be computed both > times). > > A do-all function that computes min, min location, max, max location, > mean and std dev all at once would be nice (especially if the > returned values were accessed by name, rather than just being a tuple > of values, so they could be referenced safely and readably). > > -- Russell |
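One way to read this proposal is as a lazily filled cache. The sketch below is a guess at what was meant; apart from the name TimeSeries, everything in it (the `explore` flag, the `stats` property, the dictionary keys) is invented for illustration.

```python
import math


class TimeSeries:
    """Hypothetical sketch: summary statistics are computed at most once, on demand."""

    def __init__(self, values, explore=False):
        self.values = list(values)
        self._stats = None            # stays None until the data is explored
        if explore:
            self._stats = self._explore()

    def _explore(self):
        n = len(self.values)
        mean = sum(self.values) / n
        var = sum((x - mean) ** 2 for x in self.values) / n
        return {"min": min(self.values), "max": max(self.values),
                "mean": mean, "std": math.sqrt(var)}

    @property
    def stats(self):
        # Compute on first request, then reuse the stored dictionary.
        if self._stats is None:
            self._stats = self._explore()
        return self._stats


ts = TimeSeries([1.0, 2.0, 3.0])      # nothing computed yet
mean = ts.stats["mean"]               # triggers the exploration
```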
From: Tim H. <tim...@co...> - 2004-05-12 20:13:48
|
Álvaro Tejero Cantero wrote:

>Hello,
>
>I am new to numarray. I have been searching the documentation, and see
>no evident vectorial and concise way to get the indexes for the minimum
>of an array times.min().
>
>
If I understand your question correctly, you want argmin:

>>> import numarray as na
>>> from numarray import random_array
>>> times = random_array.randint(0, 16, [4,4])
>>> times
array([[15,  3, 11, 10],
       [ 1,  1, 15,  7],
       [ 4,  3,  5,  6],
       [10, 15, 12,  3]])
>>> na.argmin(times) # Takes min along axis 0
array([1, 1, 2, 3])
>>> na.argmin(times, 1) # Take min along axis 1
array([1, 1, 1, 3])

-tim

>Currently my two alternatives are: (times is NxN array)
>
>husband, wife = nonzero(equal(times.min(),times))
>#gives tuple of 1x1 arrays, each containing one index
>
>and (even uglier)
>
>husband = compress(times.min()==times,indices([N,N])[0])
>wife = compress(times.min()==times,indices([N,N])[1])
>
>These are weird ways to get something as simple. I am surely missing
>something, but I have tried several slicing strategies before without
>success.
>
>For getting the minimum times in each row I use:
>
>choose(argmin(times),transpose(times))
>
>What are the idioms in numpy for these tasks?
>
>
>Thank you very much in advance, á.
>
>I would
>
>
>
|
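For comparison, here is roughly how the per-row part of the question looks in NumPy, using the same 4x4 matrix as the session above. This is a NumPy sketch, not numarray code; NumPy documents that ties resolve to the first occurrence, so the index reported for the tied row may differ from the numarray output shown.

```python
import numpy as np

times = np.array([[15,  3, 11, 10],
                  [ 1,  1, 15,  7],
                  [ 4,  3,  5,  6],
                  [10, 15, 12,  3]])

col_of_row_min = np.argmin(times, axis=1)   # one column index per row
row_minima = times.min(axis=1)              # the minimum value of each row

# The same values, recovered from the indices instead of a second reduction.
picked = times[np.arange(times.shape[0]), col_of_row_min]
```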
From: Tim H. <tim...@co...> - 2004-05-12 21:33:48
|
Álvaro Tejero Cantero wrote:

>Hello,
>
>>To find the index of ONE minimum value is easy,
>>though it is buried in the nd_image sub-package
>>(where many users might miss it):
>>numarray.nd_image.minimum_position
>>
>
>It works great... but what about efficiency? If I do times.min() and
>then numarray.nd_image.minimum_position(times) I am running twice
>essentially the same extremum-finding routine, which is prohibitive for
>large N..., am I right?
>
>Which makes me think of a more general question: I know that some of the
>array functions are coded in C for speed, but what about the classical
>python-for loop, as in (r Nx3 array of particle positions)
>
>[ [r[i]-r[j] for i in arange(N)] for j in arange(N)]
>
>is this handled to C code?
>
>
Try this:

>>> import numarray as na
>>> r = na.arange(5)
>>> na.subtract.outer(r, r)
array([[ 0, -1, -2, -3, -4],
       [ 1,  0, -1, -2, -3],
       [ 2,  1,  0, -1, -2],
       [ 3,  2,  1,  0, -1],
       [ 4,  3,  2,  1,  0]])

Look up the special methods on ufuncs (outer, reduce, etc.) for more details.

>>I do not know a clean way to find all locations
>>of the minimum value. I hope somebody else does.
>>
>
>Yes... although for the problem at hand that motivated my query, my
>times matrix is symmetric... I don't really need all the minima, but
>does numarray have any special datatype for symmetric matrices, that
>prevents storage of unneeded (e.g. supradiagonal) elements?.
>
>
Not that I know of.

-tim

>
>Thank you very much, I'm on my way to get some beautiful code out of old
>fortranisms
>
>á.
>
>
|
From: T. C. <al...@an...> - 2004-05-13 10:31:11
|
Hello,

(For background, I am trying to get an array of interparticle distances, in which calculation of supradiagonal elements is unneeded because of symmetry. I am not doing anything about this now, though.)

>>> import numarray.random_array as rnd
>>> N, D = 1000, 3  # number of particles and dimensions of space
>>> r = rnd.random([N,D])
# r[i] gives the D coordinates of particle i
# r[:,0] gives all the x coordinates

What I was doing:

>>> r_rel = [[r[i]-r[j] for i in arange(N)] for j in arange(N)]

Now Tim says:

> Try this:
>
> >>> import numarray as na
> >>> r = na.arange(5)
> >>> na.subtract.outer(r, r)
> array([[ 0, -1, -2, -3, -4],
>        [ 1,  0, -1, -2, -3],
>        [ 2,  1,  0, -1, -2],
>        [ 3,  2,  1,  0, -1],
>        [ 4,  3,  2,  1,  0]])
>

but this gives (the shapes below are evidently from a run with N = 10)

>>> subtract.outer(r,r).shape
(10, 3, 10, 3)

that is, it also subtracts y coordinates from x coordinates, which is not intended. AFAIK the outer solution is MUCH faster than the nested for loops, so what I do now is

>>> r_rel = transpose(array([subtract.outer(r[:,0],r[:,0]),
                             subtract.outer(r[:,1],r[:,1]),
                             subtract.outer(r[:,2],r[:,2])]))
>>> r_rel.shape  # as with the double loop
(10,10,3)

My question is then whether there is a more elegant way to do this, especially one whose form does not depend on the number of dimensions.

Maybe an "axis" (=0 in this case?) keyword for the outer function would be useful in this context?

Thanks for the helpful welcome to the list!

á.

PS. The problem with the min() function concerned a matrix of collision times between particles, which is symmetric. The elements are reals, so I only expect to find one minimum.

--
Álvaro Tejero Cantero
http://alqua.org -- documentos libres
free documents
|
From: Perry G. <pe...@st...> - 2004-05-13 21:30:23
|
Álvaro Tejero Cantero wrote: > > but this gives > >>> subtract.outer(r,r).shape > (10, 3, 10, 3) > > that is, subtracts y coordinates to x coordinates which is not intended. > AFAIK the outer solution is MUCH faster than the nested for loops, so > what I do now is > > >>> r_rel = transpose(array([subtract.outer(r[:,0],r[:,0]), > subtract.outer(r[:,1],r[:,1]), > subtract.outer(r[:,2],r[:,2])])) > >>> r_rel.shape #as with the double loop > (10,10,3) > > > My question is then if there is any more elegant way to do this, > especially giving as a result independence of the number of dimensions). > Not that I can think of at the moment. > Maybe an "axis" (=0 in this case?) keyword for the outer function would > be useful in this context? > Perhaps. Right now these ufunc methods are pretty complicated so it may not be easy to do, but I understand that there is certainly utility in being able to do that. We'll look into it (but not right away so don't hold your breath). Perry |
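In present-day NumPy, broadcasting gives a form that does not depend on the number of spatial dimensions. The sketch below assumes NumPy (not numarray) and uses made-up random positions with N = 10; the index order is chosen to match the double loop quoted above, where entry [j][i] holds r[i] - r[j].

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 10, 3
r = rng.random((N, D))          # r[i] holds the D coordinates of particle i

# r[np.newaxis, :, :] has shape (1, N, D); r[:, np.newaxis, :] has shape (N, 1, D).
# Broadcasting pairs them component by component: r_rel[j, i] = r[i] - r[j].
r_rel = r[np.newaxis, :, :] - r[:, np.newaxis, :]
```

The result has shape (N, N, D) for any D, and x coordinates are only ever subtracted from x coordinates, which is what the per-component `subtract.outer` workaround was built to achieve.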