From: CL WU <ane...@ho...> - 2003-09-17 20:02:11
|
Hi, group,

I am new to numpy. I have 2 questions about sorting arrays.

1. How do I sort an array by one of its columns or rows? I know Python's built-in sort() can do this for a list if I pass my own cmp function, but as far as I know the array function sort() sorts each column or row separately. I don't want to convert the array to a list, sort it, and then convert it back to an array.

2. How do I get the rank of a rank-1 array? The first "rank" means the order of each element after sorting, not the "dimension" meaning used in numpy. It is just like the "rank()" function in S-PLUS.

Thank you

Chunlei |
From: Tim H. <tim...@ie...> - 2003-09-17 20:18:58
|
CL WU wrote:

> Hi, group,
> I am new to numpy. I have 2 questions about sorting arrays.
>
> 1. How do I sort an array by one of its columns or rows?
> I know Python's built-in sort() can do this for a list if I pass my own
> cmp function, but as far as I know the array function sort() sorts each
> column or row separately. I don't want to convert the array to a list,
> sort it, and then convert it back to an array.

I think you want argsort plus take. For example, the following sorts the rows of a on its second column:

    from Numeric import *

    a = array([[4,5,6], [1,2,3], [7,8,9]])
    arg = argsort(a[:,1])   # row indices ordered by the second column
    take(a, arg, 0)         # pick the rows (axis 0) in that order

> 2. How do I get the rank of a rank-1 array? The first "rank" means the
> order of each element after sorting, not the "dimension" meaning used
> in numpy. It is just like the "rank()" function in S-PLUS.

If I understand you correctly, you want argsort as mentioned above.

Regards,

-tim

> Thank you
>
> Chunlei
>
> -------------------------------------------------------
> This sf.net email is sponsored by:ThinkGeek
> Welcome to geek heaven.
> http://thinkgeek.com/sf
> _______________________________________________
> Numpy-discussion mailing list
> Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
> |
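For readers on current tooling, the argsort-plus-take recipe can be sketched as follows. This assumes NumPy (the successor to Numeric, not the library used in the thread), and the variable names `order` and `by_column` are illustrative only.

```python
import numpy as np

a = np.array([[4, 5, 6],
              [1, 2, 3],
              [7, 8, 9]])

# Indices that would sort the second column (values 5, 2, 8).
order = np.argsort(a[:, 1])

# Reorder whole rows by those indices, keeping each row intact.
by_column = a[order]

# The thread's spelling, take along axis 0, gives the same rows.
same = np.take(a, order, axis=0)
```

Indexing with the index array and calling `take` are interchangeable here; the key point is that the sort order is computed from one column and then applied to entire rows.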
From: CL WU <ane...@ho...> - 2003-09-17 20:54:27
|
Thank you, Tim. argsort() and take() do provide an easy way to sort an array based on any column or row. But for the second question, argsort() doesn't return the result I want.

As shown below, sortrank and sortrank2 are the functions I am currently using to get the rank of a vector (the first is more efficient). Each returns, for every value in the original array/list, its index in the sorted array/list. I hope there is an efficient function at the array level to do the same work.

    >>> from Numeric import *
    >>> a=array([5,2,3])
    >>> argsort(a)
    array([1, 2, 0])
    >>> def sortrank(list):
    ...     n=len(list)
    ...     li_a=[(i,list[i]) for i in range(n)]
    ...     li_a.sort(lambda a,b:cmp(a[1],b[1]))
    ...     li_b=[(i,li_a[i]) for i in range(n)]
    ...     li_b.sort(lambda a,b:cmp(a[1][0],b[1][0]))
    ...     return [x[0] for x in li_b]
    ...
    >>> sortrank(a)
    [2, 0, 1]
    >>> def sortrank2(li):
    ...     li_sorted=li[:]
    ...     li_sorted.sort()
    ...     return [li_sorted.index(x) for x in li]
    ...
    >>> sortrank2(list(a))
    [2, 0, 1]

Thanks again.

Chunlei |
From: CL WU <ane...@ho...> - 2003-09-17 22:27:48
|
Great! It works much more efficiently. Thank you so much.

Best,

Chunlei

Tim Hochberg wrote:

> CL WU wrote:
>
>> Thank you, Tim. argsort() and take() do provide an easy way to sort
>> an array based on any column or row. But for the second question,
>> argsort() doesn't return the result I want.
>
> Hmmm. It seems that argsort and sortrank are inverses of a sort, so it
> should be possible to do what you want efficiently, but I'm not sure how.
>
> <think>
>
> Ah, it appears to be quite simple. I believe:
>
>     argsort(argsort(a))
>
> is equivalent to your sortrank and should be much faster.
>
> regards,
>
> -tim |
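The double-argsort trick can be checked on the vector used in the thread. The sketch below assumes NumPy rather than Numeric; `ranks_inv` is an illustrative alternative that is not from the thread, which inverts the permutation by assignment instead of sorting a second time.

```python
import numpy as np

a = np.array([5, 2, 3])

order = np.argsort(a)      # order[k] = index in a of the k-th smallest value
ranks = np.argsort(order)  # ranks[i] = position of a[i] in the sorted array

# Same result without the second sort: invert the permutation directly.
ranks_inv = np.empty_like(order)
ranks_inv[order] = np.arange(len(a))
```

Because the result is always a permutation, tied values receive distinct ranks. That differs from a list.index()-based ranking, which gives every tied value the same (lowest) rank.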
From: Tim H. <tim...@ie...> - 2003-09-18 18:33:18
|
Hi Chunlei,

I just realized one other thing that you should probably be aware of. You could write a much faster version of sortrank in pure Python by doing your sorts differently. Python's built-in sort is very fast, but as soon as you start passing in comparison functions it slows down dramatically. The trick is to arrange the data you need to sort so that you don't need an auxiliary function (known as Decorate-Sort-Undecorate or the Schwartzian transform). Thus, the following is almost certainly a lot faster than your original sortrank, although probably still slower than the argsort solution.

    def sortrank(list):
        index = range(len(list))
        li_a = zip(list, index)                   # decorate: (value, original index)
        li_a.sort()                               # sort by value, no cmp function
        li_b = [(li_a[i][1], i) for i in index]   # (original index, rank)
        li_b.sort()                               # back to original order
        return [x[1] for x in li_b]               # undecorate: keep the ranks

Regards,

-tim |
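The Decorate-Sort-Undecorate version above is written for the Python 2 of the time, where zip() returns a list. A minimal Python 3 sketch of the same idea follows. It is a variation rather than the code from the thread: the parameter name `values` is illustrative, and the second sort is replaced by writing each rank straight into its original slot.

```python
def sortrank(values):
    """Return, for each element, its position in the sorted sequence."""
    # Decorate: pair each value with its original index, then sort the
    # pairs with the default tuple comparison (no comparison function).
    decorated = sorted(zip(values, range(len(values))))

    # decorated[k] is (value, original_index) for the k-th smallest value,
    # so rank k belongs at position original_index.
    ranks = [0] * len(values)
    for rank, (_, original_index) in enumerate(decorated):
        ranks[original_index] = rank
    return ranks


print(sortrank([5, 2, 3]))
```

Ties are broken by original position, because the tuples fall back to comparing the index when the values are equal.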
From: CL WU <ane...@ho...> - 2003-09-18 21:00:52
|
Thanks again, Tim. It is a wonderful example of how efficiently Python can run when it is well written.

Best,

Chunlei |
From: Tim H. <tim...@ie...> - 2003-09-18 17:53:47
|
I'm just starting to move some of my code over to numarray and I was dismayed to find that basic operations between Numeric and numarray arrays fail.

    >>> import Numeric as np
    >>> import numarray as na
    >>> a = na.arange(5)
    >>> p = np.arange(5)
    >>> a + p
    ['vector', 'vector']
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "C:\Python23\Lib\site-packages\numarray\numarraycore.py", line 648, in __add__
        def __add__(self, operand): return ufunc.add(self, operand)
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 818, in _cache_miss2
        key = (_digest(n1), _digest(n2), _digest(out), safethread.get_ident())
    KeyError: '_digest force cache miss'

I suspect (hope!) that this is just a bug and not something inherent in numarray. I dug around in ufunc.py a bit and it appears that the bug is shallow and can be fixed simply by replacing::

    if not (_sequence(n1) or _sequence(n2)):
        key = (_digest(n1), _digest(n2), _digest(out), safethread.get_ident())
        self._cache[ key ] = cached

with::

    try:
        key = (_digest(n1), _digest(n2), _digest(out), safethread.get_ident())
    except KeyError:
        pass
    else:
        self._cache[ key ] = cached

in _cache_miss2 and _cache_miss1. If this were done, _sequence could probably be deleted as well.

I'm not very familiar with the numarray code yet, so it's quite possible I'm missing something, but I'm willing to do more digging to fix this if this turns out not to be sufficient.

Regards,

-tim |
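The shape of the proposed fix, attempting to build the cache key and skipping the cache when that fails, can be illustrated outside numarray. Everything in the sketch below is hypothetical: `digest`, `Uncacheable` and `CachedOp` are stand-ins invented for illustration and are not numarray's actual internals.

```python
class Uncacheable(KeyError):
    """Raised by digest() for operands that must not be cached (hypothetical)."""


def digest(operand):
    # Hypothetical stand-in for a signature function: only ints and floats
    # get a cache signature here; anything else forces a cache miss.
    if isinstance(operand, (int, float)):
        return type(operand).__name__
    raise Uncacheable("digest force cache miss")


class CachedOp:
    def __init__(self):
        self._cache = {}

    def cache_miss(self, n1, n2, result):
        # Try to build the key; if any operand cannot be digested,
        # skip caching instead of letting the KeyError propagate.
        try:
            key = (digest(n1), digest(n2))
        except KeyError:
            pass
        else:
            self._cache[key] = result
        return result


op = CachedOp()
op.cache_miss(1, 2.0, "int-float result")     # cached
op.cache_miss(1, [1, 2], "uncached result")   # not cached, but no error
```

The design choice is to ask the digest function itself whether an operand is cacheable, rather than pre-checking with a separate predicate that may not anticipate every foreign type.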
From: Todd M. <jm...@st...> - 2003-09-18 18:07:54
|
On Thu, 2003-09-18 at 13:53, Tim Hochberg wrote:

> I'm just starting to move some of my code over to numarray and I was
> dismayed to find that basic operations between Numeric and numarray
> arrays fail.
>
> [...]
>
> I suspect (hope!) that this is just a bug and not something inherent in
> numarray.

It's an interoperability issue. Please let us know if you find others.

> I dug around in ufunc.py a bit and it appears that the bug is
> shallow and can be fixed simply by replacing::
>
>     if not (_sequence(n1) or _sequence(n2)):
>         key = (_digest(n1), _digest(n2), _digest(out),
>                safethread.get_ident())
>         self._cache[ key ] = cached
>
> with::
>
>     try:
>         key = (_digest(n1), _digest(n2), _digest(out),
>                safethread.get_ident())
>     except KeyError:
>         pass
>     else:
>         self._cache[ key ] = cached
>
> in _cache_miss2 and _cache_miss1. If this were done, _sequence could
> probably be deleted as well.

I ran into the same problem trying to port MA to numarray, and came up with an identical workaround. A fix like this will be part of numarray-0.8.

Todd

--
Todd Miller jm...@st...
STSCI / ESS / SSB |