From: Tim H. <tim...@ie...> - 2001-06-19 15:54:20
|
Hi Berthold, I tested your code on Win95 using Numeric 20.0.0 and got essentially the same pattern of times. So, upgrading your version of Numeric is not likely to help. Curious, I checked out the code for sort and found that it justs calls qsort from the C library. I suspect that the problem is related to quicksort having bad worst case behaviour. Not being a computer scientest, I can't tell you under what situations the bad behaviour is triggered, although I know it doesn't like presorted lists. Anyway, if you're using 1D arrays, one workaround would be to use list.sort. Python's sorting routines has, I understand, lots of gimmicks to avoid the problems that quicksort sometimes encounters. I tried it and this: r=(random((71400,))*7).astype(Int) l = r.tolist() l.sort() p = array(l) runs about 40 times faster than this: r=(random((71400,))*7).astype(Int) p = r.sort() If you don't need to convert to and from a array, this approach is 60 times faster. Even if you're dealing with a multidimensional array, this approach (in a loop) might be signifigantly faster assuming you're sorting along the long axis. It makes one wonder if using the python sort rather than qsort for Numeric.sort would be a profitable change. No time to investigate it right now though. Hope that's useful... -tim > We have a speed problem with Numeric.sort on large arrays with only a > few different values. Here is my example > > -- snip -- > > >cat numtst.py > import Numeric > print Numeric.__version__ > class timer: > def __init__(self): > import time > self.start = time.time() > def stop(self): > import time > print "%.3f" % (time.time() - self.start) > > from RandomArray import random > from Numeric import sort, Int > > r=random((71400,)) > t = timer() ; p=sort(r) ; t.stop() > r=(random((71400,))*70000).astype(Int) > t = timer() ; p=sort(r) ; t.stop() > r=(random((71400,))*70).astype(Int) > t = timer() ; p=sort(r) ; t.stop() > r=(random((71400,))*7).astype(Int) > t = timer() ; p=sort(r) ; t.stop() > 16:27 hoel@seeve:hoel 2>python numtst.py > 17.3.0 > 0.185 > 0.148 > 2.053 > 21.668 > > -- snip -- > > So the less different values are contained in the array the longer > takes the sorting. Is this also the case with newer versions of > Numeric (But this is Python 1.5.2)? Why is sorting of these arrays so > slow? > > Thanks > > Berthold > -- > email: hoel@GermanLloyd.org > ) tel. : +49 (40) 3 61 49 - 73 74 > ( > C[_] These opinions might be mine, but never those of my employer. > > _______________________________________________ > Numpy-discussion mailing list > Num...@li... > http://lists.sourceforge.net/lists/listinfo/numpy-discussion |