From: Todd M. <jm...@st...> - 2004-05-20 17:13:44
|
On Thu, 2004-05-20 at 05:26, Robert Kern wrote: > Francesc Alted wrote: > > Hi, > > > > I'm willing to use a lot the searchsorted function in numarray, but I'm a > > bit surprised about the poor performance of it. Look at that: > > > > > >>>>from time import time > >>>>import numarray > >>>>import Numeric > >>>>na=numarray.arange(1000*1000) > >>>>nu=Numeric.arange(1000*1000) > >>>>t1=time();numarray.searchsorted(na,200*1000);time()-t1 > > > > 200000 > > 0.00055098533630371094 > > > >>>>t1=time();Numeric.searchsorted(nu,200*1000);time()-t1 > > > > 200000 > > 7.7962875366210938e-05 > > > > It may seem that Numeric is better optimised, but the standard python module > > bisect is even faster than numarray.searchsorted: > > > > > >>>>import bisect > >>>>t1=time();bisect.bisect_left(na,200*1000);time()-t1 > > > > 200000 > > 8.8930130004882812e-05 > > > >>>>t1=time();bisect.bisect_left(nu,200*1000);time()-t1 > > > > 200000 > > 8.6069107055664062e-05 > > A better timing (IMHO), but with similar conclusions: > > Python 2.3 (#1, Sep 13 2003, 00:49:11) > [GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import timeit > >>> t1 = timeit.Timer("Numeric.searchsorted(a,200000)", > "import Numeric;a=Numeric.arange(1000000)") > >>> t2 = timeit.Timer("numarray.searchsorted(a,200000)", > "import numarray;a=numarray.arange(1000000)") > >>> t3 = timeit.Timer("bisect.bisect_left(a,200000)", > "import Numeric;import bisect;a=Numeric.arange(1000000)") > >>> t4 = timeit.Timer("bisect.bisect_left(a,200000)", > "import numarray;import bisect;a=numarray.arange(1000000)") > >>> t1.repeat(3,10000) > [0.15758609771728516, 0.17469501495361328, 0.15456986427307129] > >>> t2.repeat(3,10000) > [6.7581729888916016, 6.9644770622253418, 6.6776731014251709] > >>> t3.repeat(3,10000) > [0.41335701942443848, 0.45698308944702148, 0.39665889739990234] > >>> t4.repeat(3,10000) > [0.49930000305175781, 0.48063492774963379, 0.52067780494689941] > > [Apologies for the linewraps.] > > I also get similar results with double arrays. Weird. > > Python 2.3 on Mac OS X 10.3.sumthin', latest CVS checkout of numarray, > Numeric 23.1. Here's what I got with the numarray benchmarks after adding an extra case: benchmark i numarray (usec) Numeric (usec) numarray:Numeric searchsorted(a,p/5) 0.0 446 12 36.7 1.0 438 11 36.8 2.0 450 12 35.0 3.0 459 14 32.6 4.0 511 25 19.7 5.0 636 56 11.2 6.0 653 64 10.1 searchsorted(a,a) 0.0 285 5 48.0 1.0 283 7 39.6 2.0 291 29 9.8 3.0 368 308 1.2 4.0 1771 4120 0.4 5.0 17335 52127 0.3 6.0 201277 605787 0.3 p = 10**i a = arange(p) The first case agrees with your results of ~10x difference in favor of Numeric at 10**6 elements. The last case shows a ~3x numarray advantage given large numbers of values. My analysis is this: since searchsorted runs in O(log2(N)) time, even with 10**6 elements there are only 20 iterations or so. This is just not enough to overcome numarray's Python ufunc overhead. I'll see what I can do, but I think we're up against the standard numarray performance wall for small arrays. Regards, Todd |