You can subscribe to this list here.
| 2000 |
Jan
(8) |
Feb
(49) |
Mar
(48) |
Apr
(28) |
May
(37) |
Jun
(28) |
Jul
(16) |
Aug
(16) |
Sep
(44) |
Oct
(61) |
Nov
(31) |
Dec
(24) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2001 |
Jan
(56) |
Feb
(54) |
Mar
(41) |
Apr
(71) |
May
(48) |
Jun
(32) |
Jul
(53) |
Aug
(91) |
Sep
(56) |
Oct
(33) |
Nov
(81) |
Dec
(54) |
| 2002 |
Jan
(72) |
Feb
(37) |
Mar
(126) |
Apr
(62) |
May
(34) |
Jun
(124) |
Jul
(36) |
Aug
(34) |
Sep
(60) |
Oct
(37) |
Nov
(23) |
Dec
(104) |
| 2003 |
Jan
(110) |
Feb
(73) |
Mar
(42) |
Apr
(8) |
May
(76) |
Jun
(14) |
Jul
(52) |
Aug
(26) |
Sep
(108) |
Oct
(82) |
Nov
(89) |
Dec
(94) |
| 2004 |
Jan
(117) |
Feb
(86) |
Mar
(75) |
Apr
(55) |
May
(75) |
Jun
(160) |
Jul
(152) |
Aug
(86) |
Sep
(75) |
Oct
(134) |
Nov
(62) |
Dec
(60) |
| 2005 |
Jan
(187) |
Feb
(318) |
Mar
(296) |
Apr
(205) |
May
(84) |
Jun
(63) |
Jul
(122) |
Aug
(59) |
Sep
(66) |
Oct
(148) |
Nov
(120) |
Dec
(70) |
| 2006 |
Jan
(460) |
Feb
(683) |
Mar
(589) |
Apr
(559) |
May
(445) |
Jun
(712) |
Jul
(815) |
Aug
(663) |
Sep
(559) |
Oct
(930) |
Nov
(373) |
Dec
|
|
From: Colin J. W. <cj...@sy...> - 2006-02-22 15:29:00
|
I've been trying to gain some understanding of dtype from the builtin documentation and would appreciate advice. I don't find anything in http://projects.scipy.org/scipy/numpy or http://wiki.python.org/moin/NumPy Chapter 2.1 of the book has a good overview, but little reference material. In the following, dt= numpy.dtype Some specific problems are flagged ** below. Colin W. [Dbg]>>> h(dt) Help on class dtype in module numpy: class dtype(__builtin__.object) | Methods defined here: | | __cmp__(...) | x.__cmp__(y) <==> cmp(x,y) | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __len__(...) | x.__len__() <==> len(x) | | __reduce__(...) | self.__reduce__() for pickling. | | __repr__(...) | x.__repr__() <==> repr(x) | | __setstate__(...) | self.__setstate__() for pickling. | | __str__(...) | x.__str__() <==> str(x) | | newbyteorder(...) | self.newbyteorder(<endian>) returns a copy of the dtype object | with altered byteorders. If <endian> is not given all byteorders | are swapped. Otherwise endian can be '>', '<', or '=' to force | a byteorder. Descriptors in all fields are also updated in the | new dtype object. | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __new__ = <built-in method __new__ of type object> | T.__new__(S, ...) -> a new object with type S, a subtype of T ** What are the parameters? In other words, | what does ... stand for? ** | | alignment = <member 'alignment' of 'numpy.dtype' objects> | | | base = <attribute 'base' of 'numpy.dtype' objects> | The base data-type or self if no subdtype | | byteorder = <member 'byteorder' of 'numpy.dtype' objects> | | | char = <member 'char' of 'numpy.dtype' objects> | | | descr = <attribute 'descr' of 'numpy.dtype' objects> | The array_protocol type descriptor. | | fields = <attribute 'fields' of 'numpy.dtype' objects> | | | hasobject = <member 'hasobject' of 'numpy.dtype' objects> | | | isbuiltin = <attribute 'isbuiltin' of 'numpy.dtype' objects> | Is this a buillt-in data-type descriptor? | | isnative = <attribute 'isnative' of 'numpy.dtype' objects> | Is the byte-order of this descriptor native? | | itemsize = <member 'itemsize' of 'numpy.dtype' objects> | | | kind = <member 'kind' of 'numpy.dtype' objects> | | | name = <attribute 'name' of 'numpy.dtype' objects> | The name of the true data-type | | num = <member 'num' of 'numpy.dtype' objects> | | | shape = <attribute 'shape' of 'numpy.dtype' objects> | The shape of the subdtype or (1,) | | str = <attribute 'str' of 'numpy.dtype' objects> | The array_protocol typestring. | | subdtype = <attribute 'subdtype' of 'numpy.dtype' objects> | A tuple of (descr, shape) or None. | | type = <member 'type' of 'numpy.dtype' objects> [Dbg]>>> dt.num.__doc__ ** no doc string ** [Dbg]>>> help(dt.num) Help on member_descriptor object: num = class member_descriptor(object) | Methods defined here: | | __delete__(...) | descr.__delete__(obj) | | __get__(...) | descr.__get__(obj[, type]) -> value | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __repr__(...) | x.__repr__() <==> repr(x) | | __set__(...) | descr.__set__(obj, value) | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __objclass__ = <member '__objclass__' of 'member_descriptor' objects> [Dbg]>>> help(dt.num) Help on member_descriptor object: num = class member_descriptor(object) | Methods defined here: | | __delete__(...) | descr.__delete__(obj) | | __get__(...) | descr.__get__(obj[, type]) -> value | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __repr__(...) | x.__repr__() <==> repr(x) | | __set__(...) | descr.__set__(obj, value) | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __objclass__ = <member '__objclass__' of 'member_descriptor' objects> [Dbg]>>> help(dt.num.__objclass__) Help on class dtype in module numpy: class dtype(__builtin__.object) | Methods defined here: | | __cmp__(...) | x.__cmp__(y) <==> cmp(x,y) | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __len__(...) | x.__len__() <==> len(x) | | __reduce__(...) | self.__reduce__() for pickling. | | __repr__(...) | x.__repr__() <==> repr(x) | | __setstate__(...) | self.__setstate__() for pickling. | | __str__(...) | x.__str__() <==> str(x) | | newbyteorder(...) | self.newbyteorder(<endian>) returns a copy of the dtype object | with altered byteorders. If <endian> is not given all byteorders | are swapped. Otherwise endian can be '>', '<', or '=' to force | a byteorder. Descriptors in all fields are also updated in the | new dtype object. | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __new__ = <built-in method __new__ of type object> | T.__new__(S, ...) -> a new object with type S, a subtype of T | | alignment = <member 'alignment' of 'numpy.dtype' objects> | | | base = <attribute 'base' of 'numpy.dtype' objects> | The base data-type or self if no subdtype | | byteorder = <member 'byteorder' of 'numpy.dtype' objects> | | | char = <member 'char' of 'numpy.dtype' objects> | | | descr = <attribute 'descr' of 'numpy.dtype' objects> | The array_protocol type descriptor. | | fields = <attribute 'fields' of 'numpy.dtype' objects> | | | hasobject = <member 'hasobject' of 'numpy.dtype' objects> | | | isbuiltin = <attribute 'isbuiltin' of 'numpy.dtype' objects> | Is this a buillt-in data-type descriptor? | | isnative = <attribute 'isnative' of 'numpy.dtype' objects> | Is the byte-order of this descriptor native? | | itemsize = <member 'itemsize' of 'numpy.dtype' objects> | | | kind = <member 'kind' of 'numpy.dtype' objects> | | | name = <attribute 'name' of 'numpy.dtype' objects> | The name of the true data-type ** How does this differ from what, in common | Python usage, is a class.__name__? ** | | num = <member 'num' of 'numpy.dtype' objects> ** What does this mean? ** | | | shape = <attribute 'shape' of 'numpy.dtype' objects> | The shape of the subdtype or (1,) | | str = <attribute 'str' of 'numpy.dtype' objects> | The array_protocol typestring. | | subdtype = <attribute 'subdtype' of 'numpy.dtype' objects> | A tuple of (descr, shape) or None. | | type = <member 'type' of 'numpy.dtype' objects> [Dbg]>>> ** There is no __module__ attribute. How does one identify the modules holding the code? ** |
|
From: Sven S. <sve...@gm...> - 2006-02-22 14:47:23
|
Christopher Barker schrieb: > Sven Schreiber wrote: >> I guess I'd rather follow the advice and just remember to treat 1d as >> a row. > > Except that it's not, universally. For instance, it won't transpose: > > > It's very helpful to remember that indexing reduces rank, and slicing > keeps the rank the same. It will serve you well to use that in the > future anyway. > Anyway, the problem is really about interaction with pylab/matplotlib (so slightly OT here, sorry); when getting data from a text file with pylab.load you can't be sure if the result is 1d or 2d. This means that: - If I have >1 variable then everything is fine (provided I use your advice of slicing instead of indexing afterwards) and the variables are in the _columns_ of the 2d-array. - But if there's just one data _column_ in the file, then pylab/numpy gives me a 1d-array that sometimes works as a _row_ (and as you noted, sometimes not), but never works as a column. Imho that's bad, because as a consequence I must use overhead code to distinguish between these cases. To me it seems more like pylab's bug instead of numpy's, so please excuse this OT twist, but since there seems to be overlap between the pylab/matplotlib and numpy folks, maybe it's not so bad. Thanks for your patience and helpful input, Sven |
|
From: Bruce S. <bso...@gm...> - 2006-02-22 14:24:18
|
Hi, Actually it makes it slightly worse - given the responses on another thread it is probably due to not pushing enough into C code. Obviously use of blas etc will be faster but it doesn't change the fact that removing the inner loop would be faster still. Bruce On 2/22/06, Nadav Horesh <na...@vi...> wrote: > You may get a significant boost by replacing the line: > w=3Dw+ eta * (y*x - y**2*w) > with > w *=3D 1.0 - eta*y*y > w +=3D eta*y*x > > I ran a test on a similar expression and got 5 fold speed increase. > The dot() function runs faster if you compile with dotblas. > > Nadav. > > > -----Original Message----- > From: num...@li... on behalf of Bruce S= outhey > Sent: Tue 21-Feb-06 17:15 > To: Brian Blais > Cc: pyt...@py...; num...@li...; s= cip...@sc... > Subject: Re: [Numpy-discussion] algorithm, optimization, or other = problem? > Hi, > In the current version, note that Y is scalar so replace the squaring > (Y**2) with Y*Y as you do in the dohebb function. On my system > without blas etc removing the squaring removes a few seconds (16.28 to > 12.4). It did not seem to help factorizing Y. > > Also, eta and tau are constants so define them only once as scalars > outside the loops and do the division outside the loop. It only saves > about 0.2 seconds but these add up. > > The inner loop probably can be vectorized because it is just vector > operations on a matrix. You are just computing over the ith dimension > of X. I think that you could be able to find the matrix version on > the net. > > Regards > Bruce > > > > On 2/21/06, Brian Blais <bb...@br...> wrote: > > Hello, > > > > I am trying to translate some Matlab/mex code to Python, for doing neur= al > > simulations. This application is definitely computing-time limited, an= d I need to > > optimize at least one inner loop of the code, or perhaps even rethink t= he algorithm. > > The procedure is very simple, after initializing any variables: > > > > 1) select a random input vector, which I will call "x". right now I ha= ve it as an > > array, and I choose columns from that array randomly. in other cases, = I may need to > > take an image, select a patch, and then make that a column vector. > > > > 2) calculate an output value, which is the dot product of the "x" and a= weight > > vector, "w", so > > > > y=3Ddot(x,w) > > > > 3) modify the weight vector based on a matrix equation, like: > > > > w=3Dw+ eta * (y*x - y**2*w) > > ^ > > | > > +---- learning rate constant > > > > 4) repeat steps 1-3 many times > > > > I've organized it like: > > > > for e in 100: # outer loop > > for i in 1000: # inner loop > > (steps 1-3) > > > > display things. > > > > so that the bulk of the computation is in the inner loop, and is amenab= le to > > converting to a faster language. This is my issue: > > > > straight python, in the example posted below for 250000 inner-loop step= s, takes 20 > > seconds for each outer-loop step. I tried Pyrex, which should work ver= y fast on such > > a problem, takes about 8.5 seconds per outer-loop step. The same code = as a C-mex > > file in matlab takes 1.5 seconds per outer-loop step. > > > > Given the huge difference between the Pyrex and the Mex, I feel that th= ere is > > something I am doing wrong, because the C-code for both should run comp= arably. > > Perhaps the approach is wrong? I'm willing to take any suggestions! I= don't mind > > coding some in C, but the Python API seemed a bit challenging to me. > > > > One note: I am using the Numeric package, not numpy, only because I wan= t to be able > > to use the Enthought version for Windows. I develop on Linux, and have= n't had a > > chance to see if I can compile numpy using the Enthought Python for Win= dows. > > > > If there is anything else anyone needs to know, I'll post it. I put th= e main script, > > and a dohebb.pyx code below. > > > > > > thanks! > > > > Brian Blais > > > > -- > > ----------------- > > > > bb...@br... > > http://web.bryant.edu/~bblais > > > > > > > > > > # Main script: > > > > from dohebb import * > > import pylab as p > > from Numeric import * > > from RandomArray import * > > import time > > > > x=3Drandom((100,1000)) # 1000 input vectors > > > > numpats=3Dx.shape[0] > > w=3Drandom((numpats,1)); > > > > th=3Drandom((1,1)) > > > > params=3D{} > > params['eta']=3D0.001; > > params['tau']=3D100.0; > > old_mx=3D0; > > for e in range(100): > > > > rnd=3Drandint(0,numpats,250000) > > t1=3Dtime.time() > > if 0: # straight python > > for i in range(len(rnd)): > > pat=3Drnd[i] > > xx=3Dreshape(x[:,pat],(1,-1)) > > y=3Dmatrixmultiply(xx,w) > > w=3Dw+params['eta']*(y*transpose(xx)-y**2*w); > > th=3Dth+(1.0/params['tau'])*(y**2-th); > > else: # pyrex > > dohebb(params,w,th,x,rnd) > > print time.time()-t1 > > > > > > p.plot(w,'o-') > > p.xlabel('weights') > > p.show() > > > > > > #=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > > > # dohebb.pyx > > > > cdef extern from "Numeric/arrayobject.h": > > > > struct PyArray_Descr: > > int type_num, elsize > > char type > > > > ctypedef class Numeric.ArrayType [object PyArrayObject]: > > cdef char *data > > cdef int nd > > cdef int *dimensions, *strides > > cdef object base > > cdef PyArray_Descr *descr > > cdef int flags > > > > > > def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd): > > > > > > cdef int num_iterations > > cdef int num_inputs > > cdef int offset > > cdef double *wp,*xp,*thp > > cdef int *rndp > > cdef double eta,tau > > > > eta=3Dparams['eta'] # learning rate > > tau=3Dparams['tau'] # used for variance estimate > > > > cdef double y > > num_iterations=3Drnd.dimensions[0] > > num_inputs=3Dw.dimensions[0] > > > > # get the pointers > > wp=3D<double *>w.data > > xp=3D<double *>X.data > > rndp=3D<int *>rnd.data > > thp=3D<double *>th.data > > > > for it from 0 <=3D it < num_iterations: > > > > offset=3Drndp[it]*num_inputs > > > > # calculate the output > > y=3D0.0 > > for i from 0 <=3D i < num_inputs: > > y=3Dy+wp[i]*xp[i+offset] > > > > # change in the weights > > for i from 0 <=3D i < num_inputs: > > wp[i]=3Dwp[i]+eta*(y*xp[i+offset] - y*y*wp[i]) > > > > # estimate the variance > > thp[0]=3Dthp[0]+(1.0/tau)*(y**2-thp[0]) > > > > > > > > > > > > > > > > ------------------------------------------------------- > > This SF.net email is sponsored by: Splunk Inc. Do you grep through log = files > > for problems? Stop! Download the new AJAX search engine that makes > > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > > http://sel.as-us.falkag.net/sel?cmd=3Dlnk&kid=3D103432&bid=3D230486&dat= =3D121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Num...@li... > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log fi= les > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=3Dk&kid=103432&bid#0486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Num...@li... > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > |
|
From: <mf...@ae...> - 2006-02-22 14:06:24
|
I built Python successfully on our AIX 5.2 server using "./configure --without-cxx --disable-ipv6". (This uses the native IBM C compiler, invoking it as "cc_r". We have no C++ compiler.) But I have been unable to install Numpy-0.9.5 using the same compiler. After "python setup.py install," the relevant section of the output was: compile options: '-Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/pydirectory/include/python2.4 -c' cc_r: build/src/numpy/core/src/umathmodule.c "build/src/numpy/core/src/umathmodule.c", line 2566.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2584.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2602.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2620.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2638.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2654.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2674.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2694.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2714.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) Function argument assignment between types "long double*" and "double*" is not allowed. "build/src/numpy/core/src/umathmodule.c", line 2566.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2584.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2602.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2620.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2638.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2654.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2674.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2694.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2714.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) Function argument assignment between types "long double*" and "double*" is not allowed. error: Command "cc_r -DNDEBUG -O -Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/app/sandbox/s625662/installed/include/python2.4 -c build/src/numpy/core/src/umathmodule.c -o build/temp.aix-5.2-2.4 /build/src/numpy/core/src/umathmodule.o" failed with exit status 1 A closely related question is, how can I modify the Numpy setup.py and/or distutils files to enable me to control the options with which cc_r is invoked? I inspected these files, but not being very expert in Python, I could not figure this out. Mark F. Morss Principal Analyst, Market Risk American Electric Power |
|
From: Mads I. <mp...@os...> - 2006-02-22 11:55:31
|
On Tue, 21 Feb 2006, Tim Hochberg wrote: > Mads Ipsen wrote: > > >On Tue, 21 Feb 2006, Tim Hochberg wrote: > > > > > > > >>This all makes perfect sense, but what happended to box? In your > >>original code there was a step where you did some mumbo jumbo and box > >>and rint. Namely: > >> > >> > > > >It's a minor detail, but the reason for this is the following > > > >Suppose you have a line with length of box = 10 with periodic boundary > >conditions (basically this is a circle). Now consider two points x0 = 1 > >and x1 = 9 on this line. The shortest distance dx between the points x0 > >and x1 is dx = -2 and not 8. The calculation > > > > dx = x1 - x0 ( = +8) > > dx -= box*rint(dx/box) ( = -2) > > > >will give you the desired result, namely dx = -2. Hope this makes better > >sense. Note that fmod() won't work since > > > > fmod(dx,box) = 8 > > > > > I think you could use some variation like "fmod(dx+box/2, box) - box/2" > but rint seems better. > > >Part of my original post was concerned with the fact, that I initially was > >using around() from numpy for this step. This was terribly slow, so I made > >some custom changes and added rint() from the C-math library to the numpy > >module, giving a speedup factor about 4 for this particular line in the > >code. > > > >Best regards // Mads > > > > > > > OK, that all makes sense. You might want to try the following, which > factors out all the divisions and half the multiplies by box and > produces several fewer temporaries. Note I replaced x**2 with x*x, > which for the moment is much faster (I don't know if you've been > following the endless yacking about optimizing x**n, but x**2 will get > fast eventually). Depending on what you're doing with r2, you may be > able to avoid the last multiple by box as well. > > > # Loop over all particles > xbox = x/box > ybox = y/box > for i in range(n-1): > dx = xbox[i+1:] - xbox[i] > dy = ybox[i+1:] - ybox[i] > dx -= rint(dx) > dy -= rint(dy) > r2 = (dx*dx + dy*dy) > r2 *= box > > > > Regards, > > -tim > Thanks Tim, I am only a factor 2.5 slower than the C loop now, thanks to your suggestions. // Mads |
|
From: Ed S. <sch...@ft...> - 2006-02-22 10:47:55
|
Sasha wrote: >I propose to deprecate around and implement a new "round" member >function in C that will only accept scalar "decimals" and will behave >like a properly vectorized builtin round. I will do the coding if >there is interest. > >In any case, something has to be done here. I don't think the >following timings are acceptable: > > This sounds great to me :) -- Ed |
|
From: Mads I. <mp...@os...> - 2006-02-22 10:26:34
|
On Tue, 21 Feb 2006, Sasha wrote: > > python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)" > 100000 loops, best of 3: 18.8 usec per loop > > python -m timeit -s just want to point out that the function foo(x) = (x+0.5).astype(int).astype(float) is different from around. For x = array([1.2, 1.8]) it works but for x = array([-1.2, -1.8]) you get around(x) = array([-1., -2.]) whereas foo(x) gives foo(x) = array([0., -1.]) Using foo(x) = where(greater(x,0),x+0.5,x-0.5).astype(int).astype(float) will work. // Mads |
|
From: <hot...@ob...> - 2006-02-22 10:19:19
|
厳選したセレブ女性との付き合いはどうですか? 濡れに濡れまくる女性を何時でも何処でも完全無料での 紹介致します。 ご近所検索・写メール検索など、メンバーだけの豊富なサービスが たっぷり!納得いくまでじっくりとお相手探し。 スチュワーデス・看護婦・モデル・OL・主婦など大量在籍し ております。じっくりと貴方の理想のお相手をゲットして熱い夜を 過してください。 http://www.covcov.net?num=112 ただ今、直メ配信サービス実施中! 無料登録だけで女性より直接お返事が 貴方のメールBOXに届きます♪ 拒否:re...@ww... |
|
From: Zachary P. <zp...@st...> - 2006-02-22 08:49:32
|
Hello folks, Does numpy have an built-in mechanism to shift elements along some axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2, to make [2,3,0,1]) If not, what would be the fastest way to implement this in python? Using take? Using slicing and concatenation? Zach |
|
From: Nadav H. <na...@vi...> - 2006-02-22 06:58:05
|
You may get a significant boost by replacing the line: w=3Dw+ eta * (y*x - y**2*w) with w *=3D 1.0 - eta*y*y w +=3D eta*y*x I ran a test on a similar expression and got 5 fold speed increase. The dot() function runs faster if you compile with dotblas. Nadav. -----Original Message----- From: num...@li... on behalf of Bruce = Southey Sent: Tue 21-Feb-06 17:15 To: Brian Blais Cc: pyt...@py...; num...@li...; = sci...@sc... Subject: Re: [Numpy-discussion] algorithm, optimization, or other = problem? Hi, In the current version, note that Y is scalar so replace the squaring (Y**2) with Y*Y as you do in the dohebb function. On my system without blas etc removing the squaring removes a few seconds (16.28 to 12.4). It did not seem to help factorizing Y. Also, eta and tau are constants so define them only once as scalars outside the loops and do the division outside the loop. It only saves about 0.2 seconds but these add up. The inner loop probably can be vectorized because it is just vector operations on a matrix. You are just computing over the ith dimension of X. I think that you could be able to find the matrix version on the net. Regards Bruce On 2/21/06, Brian Blais <bb...@br...> wrote: > Hello, > > I am trying to translate some Matlab/mex code to Python, for doing = neural > simulations. This application is definitely computing-time limited, = and I need to > optimize at least one inner loop of the code, or perhaps even rethink = the algorithm. > The procedure is very simple, after initializing any variables: > > 1) select a random input vector, which I will call "x". right now I = have it as an > array, and I choose columns from that array randomly. in other cases, = I may need to > take an image, select a patch, and then make that a column vector. > > 2) calculate an output value, which is the dot product of the "x" and = a weight > vector, "w", so > > y=3Ddot(x,w) > > 3) modify the weight vector based on a matrix equation, like: > > w=3Dw+ eta * (y*x - y**2*w) > ^ > | > +---- learning rate constant > > 4) repeat steps 1-3 many times > > I've organized it like: > > for e in 100: # outer loop > for i in 1000: # inner loop > (steps 1-3) > > display things. > > so that the bulk of the computation is in the inner loop, and is = amenable to > converting to a faster language. This is my issue: > > straight python, in the example posted below for 250000 inner-loop = steps, takes 20 > seconds for each outer-loop step. I tried Pyrex, which should work = very fast on such > a problem, takes about 8.5 seconds per outer-loop step. The same code = as a C-mex > file in matlab takes 1.5 seconds per outer-loop step. > > Given the huge difference between the Pyrex and the Mex, I feel that = there is > something I am doing wrong, because the C-code for both should run = comparably. > Perhaps the approach is wrong? I'm willing to take any suggestions! = I don't mind > coding some in C, but the Python API seemed a bit challenging to me. > > One note: I am using the Numeric package, not numpy, only because I = want to be able > to use the Enthought version for Windows. I develop on Linux, and = haven't had a > chance to see if I can compile numpy using the Enthought Python for = Windows. > > If there is anything else anyone needs to know, I'll post it. I put = the main script, > and a dohebb.pyx code below. > > > thanks! > > Brian Blais > > -- > ----------------- > > bb...@br... > http://web.bryant.edu/~bblais > > > > > # Main script: > > from dohebb import * > import pylab as p > from Numeric import * > from RandomArray import * > import time > > x=3Drandom((100,1000)) # 1000 input vectors > > numpats=3Dx.shape[0] > w=3Drandom((numpats,1)); > > th=3Drandom((1,1)) > > params=3D{} > params['eta']=3D0.001; > params['tau']=3D100.0; > old_mx=3D0; > for e in range(100): > > rnd=3Drandint(0,numpats,250000) > t1=3Dtime.time() > if 0: # straight python > for i in range(len(rnd)): > pat=3Drnd[i] > xx=3Dreshape(x[:,pat],(1,-1)) > y=3Dmatrixmultiply(xx,w) > w=3Dw+params['eta']*(y*transpose(xx)-y**2*w); > th=3Dth+(1.0/params['tau'])*(y**2-th); > else: # pyrex > dohebb(params,w,th,x,rnd) > print time.time()-t1 > > > p.plot(w,'o-') > p.xlabel('weights') > p.show() > > > = #=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > # dohebb.pyx > > cdef extern from "Numeric/arrayobject.h": > > struct PyArray_Descr: > int type_num, elsize > char type > > ctypedef class Numeric.ArrayType [object PyArrayObject]: > cdef char *data > cdef int nd > cdef int *dimensions, *strides > cdef object base > cdef PyArray_Descr *descr > cdef int flags > > > def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd): > > > cdef int num_iterations > cdef int num_inputs > cdef int offset > cdef double *wp,*xp,*thp > cdef int *rndp > cdef double eta,tau > > eta=3Dparams['eta'] # learning rate > tau=3Dparams['tau'] # used for variance estimate > > cdef double y > num_iterations=3Drnd.dimensions[0] > num_inputs=3Dw.dimensions[0] > > # get the pointers > wp=3D<double *>w.data > xp=3D<double *>X.data > rndp=3D<int *>rnd.data > thp=3D<double *>th.data > > for it from 0 <=3D it < num_iterations: > > offset=3Drndp[it]*num_inputs > > # calculate the output > y=3D0.0 > for i from 0 <=3D i < num_inputs: > y=3Dy+wp[i]*xp[i+offset] > > # change in the weights > for i from 0 <=3D i < num_inputs: > wp[i]=3Dwp[i]+eta*(y*xp[i+offset] - y*y*wp[i]) > > # estimate the variance > thp[0]=3Dthp[0]+(1.0/tau)*(y**2-thp[0]) > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log = files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD = SPLUNK! > = http://sel.as-us.falkag.net/sel?cmd=3Dlnk&kid=3D103432&bid=3D230486&dat=3D= 121642 > _______________________________________________ > Numpy-discussion mailing list > Num...@li... > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > ------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Do you grep through log = files for problems? Stop! Download the new AJAX search engine that makes searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! http://sel.as-us.falkag.net/sel?cmd=3Dk&kid=103432&bid#0486&dat=121642 _______________________________________________ Numpy-discussion mailing list Num...@li... https://lists.sourceforge.net/lists/listinfo/numpy-discussion |
|
From: <sk...@po...> - 2006-02-22 03:58:08
|
Robert> Google suggests that it does matter. E.g.
Robert> http://mail.python.org/pipermail/python-dev/2001-March/013510.html
Robert> http://bugs.mysql.com/bug.php?id=14202
Robert> http://mail.python.org/pipermail/image-sig/2002-June/001884.html
*sigh* Thanks. You'd think that Solaris was a common enough platform that
the ATLAS folks would get this right...
Skip
|
|
From: Robert K. <rob...@gm...> - 2006-02-22 03:18:07
|
sk...@po... wrote: > Robert> Hmm. Was ATLAS compiled -fPIC? > > I'm not certain, but I doubt it should matter since only .a files were > generated. There's nothing to relocate: > > $ ls -ltr > total 9190 > lrwxrwxrwx 1 skipm develop 41 Feb 9 14:51 Make.inc -> /home/ink/skipm/src/ATLAS/Make.SunOS_Babe > -rw-r--r-- 1 skipm develop 1529 Feb 9 14:51 Makefile > -rw-r--r-- 1 skipm develop 236004 Feb 9 14:57 libtstatlas.a > -rw-r--r-- 1 skipm develop 241352 Feb 9 16:28 libcblas.a > -rw-r--r-- 1 skipm develop 280464 Feb 9 16:33 libf77blas.a > -rw-r--r-- 1 skipm develop 278616 Feb 9 16:34 liblapack.a > -rw-r--r-- 1 skipm develop 3603644 Feb 9 16:36 libatlas.a Google suggests that it does matter. E.g. http://mail.python.org/pipermail/python-dev/2001-March/013510.html http://bugs.mysql.com/bug.php?id=14202 http://mail.python.org/pipermail/image-sig/2002-June/001884.html -- Robert Kern rob...@gm... "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter |
|
From: Zachary P. <zp...@st...> - 2006-02-22 03:14:32
|
numpy.linglg.singular_value_decomposition is defined as follows:
def singular_value_decomposition(A, full_matrices=0):
return svd(A, 0)
Shouldn't that last line be
return svd(A, full_matrices)
Zach
|
|
From: <sk...@po...> - 2006-02-22 02:18:05
|
Robert> Hmm. Was ATLAS compiled -fPIC?
I'm not certain, but I doubt it should matter since only .a files were
generated. There's nothing to relocate:
$ ls -ltr
total 9190
lrwxrwxrwx 1 skipm develop 41 Feb 9 14:51 Make.inc -> /home/ink/skipm/src/ATLAS/Make.SunOS_Babe
-rw-r--r-- 1 skipm develop 1529 Feb 9 14:51 Makefile
-rw-r--r-- 1 skipm develop 236004 Feb 9 14:57 libtstatlas.a
-rw-r--r-- 1 skipm develop 241352 Feb 9 16:28 libcblas.a
-rw-r--r-- 1 skipm develop 280464 Feb 9 16:33 libf77blas.a
-rw-r--r-- 1 skipm develop 278616 Feb 9 16:34 liblapack.a
-rw-r--r-- 1 skipm develop 3603644 Feb 9 16:36 libatlas.a
Robert> You get messages like this when something previous goes
Robert> wrong.
Thanks. Now I know to focus on only the first problem...
Skip
|
|
From: Sasha <nd...@ma...> - 2006-02-22 01:05:58
|
[I am reposting this under a different subject because my original
post got buried in a long thread that went on to discussing unrelated
topics. Sorry if you had to read this post twice.]
It turns out that around (or round_) is implemented in python:
def round_(a, decimals=3D0):
"""Round 'a' to the given number of decimal places. Rounding
behaviour is equivalent to Python.
Return 'a' if the array is not floating point. Round both the real
and imaginary parts separately if the array is complex.
"""
a =3D asarray(a)
if not issubclass(a.dtype.type, _nx.inexact):
return a
if issubclass(a.dtype.type, _nx.complexfloating):
return round_(a.real, decimals) + 1j*round_(a.imag, decimals)
if decimals is not 0:
decimals =3D asarray(decimals)
s =3D sign(a)
if decimals is not 0:
a =3D absolute(multiply(a, 10.**decimals))
else:
a =3D absolute(a)
rem =3D a-asarray(a).astype(_nx.intp)
a =3D _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a))
# convert back
if decimals is not 0:
return multiply(a, s/(10.**decimals))
else:
return multiply(a, s)
I see many ways to improve the performance here. First, there is no
need to check for "decimals is not 0" three times. This can be done
once, maybe at the expense of some code duplication. Second,
_nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) seems to be
equivalent to _nx.floor(a+0.5). Finally, if rint is implemented as a
ufunc as Mads originally suggested, "decimals is 0" branch can just
call that.
It is tempting to rewrite the whole thing in C, but before I do that
I have a few questions about current implementation.
1. It is implemented in oldnumeric.py . Does this mean it is
deprecated. If so, what is the recommended replacement?
2. Was it intended to support array and fractional values for
decimals or is it an implementation artifact. Currently:
>>> around(array([1.2345]*5),[1,2,3,4,5])
array([ 1.2 , 1.23 , 1.235 , 1.2345, 1.2345])
>>> around(1.2345,2.5)
array(1.2332882874656679)
3. It does nothing to exact types, even if decimals<0
>>> around(1234, -2)
array(1234)
Is this a bug? Consider that
>>> round(1234, -2)
1200.0
and
>>> around(1234., -2)
array(1200.0)
Docstring is self-contradictory: "Rounding behaviour is equivalent to
Python" is not consistent with "Return 'a' if the array is not
floating point."
I propose to deprecate around and implement a new "round" member
function in C that will only accept scalar "decimals" and will behave
like a properly vectorized builtin round. I will do the coding if
there is interest.
In any case, something has to be done here. I don't think the
following timings are acceptable:
> python -m timeit -s "from numpy import array; x =3D array([1.5]*1000)" "(=
x+0.5).astype(int).astype(float)"
100000 loops, best of 3: 18.8 usec per loop
> python -m timeit -s "from numpy import array, around; x =3D array([1.5]*1=
000)" "around(x)"
10000 loops, best of 3: 155 usec per loop
|
|
From: Christopher B. <Chr...@no...> - 2006-02-22 00:55:01
|
Sven Schreiber wrote:
>>>>> a = N.ones((5,10))
>>>>> a[:,1].shape # an index: it reduces the rank
>> (5,)
>>>>> a[:,1:2].shape # a slice: it keeps the rank
>> (5, 1)
>>
> That's very interesting, thanks. But I find it a little
> unintuitive/surprising, so I'm not sure if I will use it. I fear that I
> wouldn't understand my own code after a while of not working on it.
Well, what's surprising to different people is different. However....
> I guess I'd rather follow the advice and just remember to treat 1d as a row.
Except that it's not, universally. For instance, it won't transpose:
>>> a = N.ones((5,))
>>> a.transpose()
array([1, 1, 1, 1, 1])
>>> a.shape = (1,-1)
>>> a
array([[1, 1, 1, 1, 1]])
>>> a.transpose()
array([[1],
[1],
[1],
[1],
[1]])
so while a rank-1 array is often treated like a row vector, it really
isn't the same. The concept of a row vs a column vector is a rank-2
array concept -- so keep your arrays rank-2.
It's very helpful to remember that indexing reduces rank, and slicing
keeps the rank the same. It will serve you well to use that in the
future anyway.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chr...@no...
|
|
From: Christopher B. <Chr...@no...> - 2006-02-22 00:46:07
|
> r2 = (dx*dx + dy*dy)
Might numpy.hypot() help here?
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chr...@no...
|
|
From: Sven S. <sve...@gm...> - 2006-02-21 23:56:46
|
Christopher Barker schrieb: > > and you can easily get a column vector out of an array, if you remember > that you want to keep it 2-d. i.e. use a slice rather than an index: > >>>> import numpy as N >>>> a = N.ones((5,10)) >>>> a[:,1].shape # an index: it reduces the rank > (5,) >>>> a[:,1:2].shape # a slice: it keeps the rank > (5, 1) > That's very interesting, thanks. But I find it a little unintuitive/surprising, so I'm not sure if I will use it. I fear that I wouldn't understand my own code after a while of not working on it. I guess I'd rather follow the advice and just remember to treat 1d as a row. But thanks alot, sven |
|
From: Sven S. <sve...@gm...> - 2006-02-21 23:48:41
|
Travis Oliphant schrieb: > Sven Schreiber wrote: > >> Next, I try to workaround by b.squeeze(). That seems to work, but why is >> b.squeeze().shape == (1, 112) instead of (112,)? >> >> > Again the same reason as before. A matrix is returned from b.squeeze() > and there are no 1-d matrices. Thus, you get a row-vector. Use .T if > you want a column vector. Well if squeeze can't really squeeze matrix-vectors (doing a de-facto transpose instead), wouldn't it make more sense to disable the squeeze method for matrices altogether? > >> Then I thought maybe b.flattened() does the job, but then I get an error >> (matrix has no attr flattened). Again, I'm baffled. >> >> > The correct spelling is b.flatten() Ok, but I copied .flattened() from p. 48 of your book, must be a typo then. > You can mix arrays and matrices just fine if you remember that 1d arrays > are equivalent to row-vectors. > -Travis > Ok, thanks. Btw, did the recent numpy release change anything in terms of preserving matrix types when passing to decompositions etc? I checked the release notes but maybe they're just not verbose enough. -Sven |
|
From: <co...@ph...> - 2006-02-21 23:42:45
|
Tim Hochberg <tim...@co...> writes:
> David M. Cooke wrote:
>
>>On Sat, Feb 18, 2006 at 06:17:47PM -0700, Tim Hochberg wrote:
>>
>>
>>>OK, I now have a faily clean implementation in C of:
>>>
>>>def __pow__(self, p):
>>> if p is not a scalar:
>>> return power(self, p)
>>> elif p == 1:
>>> return p
>>> elif p == 2:
>>> return square(self)
>>># elif p == 3:
>>># return cube(self)
>>># elif p == 4:
>>># return power_4(self)
>>># elif p == 0:
>>># return ones(self.shape, dtype=self.dtype)
>>># elif p == -1:
>>># return 1.0/self
>>> elif p == 0.5:
>>> return sqrt(self)
I've gone through your code you checked in, and fixed it up. Looks
good. One side effect is that
def zl(x):
a = ones_like(x)
a[:] = 0
return a
is now faster than zeros_like(x) :-)
One problem I had is that in PyArray_SetNumericOps, the "copy" method
wasn't picked up on. It may be due to the order of initialization of
the ndarray type, or something (since "copy" isn't a ufunc, it's
initialized in a different place). I couldn't figure out how to fiddle
that, so I replaced the x.copy() call with a call to PyArray_Copy().
>>Yes; because it's the implementation of __pow__, the second argument can
>>be anything.
>>
>>
> No, you misunderstand.. What I was talking about was that the *first*
> argument can also be something that's not a PyArrayObject, despite the
> functions signature.
Ah, I suppose that's because the power slot in the number protocol
also handles __rpow__.
>>> On the other hand, real powers are fast enough that doing anything
>>> at the single element level is unlikely to help. So in that case
>>> we're left with either optimizing the cases where the dimension is
>>> zero as David has done, or optimizing at the __pow__ (AKA
>>> array_power) level as I've done now based on David's original
>>> suggestion. This second approach is faster because it avoids the
>>> mysterious python scalar -> zero-D array conversion overhead.
>>> However, it suffers if we want to optimize lots of different powers
>>> since one needs a ufunc for each one. So the question becomes,
>>> which powers should we optimize?
>>
>>Hmm, ufuncs are passed a void* argument for passing info to them. Now,
>>what that argument is defined when the ufunc is created, but maybe
>>there's a way to piggy-back on it.
>>
>>
> Yeah, I really felt like I was fighting the ufuncs when I was playing
> with this. On the one hand, you really want to use the ufunc
> machinery. On the other hand that forces you into using the same types
> for both arguments. That really wouldn't be a problem, since we could
> just define an integer_power that took doubles, but did integer
> powers, except for the conversion overhead of Python_Integers into
> arrays. It looks like you started down this road and I played with
> this as well. I can think a of at least one (horrible) way around
> the matrix overhead, but the real fix would be to dig into
> PyArray_EnsureArray and see why it's slow for Python_Ints. It is much
> faster for numarray scalars.
Right; that needs to be looked at.
> Another approach is to actually compute (x*x)*(x*x) for pow(x,4) at
> the level of array_power. I think I could make this work. It would
> probably work well for medium size arrays, but might well make things
> worse for large arrays that are limited by memory bandwidth since it
> would need to move the array from memory into the cache multiple times.
I don't like that; I think it would be better memory-wise to do it
elementwise. Not just speed, but size of intermediate arrays.
>>> My latest thinking on this is that we should optimize only those
>>> cases where the optimized result is no less accurate than that
>>> produced by pow. I'm going to assume that all C operations are
>>> equivalently accurate, so pow(x,2) has roughly the same amount of
>>> error as x*x. (Something on the order of 0.5 ULP I'd guess). In
>>> that case:
>>> pow(x, -1) -> 1 / x
>>> pow(x, 0) -> 1
>>> pow(x, 0.5) -> sqrt(x)
>>> pow(x, 1) -> x
>>> pow(x, 2) -> x*x
>>> can all be implemented in terms of multiply or divide with the same
>>> accuracy as the original power methods. Once we get beyond these,
>>> the error will go up progressively.
>>>
>>> The minimal set described above seems like it should be relatively
>>> uncontroversial and it's what I favor. Once we get beyond this
>>> basic set, we would need to reach some sort of consensus on how
>>> much additional error we are willing to tolerate for optimizing
>>> these extra cases. You'll notice that I've changed my mind, yet
>>> again, over whether to optimize A**0.5. Since the set of additional
>>> ufuncs needed in this case is relatively small, just square and
>>> inverse (==1/x), this minimal set works well if optimizing in pow
>>> as I've done.
>>>
>>>
>
> Just to add a little more confusion to the mix. I did a little testing
> to see how close pow(x,n) and x*x*... actually are. They are slightly
> less close for small values of N and slightly closer for large values
> of N than I would have expected. The upshot of this is that integer
> powers between -2 and +4 all seem to vary by the same amount when
> computed using pow(x,n) versus multiplies. I'm including the test code
> at the end. Assuming that this result is not a fluke that expands the
> noncontroversial set by at least 3 more values. That's starting to
> strain the ufunc aproach, so perhaps optimizing in @TYP@_power is the
> way to go after all. Or, more likely, adding @TYP@_int_power or maybe
> @TYP@_fast_power (so as to be able to include some half integer
> powers) and dispatching appropriately from array_power.
'int_power' we could do; that would the next step I think. The half
integer powers we could maybe leave; if you want x**(-3/2), for
instance, you could do y = x**(-1)*sqrt(x) (or do y = x**(-1);
sqrt(y,y) if you're worried about temporaries).
Or, 'fast_power' could be documented as doing the optimizations for
integer and half-integer _scalar_ exponents, up to a certain size,
like 100), and falling back on pow() if necessary. I think we could do
a precomputation step to split the exponent into appropiate squarings
and such that'll make the elementwise loop faster. Half-integer
exponents are exactly representable as doubles (up to some number of
course), so there's no chance of decimal-to-binary conversions making
things look different. That might work out ok. Although, at that point
I'd suggest we make it 'power', and have 'rawpower' (or ????) as the
version that just uses pow().
Another point is to look at __div__, and use reciprocal if the
dividend is 1.
> The problem here, of course, is the overhead that PyArray_EnsureArray
> runs into. I'm not sure if the ufuncs actually call that, but I was
> using that to convert things to arrays at one point and I saw the
> slowdown, so I suspect that the slowdown is in something
> PyArray_EnsureArray calls if not in that routine itself. I'm afraid to
> dig into that stuff though.. On the other hand, it would probably
> speed up all kinds of stuff if that was sped up.
I've added a page to the developer's wiki at
http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas
to keep a list of areas like that to look into if someone has time :-)
--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...
|
|
From: Gary R. <gr...@bi...> - 2006-02-21 23:33:47
|
Something like this would be great to see in scipy. Pity about the licence. Gary R. Alan G Isaac wrote: > On Tue, 21 Feb 2006, (CET) Mads Ipsen apparently wrote: >> My system consists of N particles, whose coordinates in >> the xy-plane is given by the two vectors x and y. I need >> to calculate the distance between all particle pairs > > Of possible interest? > http://www.cs.umd.edu/~mount/ANN/ > > Cheers, > Alan Isaac |
|
From: Gary R. <gr...@bi...> - 2006-02-21 23:17:51
|
Thanks Stéfan, I find this much better now. However, I'd like to hear suggestions from others if they can think of ways of further improving the style since I see this as a template for future tutorials. I'll just note that ipython on my windows systems doesn't do the syntax colouring the same, so if I was to make a similarly styled tutorial, there would be some variation in colouring. I also think others would be likely to use the default >>> Python prompt. I don't think this minor variation in styles would detract from getting the information across, so I wouldn't advocate trying to lock authors into any particular style. Good work, Gary Stefan van der Walt wrote: > Hi Gary > > Thanks for your suggestions. I incorporated them. > > Stéfan |
|
From: Tim H. <tim...@co...> - 2006-02-21 22:23:41
|
Mads Ipsen wrote:
>On Tue, 21 Feb 2006, Tim Hochberg wrote:
>
>
>
>>This all makes perfect sense, but what happended to box? In your
>>original code there was a step where you did some mumbo jumbo and box
>>and rint. Namely:
>>
>>
>
>It's a minor detail, but the reason for this is the following
>
>Suppose you have a line with length of box = 10 with periodic boundary
>conditions (basically this is a circle). Now consider two points x0 = 1
>and x1 = 9 on this line. The shortest distance dx between the points x0
>and x1 is dx = -2 and not 8. The calculation
>
> dx = x1 - x0 ( = +8)
> dx -= box*rint(dx/box) ( = -2)
>
>will give you the desired result, namely dx = -2. Hope this makes better
>sense. Note that fmod() won't work since
>
> fmod(dx,box) = 8
>
>
I think you could use some variation like "fmod(dx+box/2, box) - box/2"
but rint seems better.
>Part of my original post was concerned with the fact, that I initially was
>using around() from numpy for this step. This was terribly slow, so I made
>some custom changes and added rint() from the C-math library to the numpy
>module, giving a speedup factor about 4 for this particular line in the
>code.
>
>Best regards // Mads
>
>
>
OK, that all makes sense. You might want to try the following, which
factors out all the divisions and half the multiplies by box and
produces several fewer temporaries. Note I replaced x**2 with x*x,
which for the moment is much faster (I don't know if you've been
following the endless yacking about optimizing x**n, but x**2 will get
fast eventually). Depending on what you're doing with r2, you may be
able to avoid the last multiple by box as well.
# Loop over all particles
xbox = x/box
ybox = y/box
for i in range(n-1):
dx = xbox[i+1:] - xbox[i]
dy = ybox[i+1:] - ybox[i]
dx -= rint(dx)
dy -= rint(dy)
r2 = (dx*dx + dy*dy)
r2 *= box
Regards,
-tim
|
|
From: Alexander B. <ale...@gm...> - 2006-02-21 22:23:12
|
It turns out that around (or round_) is implemented in python:
def round_(a, decimals=3D0):
"""Round 'a' to the given number of decimal places. Rounding
behaviour is equivalent to Python.
Return 'a' if the array is not floating point. Round both the real
and imaginary parts separately if the array is complex.
"""
a =3D asarray(a)
if not issubclass(a.dtype.type, _nx.inexact):
return a
if issubclass(a.dtype.type, _nx.complexfloating):
return round_(a.real, decimals) + 1j*round_(a.imag, decimals)
if decimals is not 0:
decimals =3D asarray(decimals)
s =3D sign(a)
if decimals is not 0:
a =3D absolute(multiply(a, 10.**decimals))
else:
a =3D absolute(a)
rem =3D a-asarray(a).astype(_nx.intp)
a =3D _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a))
# convert back
if decimals is not 0:
return multiply(a, s/(10.**decimals))
else:
return multiply(a, s)
I see many ways to improve the performance here. First, there is no
need to check for "decimals is not 0" three times. This can be done
once, maybe at the expense of some code duplication. Second,
_nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) seems to be
equivalent to _nx.floor(a+0.5). Finally, if rint is implemented as a
ufunc as Mads originally suggested, "decimals is 0" branch can just
call that.
It is tempting to rewrite the whole thing in C, but before I do that
I have a few questions about current implementation.
1. It is implemented in oldnumeric.py . Does this mean it is
deprecated. If so, what is the recommended replacement?
2. Was it intended to support array and fractional values for
decimals or is it an implementation artifact. Currently:
>>> around(array([1.2345]*5),[1,2,3,4,5])
array([ 1.2 , 1.23 , 1.235 , 1.2345, 1.2345])
>>> around(1.2345,2.5)
array(1.2332882874656679)
3. It does nothing to exact types, even if decimals<0
>>> around(1234, -2)
array(1234)
Is this a bug? Consider that
>>> round(1234, -2)
1200.0
and
>>> around(1234., -2)
array(1200.0)
Docstring is self-contradictory: "Rounding behaviour is equivalent to
Python" is not consistent with "Return 'a' if the array is not
floating point."
I propose to deprecate around and implement a new "round" member
function in C that will only accept scalar "decimals" and will behave
like a properly vectorized builtin round. I will do the coding if
there is interest.
In any case, something has to be done here. I don't think the
following timings are acceptable:
> python -m timeit -s "from numpy import array; x =3D array([1.5]*1000)" "(=
x+0.5).astype(int).astype(float)"
100000 loops, best of 3: 18.8 usec per loop
> python -m timeit -s "from numpy import array, around; x =3D array([1.5]*1=
000)" "around(x)"
10000 loops, best of 3: 155 usec per loop
On 2/21/06, Sasha <nd...@ma...> wrote:
> On the second thought, the difference between around and astype is not
> surprising because around operates in terms of decimals. Rather than
> adding rint, I would suggest to make a special case decimals=3D0 use C
> rint.
>
> > On 2/21/06, Mads Ipsen <mp...@os...> wrote:
> > > I suggest that rint() is added as a ufunc or is there any concerns
> > > here that I am not aware of?
> >
> > You might want to use astype(int). On my system it is much faster than=
around:
> >
> > > python -m timeit -s "from numpy import array, around; x =3D array([1.=
5]*1000)" "around(x)"
> > 10000 loops, best of 3: 176 usec per loop
> > > python -m timeit -s "from numpy import array, around; x =3D array([1.=
5]*1000)" "x.astype(int)"
> > 100000 loops, best of 3: 3.2 usec per loop
> >
> > the difference is too big to be explained by the fact that around
> > allocates twice as much memory for the result. In fact the following
> > equivalent of rint is still very fast:
> >
> > > python -m timeit -s "from numpy import array, around; x =3D array([1.=
5]*1000)" "x.astype(int).astype(float)"
> > 100000 loops, best of 3: 6.48 usec per loop
> >
>
|
|
From: Robert K. <rob...@gm...> - 2006-02-21 22:09:47
|
sk...@po... wrote:
> After a brief hiatus I'm back to trying to build numpy. Last time I checked
> in (on the scipy list) I had successfully built ATLAS and created this
> simple site.cfg file in .../numpy/distutils/site.cfg:
>
> [atlas]
> library_dirs = /home/titan/skipm/src/ATLAS/lib/SunOS_Babe
> include_dirs = /home/titan/skipm/src/ATLAS/include/SunOS_Babe
> # for overriding the names of the atlas libraries
> atlas_libs = lapack, f77blas, cblas, atlas
>
> I svn up'd (now at rev 2138), zapped my build directory, then executed
> "python setup.py build". Just in case it matters, I'm using Python 2.4.2
> built with GCC 3.4.1 on Solaris 8. Here's the output of my build attempt:
>
> Running from numpy source directory.
> No module named __svn_version__
> F2PY Version 2_2138
> blas_opt_info:
> blas_mkl_info:
> /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['mkl', 'vml', 'guide'] found_libs=[]
> warnings.warn("Library error: libs=%s found_libs=%s" % \
> NOT AVAILABLE
>
> atlas_blas_threads_info:
> Setting PTATLAS=ATLAS
> /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['lapack', 'f77blas', 'cblas', 'atlas'] found_libs=[]
> warnings.warn("Library error: libs=%s found_libs=%s" % \
> Setting PTATLAS=ATLAS
> Setting PTATLAS=ATLAS
> FOUND:
> libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
> library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe']
> language = c
> include_dirs = ['/opt/include']
> ...
>
> See my site.cfg file? Why does it affect library_dirs but not include_dirs?
Probably a bug, but I don't see exactly where at the moment. It shouldn't really
affect anything, I don't think. The only header file that comes with ATLAS is
cblas.h, and I'm pretty sure that numpy itself doesn't need it. In fact, we
provide our own copy in numpy/core/blasdot/ for the parts that can use it.
> running build_src
> building extension "atlas_version" sources
> adding 'build/src/atlas_version_0x33c6fa32.c' to sources.
> running build_ext
> customize UnixCCompiler
> customize UnixCCompiler using build_ext
> building 'atlas_version' extension
> compiling C sources
> gcc options: '-fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC'
> compile options: '-I/opt/include -Inumpy/core/include -I/opt/app/g++lib6/python-2.4/include/python2.4 -c'
> /opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so
> Text relocation remains referenced
> against symbol offset in file
> <unknown> 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
> <unknown> 0xc /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
> ... bunch of missing <unknown>s elided ...
> printf 0x1b /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
> printf 0x2d /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
> printf 0x3f /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
> printf 0x51 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
> ... what's this? can't find printf??? ...
> ld: fatal: relocations remain against allocatable but non-writable sections
> collect2: ld returned 1 exit status
Hmm. Was ATLAS compiled -fPIC? I'm afraid I'm a little out of my depth when it
comes to linking shared objects on Solaris.
> Text relocation remains referenced
> against symbol offset in file
> <unknown> 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
> ... more eliding ...
> printf 0x108 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
> ld: fatal: relocations remain against allocatable but non-writable sections
> collect2: ld returned 1 exit status
> ##### msg: error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1
> error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1
> FOUND:
> libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
> library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe']
> language = c
> define_macros = [('NO_ATLAS_INFO', 2)]
> include_dirs = ['/opt/include']
>
> Warning: distutils distribution has been initialized, it may be too late to add an extension _dotblas
> ...
>
> How can I initialize things earlier? Does it matter?
You get messages like this when something previous goes wrong. There's nothing
you can do to initialize things earlier except to make sure that the previous
steps don't fail. It's not the most informative error message, I know.
> Traceback (most recent call last):
> File "setup.py", line 76, in ?
> setup_package()
> File "setup.py", line 63, in setup_package
> config.add_subpackage('numpy')
> File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage
> config_list = self.get_subpackage(subpackage_name,subpackage_path)
> File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage
> subpackage_path)
> File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py
> config = setup_module.configuration(*args)
> File "/home/ink/skipm/src/numpy/numpy/setup.py", line 10, in configuration
> config.add_subpackage('core')
> File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage
> config_list = self.get_subpackage(subpackage_name,subpackage_path)
> File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage
> subpackage_path)
> File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py
> config = setup_module.configuration(*args)
> File "numpy/core/setup.py", line 215, in configuration
> config.add_data_dir('tests')
> File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 636, in add_data_dir
> self.add_data_files((ds,filenames))
> File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 702, in add_data_files
> dist.data_files.extend(data_dict.items())
> AttributeError: 'NoneType' object has no attribute 'extend'
>
> And finally, a traceback. What's up with that?
Essentially, the same issue here. Since an earlier step failed, dist.data_files
is still None.
--
Robert Kern
rob...@gm...
"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
|