From: Nick F. <nv...@MI...> - 2006-07-14 00:08:38
Dear all,

I often make use of numpy.vectorize to make programs read more like the
physics equations I write on paper. numpy.vectorize is basically a
wrapper for numpy.frompyfunc. Reading Travis's SciPy book (mine is dated
Jan 6, 2005) suggests to me that it returns a full-fledged ufunc exactly
like built-in ufuncs.

First, is this true? Second, how is the performance? That is, are my
functions performing approximately as fast as they could be, or would
they still gain a great deal of speed by being rewritten in C or some
other compiled Python accelerator?

As an aside, I've found the following function decorator to be helpful
for readability, and perhaps others will enjoy it or improve upon it:

    def autovectorized(f):
        """Function decorator to do vectorization only as necessary.
        Vectorized functions fail for scalar inputs."""
        def wrapper(input):
            if type(input) == numpy.ndarray:
                return numpy.vectorize(f)(input)
            return f(input)
        return wrapper

For those unfamiliar with the syntactic joys of Python 2.4, you can then
use this as:

    @autovectorized
    def myOtherwiseScalarFunction(*args):
        ...

and now the function will work with both numpy arrays and scalars.

Take care,
Nick
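For anyone who wants to try the decorator, here is a self-contained
sketch of its two code paths; `square` is just an illustrative stand-in
for a real scalar function:

```python
import numpy

def autovectorized(f):
    """Apply numpy.vectorize only when the input is an ndarray."""
    def wrapper(x):
        if isinstance(x, numpy.ndarray):
            return numpy.vectorize(f)(x)
        return f(x)
    return wrapper

@autovectorized
def square(x):
    return x * x

print(square(3))                       # scalar path: plain Python call -> 9
print(square(numpy.array([1, 2, 3])))  # array path: elementwise -> [1 4 9]
```

Using isinstance rather than an exact type comparison also lets ndarray
subclasses take the vectorized path.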
From: Nick F. <nv...@MI...> - 2006-07-14 16:44:00
On Jul 13, 2006, at 10:17 PM, Tim Hochberg wrote:

> Nick Fotopoulos wrote:
>> First, is this true?
> Well, according to type(), the result of frompyfunc is indeed of type
> ufunc, so I would say the answer to that is yes.
>> Second, how is the performance?
> A little timing indicates that it's not good (about 30x slower for
> computing x**2 than doing it using x*x on an array). That's not
> frompyfunc's (or vectorize's) fault, though. It's calling a Python
> function at each point, so the Python function call overhead is going
> to kill you. Not to mention instantiating an actual Python object or
> objects at each point.

That's unfortunate, since I tend to nest functions quite deeply and then
scipy.integrate.quad over them, which I'm sure results in a ridiculous
number of function calls. Are anonymous lambdas any different from named
functions in terms of performance?

>> i.e., are my functions performing approximately as fast as they could
>> be, or would they still gain a great deal of speed by rewriting them
>> in C or some other compiled Python accelerator?
> Can you give examples of what these functions look like? You might
> gain a great deal of speed by rewriting them in numpy in the correct
> way. Or perhaps not, but it's probably worth showing some examples so
> we can offer suggestions or at least admit that we are stumped.

This is by far the slowest bit of my code. I cache the results, so it's
not too bad, but any upstream tweak can take a lot of CPU time to
propagate.

    @autovectorized
    def dnsratezfunc(z):
        """Take coalescence time into account."""
        def integrand(zf):
            return Pz(z, zf) * NSbirthzfunc(zf)
        return quad(integrand, delayedZ(2e5*secperyear + 1, z), 5)[0]
    dnsratez = lambdap * dnsratezfunc(zs)

where:

    # Neutron star formation rate is a delayed version of the star
    # formation rate
    NSbirthzfunc = autovectorized(lambda z: SFRz(delayedZ(1e8*secperyear, z)))

    def Pz(z_c, z_f):
        """Return the probability density per unit redshift of a DNS
        coalescence at z_c given a progenitor formation at z_f."""
        return P(t(z_c, z_f)) * dtdz(z_c)

and there are many further nested levels of function calls. If the act
of calling a function is more expensive than actually executing it, and
I value speed over readability/code reuse, I can inline Pz's function
calls and inline the unvectorized NSbirthzfunc to reduce the calling
stack a bit. Any other suggestions?

Thanks, Tim.

Take care,
Nick
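One way to cut the per-point call overhead, if fixed-accuracy quadrature
is acceptable, is to evaluate the integrand on a whole grid at once and
integrate with the trapezoidal rule, replacing n scalar Python calls
with one vectorized evaluation. The `Pz` and `NSbirthzfunc` below are
hypothetical stand-ins; the real ones depend on cosmology helpers (`P`,
`t`, `dtdz`, `SFRz`, `delayedZ`) not shown in the thread:

```python
import numpy as np

# Hypothetical stand-ins for the thread's Pz and NSbirthzfunc.
def Pz(z_c, z_f):
    return np.exp(-(z_c - z_f) ** 2)

def NSbirthzfunc(z_f):
    return 1.0 / (1.0 + z_f)

def dnsratez_grid(z, n=513):
    """Trapezoidal alternative to quad: one vectorized evaluation of
    the integrand over n nodes instead of n scalar Python calls."""
    z_f = np.linspace(z, 5.0, n)           # integration nodes
    y = Pz(z, z_f) * NSbirthzfunc(z_f)     # whole-array evaluation
    return np.sum((y[1:] + y[:-1]) * np.diff(z_f)) / 2.0

print(dnsratez_grid(0.5))
```

The trade-off is that quad chooses its nodes adaptively to meet an error
tolerance, while a fixed grid does not, so n has to be chosen with the
integrand's smoothness in mind.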
From: David H. <dav...@gm...> - 2006-07-16 01:38:19
2006/7/14, Nick Fotopoulos <nv...@mi...>:
> Any other suggestions?

Hi Nick,

I had some success by coding the integrand in Fortran and wrapping it
with f2py. If your probability density function is standard, you may
find it in the flib library of Chris Fonnesbeck's PyMC module (a library
of likelihood functions coded in f77) and save yourself the trouble.

Hope this helps,
David
From: Tim H. <tim...@co...> - 2006-07-14 16:56:48
Nick Fotopoulos wrote:
> On Jul 13, 2006, at 10:17 PM, Tim Hochberg wrote:
<snip>
> That's unfortunate since I tend to nest functions quite deeply and
> then scipy.integrate.quad over them, which I'm sure results in a
> ridiculous number of function calls. Are anonymous lambdas any
> different than named functions in terms of performance?

Sorry, no. Under the covers they're the same.

> This is by far the slowest bit of my code. I cache the results, so
> it's not too bad, but any upstream tweak can take a lot of CPU time
> to propagate.
>
>     @autovectorized
>     def dnsratezfunc(z):
>         """Take coalescence time into account."""
>         def integrand(zf):
>             return Pz(z, zf) * NSbirthzfunc(zf)
>         return quad(integrand, delayedZ(2e5*secperyear + 1, z), 5)[0]
>     dnsratez = lambdap * dnsratezfunc(zs)
<snip>
> and there are many further nested levels of function calls. If the
> act of calling a function is more expensive than actually executing
> it and I value speed over readability/code reuse, I can inline Pz's
> function calls and inline the unvectorized NSbirthzfunc to reduce
> the calling stack a bit. Any other suggestions?

I think I'd try psyco (http://psyco.sourceforge.net/). That's pretty
painless to try and may result in a significant improvement.

-tim
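The slowdown Tim measures can be reproduced with a small timeit sketch;
the array size and repeat count here are arbitrary, and the exact ratio
will vary by machine:

```python
import timeit
import numpy as np

x = np.arange(100000, dtype=float)
sq = np.vectorize(lambda v: v ** 2)   # calls the Python lambda per element

t_native = timeit.timeit(lambda: x * x, number=20)  # native ufunc loop in C
t_vec = timeit.timeit(lambda: sq(x), number=20)     # Python call per element

print("x * x:      %.4f s" % t_native)
print("vectorize:  %.4f s" % t_vec)
print("slowdown:   %.1fx" % (t_vec / t_native))
```

Both paths compute the same values; only the per-element dispatch
differs, which is exactly the function-call overhead discussed above.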
From: Nick F. <nv...@MI...> - 2006-07-14 17:39:21
On Jul 14, 2006, at 12:56 PM, Tim Hochberg wrote:
<snip>
> I think I'd try psyco (http://psyco.sourceforge.net/). That's pretty
> painless to try and may result in a significant improvement.

I've been doing more and more development on my PPC Mac, where psyco is
not an option. If the speed issue really gets to me, I can run things
with psyco on a Linux box.

Thanks,
Nick
From: Travis O. <oli...@ie...> - 2006-07-16 04:01:48
Nick Fotopoulos wrote:
> Dear all,
>
> I often make use of numpy.vectorize to make programs read more like
> the physics equations I write on paper. numpy.vectorize is basically
> a wrapper for numpy.frompyfunc. Reading Travis's SciPy book (mine is
> dated Jan 6, 2005) suggests to me that it returns a full-fledged
> ufunc exactly like built-in ufuncs.
>
> First, is this true?

Yes, it is true. But it is a ufunc on Python object data-types. It is
calling the underlying Python function at every point in the loop.

> Second, how is the performance? i.e., are my functions performing
> approximately as fast as they could be or would they still gain a
> great deal of speed by rewriting it in C or some other compiled
> Python accelerator?

Absolutely, the functions could be made faster by avoiding the call
back into Python at each evaluation stage. I don't think it would be
too hard to replace the function call with something else that could be
evaluated more quickly. But this has not been done yet.

> As an aside, I've found the following function decorator to be
> helpful for readability, and perhaps others will enjoy it or improve
> upon it:

Thanks for the decorator. This should be put on the www.scipy.org wiki.

-Travis
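Travis's point about object data-types is easy to verify directly:
frompyfunc really does return a numpy.ufunc, but one whose output is an
array of boxed Python objects rather than a native numeric dtype:

```python
import numpy as np

f = np.frompyfunc(lambda a: a * a, 1, 1)  # one input, one output
print(type(f))                            # numpy.ufunc, same type as np.add

out = f(np.array([1.0, 2.0, 3.0]))
print(out.dtype)   # object: each element is a Python object, which is
                   # where the per-element overhead comes from
```

This is why the result usually needs an explicit .astype(float) before
being fed to code that expects a numeric array.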
From: Nick F. <nv...@MI...> - 2006-07-18 03:31:03
On Jul 16, 2006, at 12:01 AM, Travis Oliphant wrote:
> Thanks for the decorator. This should be put on the www.scipy.org
> wiki.

I've been looking over the wiki and am not sure where the best place
would be for such a snippet. Would it go with the numpy examples under
vectorize, or perhaps in a cookbook somewhere? This seems more
specialized than the basic numpy examples, but not worthy of its own
cookbook. In general, what do you do with constructs that seem useful
but aren't useful enough to just include somewhere in NumPy/SciPy? How
would anyone think to look for a tip like this?

Also, thanks for your helpful responses, and additional thanks to
Travis for the book update.

Take care,
Nick
From: Gary R. <gr...@bi...> - 2006-07-18 10:30:51
Nick Fotopoulos wrote:
> I've been looking over the wiki and am not sure where the best place
> would be for such a snippet. Would it go with the numpy examples
> under vectorize or perhaps in a cookbook somewhere?

Yes, it seems to me like a cookbook example. In the utopian future,
when there are as many cookbook examples as O'Reilly has, it'll be time
for a reorganisation, but for now, make it a cookbook entry.

Gary R.