You can subscribe to this list here.
| 2000 |
Jan
(8) |
Feb
(49) |
Mar
(48) |
Apr
(28) |
May
(37) |
Jun
(28) |
Jul
(16) |
Aug
(16) |
Sep
(44) |
Oct
(61) |
Nov
(31) |
Dec
(24) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2001 |
Jan
(56) |
Feb
(54) |
Mar
(41) |
Apr
(71) |
May
(48) |
Jun
(32) |
Jul
(53) |
Aug
(91) |
Sep
(56) |
Oct
(33) |
Nov
(81) |
Dec
(54) |
| 2002 |
Jan
(72) |
Feb
(37) |
Mar
(126) |
Apr
(62) |
May
(34) |
Jun
(124) |
Jul
(36) |
Aug
(34) |
Sep
(60) |
Oct
(37) |
Nov
(23) |
Dec
(104) |
| 2003 |
Jan
(110) |
Feb
(73) |
Mar
(42) |
Apr
(8) |
May
(76) |
Jun
(14) |
Jul
(52) |
Aug
(26) |
Sep
(108) |
Oct
(82) |
Nov
(89) |
Dec
(94) |
| 2004 |
Jan
(117) |
Feb
(86) |
Mar
(75) |
Apr
(55) |
May
(75) |
Jun
(160) |
Jul
(152) |
Aug
(86) |
Sep
(75) |
Oct
(134) |
Nov
(62) |
Dec
(60) |
| 2005 |
Jan
(187) |
Feb
(318) |
Mar
(296) |
Apr
(205) |
May
(84) |
Jun
(63) |
Jul
(122) |
Aug
(59) |
Sep
(66) |
Oct
(148) |
Nov
(120) |
Dec
(70) |
| 2006 |
Jan
(460) |
Feb
(683) |
Mar
(589) |
Apr
(559) |
May
(445) |
Jun
(712) |
Jul
(815) |
Aug
(663) |
Sep
(559) |
Oct
(930) |
Nov
(373) |
Dec
|
|
From: Alan G I. <ai...@am...> - 2006-02-21 21:42:13
|
On Tue, 21 Feb 2006, (CET) Mads Ipsen apparently wrote: > My system consists of N particles, whose coordinates in > the xy-plane is given by the two vectors x and y. I need > to calculate the distance between all particle pairs Of possible interest? http://www.cs.umd.edu/~mount/ANN/ Cheers, Alan Isaac |
|
From: Mads I. <mp...@os...> - 2006-02-21 21:04:31
|
On Tue, 21 Feb 2006, Tim Hochberg wrote: > Can you explain a little more about what you are trying to calculate? > The bit about subtracting off box*rint(dx/box) is a little odd. It > almost seems like you should be able to do something with fmod, but I > admit that I'm not sure how. > > If I had to guess as to source of the relative slowness I'd say it's > because you are creating a lot of temporary matrices. There are ways to > avoid this, but when taken to the extreme, they make your code look > ugly. You might try the following, untested, code or some variation and > see if it speeds things up. This makes extensive use of the little known > optional destination argument for ufuncs. I only tend to do this sort of > stuff where it's very critical since, as you can see, it makes things > quite ugly. > > dx_space = x.copy() > dy_space = y.copy() > scratch_space = x.copy() > for i in range(n-1): > dx = dx_space[i+1:] > dy = dy_space[i+1:] > scratch = scratch_space[i+1:] > subtract(x[i+1:], x[i], dx) > subtract(y[i+1:], y[i], dy) > # dx -= box*rint(dx/box) > divide(dx, box, scratch) > rint(scratch, scratch) > scratch *= box > dx -= scratch > # dy -= box*rint(dy/box) > divide(dy, box, scratch) > rint(scratch, scratch) > scratch *= box > dy -= scratch > r2 = dx**2 + dy**2 # square of dist. between points > > > > Hope that helps: > > -tim Here's what I am trying to do: My system consists of N particles, whose coordinates in the xy-plane is given by the two vectors x and y. I need to calculate the distance between all particle pairs, which goes like this: I pick particle 1 and calculate its distance to the N-1 other points. Then I pick particle 2. Since its distance to particle 1 was found in the previuos step, I only have to find its distance to the N-2 remaining points. In the i'th step, I therefore only have to consider particle i+1 up to particle N. That explains the loop structure, where dx = x[i+1:] - x[i] dy = y[i+1:] - y[i] the resulting vectors dx and dy will contain the x-distances from x[i] to the proceeding points from i+1 up to N. The square of the distance r2 is the given by r2 = dx**2 + dy**2 Another approach would be to use dx = subtract.outer(x,x) dy = subtract.outer(y,y) but that will be overkill, since all distances are counted twice, and also, the storage requirements grow rapidly if you have more than 1000 particles (app. 10^6 particle pairs). Thanks for your code feedback, which I'll have a closer look at. But I try to believe, that numpy/Numeric/Python was invented with the one purpose of avoiding coding like this - I think this is also a point you already made. But thanks again. // Mads |
|
From: Christopher B. <Chr...@no...> - 2006-02-21 20:48:18
|
Travis Oliphant wrote:
> You can mix arrays and matrices just fine if you remember that 1d arrays
> are equivalent to row-vectors.
and you can easily get a column vector out of an array, if you remember
that you want to keep it 2-d. i.e. use a slice rather than an index:
>>> import numpy as N
>>> a = N.ones((5,10))
>>> a[:,1].shape # an index: it reduces the rank
(5,)
>>> a[:,1:2].shape # a slice: it keeps the rank
(5, 1)
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chr...@no...
|
|
From: <sk...@po...> - 2006-02-21 20:20:00
|
After a brief hiatus I'm back to trying to build numpy. Last time I checked
in (on the scipy list) I had successfully built ATLAS and created this
simple site.cfg file in .../numpy/distutils/site.cfg:
[atlas]
library_dirs = /home/titan/skipm/src/ATLAS/lib/SunOS_Babe
include_dirs = /home/titan/skipm/src/ATLAS/include/SunOS_Babe
# for overriding the names of the atlas libraries
atlas_libs = lapack, f77blas, cblas, atlas
I svn up'd (now at rev 2138), zapped my build directory, then executed
"python setup.py build". Just in case it matters, I'm using Python 2.4.2
built with GCC 3.4.1 on Solaris 8. Here's the output of my build attempt:
Running from numpy source directory.
No module named __svn_version__
F2PY Version 2_2138
blas_opt_info:
blas_mkl_info:
/home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['mkl', 'vml', 'guide'] found_libs=[]
warnings.warn("Library error: libs=%s found_libs=%s" % \
NOT AVAILABLE
atlas_blas_threads_info:
Setting PTATLAS=ATLAS
/home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['lapack', 'f77blas', 'cblas', 'atlas'] found_libs=[]
warnings.warn("Library error: libs=%s found_libs=%s" % \
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
FOUND:
libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe']
language = c
include_dirs = ['/opt/include']
...
See my site.cfg file? Why does it affect library_dirs but not include_dirs?
running build_src
building extension "atlas_version" sources
adding 'build/src/atlas_version_0x33c6fa32.c' to sources.
running build_ext
customize UnixCCompiler
customize UnixCCompiler using build_ext
building 'atlas_version' extension
compiling C sources
gcc options: '-fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC'
compile options: '-I/opt/include -Inumpy/core/include -I/opt/app/g++lib6/python-2.4/include/python2.4 -c'
/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so
Text relocation remains referenced
against symbol offset in file
<unknown> 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
<unknown> 0xc /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
... bunch of missing <unknown>s elided ...
printf 0x1b /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
printf 0x2d /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
printf 0x3f /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
printf 0x51 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
... what's this? can't find printf??? ...
ld: fatal: relocations remain against allocatable but non-writable sections
collect2: ld returned 1 exit status
Text relocation remains referenced
against symbol offset in file
<unknown> 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
... more eliding ...
printf 0x108 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o)
ld: fatal: relocations remain against allocatable but non-writable sections
collect2: ld returned 1 exit status
##### msg: error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1
error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1
FOUND:
libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe']
language = c
define_macros = [('NO_ATLAS_INFO', 2)]
include_dirs = ['/opt/include']
Warning: distutils distribution has been initialized, it may be too late to add an extension _dotblas
...
How can I initialize things earlier? Does it matter?
Traceback (most recent call last):
File "setup.py", line 76, in ?
setup_package()
File "setup.py", line 63, in setup_package
config.add_subpackage('numpy')
File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage
config_list = self.get_subpackage(subpackage_name,subpackage_path)
File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage
subpackage_path)
File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py
config = setup_module.configuration(*args)
File "/home/ink/skipm/src/numpy/numpy/setup.py", line 10, in configuration
config.add_subpackage('core')
File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage
config_list = self.get_subpackage(subpackage_name,subpackage_path)
File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage
subpackage_path)
File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py
config = setup_module.configuration(*args)
File "numpy/core/setup.py", line 215, in configuration
config.add_data_dir('tests')
File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 636, in add_data_dir
self.add_data_files((ds,filenames))
File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 702, in add_data_files
dist.data_files.extend(data_dict.items())
AttributeError: 'NoneType' object has no attribute 'extend'
And finally, a traceback. What's up with that?
In parallel with trying to build with ATLAS I'm also trying Travis's
suggestion of explicitly setting PTATLAS, ATLAS and BLAS to "None". Numpy
builds when I do that.
--
Skip Montanaro - sk...@po...
"The values to which people cling most stubbornly under inappropriate
conditions are those values that were previously the source of their
greatest triumphs over adversity." -- Jared Diamond in "Collapse"
|
|
From: Travis O. <oli...@ie...> - 2006-02-21 19:12:47
|
Sven Schreiber wrote: >Hi, sometimes I'm still struggling with peculiarities of numpy-arrays >vs. numpy-matrices; my latest story goes like this: > >I first slice out a column of a 2d-numpy-array (a = somearray[:,1]). I >can just manage to understand the resulting shape ( == (112,) ). > >Then I slice a column from a numpy-matrix b = somematrix[:,1] and get >the expected (112,1) shape. > >Then I do what I thought was the easiest thing in the world, I subtract >the two vectors: c = a - b >I was very surprised by the bug that showed up due to the fact that >c.shape == (112,112) !! > > As you know this isn't a bug, but very expected behavior. I don't see this changing any time soon. Arrays are different than matrices. Matrices are always 2-d arrays while arrays can have any number of dimensions. The default relationship between arrays and matrices is that 1-d arrays get converted to row-matrices (1,N). Regardless of which convention is chosen somebody will be bitten by that conversion if they think in terms of the other default. I don't see a way around that except to be careful when you mix arrays and matrices. >Next, I try to workaround by b.squeeze(). That seems to work, but why is >b.squeeze().shape == (1, 112) instead of (112,)? > > Again the same reason as before. A matrix is returned from b.squeeze() and there are no 1-d matrices. Thus, you get a row-vector. Use .T if you want a column vector. >Then I thought maybe b.flattened() does the job, but then I get an error >(matrix has no attr flattened). Again, I'm baffled. > > The correct spelling is b.flatten() And again you are going to get a (1,N) matrix out because of how 1d arrays are interpreted as matrices. In short, there is no way to get a 1-d matrix because that doesn't make sense. You can get a 1-d array using b.A.squeeze() >Second (preliminary) conclusion: I will paranoically use even more >asmatrix()-conversions in my code to avoid dealing with those >array-beasts ;-) and get column vectors I can trust... > > > >Is there a better general advice than to say: "numpy-matrices and >numpy-arrays are best kept in separated worlds" ? > > You can mix arrays and matrices just fine if you remember that 1d arrays are equivalent to row-vectors. -Travis |
|
From: Tim H. <tim...@co...> - 2006-02-21 19:11:38
|
Mads Ipsen wrote:
>On Tue, 21 Feb 2006, Zachary Pincus wrote:
>
>
>
>>Mads,
>>
>>The game with numpy, just as it is with Matlab or any other
>>interpreted numeric environment, is to try push as much of the
>>looping down into the C code as you can. This is because, as you now
>>know, compiled C can loop much faster than interpreted python.
>>
>>A simple example for averaging 1000 (x,y,z) points:
>>
>>print data.shape
>>(1000, 3)
>># bad: explicit for loop in python
>>avg = numpy.zeros(3, numpy.float_)
>>for i in data: avg += i
>>avg /= 1000.0
>>
>># good: implicit for loop in C
>>avg = numpy.add.reduce(data, axis = 0)
>>avg /= 1000.0
>>
>>In your case, instead of explicitly looping through each point, why
>>not do the calculations in parallel, operating on entire vectors of
>>points at one time? Then the looping is "pushed down" into compiled C
>>code. Or if you're really lucky, it's pushed all the way down to the
>>vector math units on your cpu if you have a good BLAS or whatever
>>installed.
>>
>>Zach Pincus
>>
>>Program in Biomedical Informatics and Department of Biochemistry
>>Stanford University School of Medicine
>>
>>
>>
>>
>>
>>>2. Here is the main loop for finding all possible pair distances,
>>> which corresponds to a loop over the upper triangular part of a
>>> square matrix
>>>
>>> # Loop over all particles
>>> for i in range(n-1):
>>> dx = x[i+1:] - x[i]
>>> dy = y[i+1:] - y[i]
>>>
>>> dx -= box*rint(dx/box)
>>> dy -= box*rint(dy/box)
>>>
>>> r2 = dx**2 + dy**2 # square of dist. between points
>>>
>>>where x and y contain the positions of the particles. A naive
>>>implementation in C is
>>>
>>>
>>> // loop over all particles
>>> for (int i=0; i<n-1; i++) {
>>> for (int j=i+1; j<n; j++) {
>>> dx = x[j] - x[i];
>>> dy = y[j] - y[i];
>>>
>>> dx -= box*rint(dx/box);
>>> dy -= box*rint(dy/box);
>>>
>>> r2 = dx*dx + dy*dy;
>>> }
>>> }
>>>
>>>For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is
>>>app. 10 times faster than the Python/Numeric counterpart. This is of
>>>course not satisfactory.
>>>
>>>Are there any things I am doing completely wrong here, basic
>>>approaches completely misunderstood, misuses etc?
>>>
>>>Any suggestions, guidelines, hints are most welcome.
>>>
>>>Best regards,
>>>
>>>Mads Ipsen
>>>
>>>
>>>+---------------------------------+-------------------------+
>>>| Mads Ipsen | |
>>>| Dept. of Chemistry | phone: +45-35320220 |
>>>| H.C.Ørsted Institute | fax: +45-35320322 |
>>>| Universitetsparken 5 | |
>>>| DK-2100 Copenhagen Ø, Denmark | mp...@os... |
>>>+---------------------------------+-------------------------+
>>>
>>>
>>>-------------------------------------------------------
>>>This SF.net email is sponsored by: Splunk Inc. Do you grep through
>>>log files
>>>for problems? Stop! Download the new AJAX search engine that makes
>>>searching your log files as easy as surfing the web. DOWNLOAD
>>>SPLUNK!
>>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642
>>>_______________________________________________
>>>Numpy-discussion mailing list
>>>Nump
>>>
>>>
>
>I agree completely with your comments. But, as you can, the innermost part
>of the loop has been removed in the code, and replaced with numpy slices.
>It's hard for me to see how to compress the outer loop as well, since it
>determines the ranges for the inner loop. Unless there is some fancy slice
>notation, that allows you to loop over a triangular part of a matrix, ie.
>
> x[i] = sum(A[i+1,i])
>
>meaning x[i] = sum of elements in i'th row of A using only elements from
>position i+1 up to n.
>
>Of course, there is the possibility of hardcoding this in C and then make
>it available as a Python module. But I don't want to do this before I am
>sure there isn't a numpy way out this.
>
>Let me know, if you have any suggestions.
>
Can you explain a little more about what you are trying to calculate?
The bit about subtracting off box*rint(dx/box) is a little odd. It
almost seems like you should be able to do something with fmod, but I
admit that I'm not sure how.
If I had to guess as to source of the relative slowness I'd say it's
because you are creating a lot of temporary matrices. There are ways to
avoid this, but when taken to the extreme, they make your code look
ugly. You might try the following, untested, code or some variation and
see if it speeds things up. This makes extensive use of the little known
optional destination argument for ufuncs. I only tend to do this sort of
stuff where it's very critical since, as you can see, it makes things
quite ugly.
dx_space = x.copy()
dy_space = y.copy()
scratch_space = x.copy()
for i in range(n-1):
dx = dx_space[i+1:]
dy = dy_space[i+1:]
scratch = scratch_space[i+1:]
subtract(x[i+1:], x[i], dx)
subtract(y[i+1:], y[i], dy)
# dx -= box*rint(dx/box)
divide(dx, box, scratch)
rint(scratch, scratch)
scratch *= box
dx -= scratch
# dy -= box*rint(dy/box)
divide(dy, box, scratch)
rint(scratch, scratch)
scratch *= box
dy -= scratch
r2 = dx**2 + dy**2 # square of dist. between points
Hope that helps:
-tim
|
|
From: Travis O. <oli...@ie...> - 2006-02-21 18:57:55
|
Tim Hochberg wrote: > Yeah, sort of. I meant that the little helper functions that ufuncs > call, such as DOUBLE_multiply, take the same types of arguments. > However, I just realized that I'm not certain that's true -- I just > assumed it because all the one's I've ever seen do. Also, this isn't > really a problem anyway -- the real problem is the slow conversion of > Python scalars to arrays in ufuncs. Yes, that is true. We have only defined multiplication for same-types. But, I just wanted to clarify that the ufunc machinery is more general than that, because others have been confused in the past. > I took a look at this earlier and it appears that the reason that > conversion of Python scalars are slow is that FromAny trys every other > conversion first. The check for Python scalars looks pretty cheap, so > it seems reasonable to check for them and do the appropriate > conversion early. Do the ufunc's call EnsureArray or FromAny? If the > former it would seem pretty straighforward to just stick another check > in there. Then David's original strategy of optimizing in DOUBLE_pow > should be close to as fast as what I'm doing. Yes, I suspect the biggest slow-downs are the two attribute lookups which allow anything with __array__ or the array interface defined to be used. I think we could special-case Python scalars in that code. -Travis |
|
From: Mads I. <mp...@os...> - 2006-02-21 18:24:10
|
On Tue, 21 Feb 2006, Zachary Pincus wrote:
> Mads,
>
> The game with numpy, just as it is with Matlab or any other
> interpreted numeric environment, is to try push as much of the
> looping down into the C code as you can. This is because, as you now
> know, compiled C can loop much faster than interpreted python.
>
> A simple example for averaging 1000 (x,y,z) points:
>
> print data.shape
> (1000, 3)
> # bad: explicit for loop in python
> avg =3D numpy.zeros(3, numpy.float_)
> for i in data: avg +=3D i
> avg /=3D 1000.0
>
> # good: implicit for loop in C
> avg =3D numpy.add.reduce(data, axis =3D 0)
> avg /=3D 1000.0
>
> In your case, instead of explicitly looping through each point, why
> not do the calculations in parallel, operating on entire vectors of
> points at one time? Then the looping is "pushed down" into compiled C
> code. Or if you're really lucky, it's pushed all the way down to the
> vector math units on your cpu if you have a good BLAS or whatever
> installed.
>
> Zach Pincus
>
> Program in Biomedical Informatics and Department of Biochemistry
> Stanford University School of Medicine
>
>
>
> > 2. Here is the main loop for finding all possible pair distances,
> > which corresponds to a loop over the upper triangular part of a
> > square matrix
> >
> > # Loop over all particles
> > for i in range(n-1):
> > dx =3D x[i+1:] - x[i]
> > dy =3D y[i+1:] - y[i]
> >
> > dx -=3D box*rint(dx/box)
> > dy -=3D box*rint(dy/box)
> >
> > r2 =3D dx**2 + dy**2 # square of dist. between points
> >
> > where x and y contain the positions of the particles. A naive
> > implementation in C is
> >
> >
> > // loop over all particles
> > for (int i=3D0; i<n-1; i++) {
> > for (int j=3Di+1; j<n; j++) {
> > dx =3D x[j] - x[i];
> > dy =3D y[j] - y[i];
> >
> > dx -=3D box*rint(dx/box);
> > dy -=3D box*rint(dy/box);
> >
> > r2 =3D dx*dx + dy*dy;
> > }
> > }
> >
> > For n =3D 2500 particles, i.e. 3123750 particle pairs, the C loop is
> > app. 10 times faster than the Python/Numeric counterpart. This is of
> > course not satisfactory.
> >
> > Are there any things I am doing completely wrong here, basic
> > approaches completely misunderstood, misuses etc?
> >
> > Any suggestions, guidelines, hints are most welcome.
> >
> > Best regards,
> >
> > Mads Ipsen
> >
> >
> > +---------------------------------+-------------------------+
> > | Mads Ipsen | |
> > | Dept. of Chemistry | phone: +45-35320220 |
> > | H.C.=D8rsted Institute | fax: +45-35320322 |
> > | Universitetsparken 5 | |
> > | DK-2100 Copenhagen =D8, Denmark | mp...@os... |
> > +---------------------------------+-------------------------+
> >
> >
> > -------------------------------------------------------
> > This SF.net email is sponsored by: Splunk Inc. Do you grep through
> > log files
> > for problems? Stop! Download the new AJAX search engine that makes
> > searching your log files as easy as surfing the web. DOWNLOAD
> > SPLUNK!
> > http://sel.as-us.falkag.net/sel?cmd=3Dlnk&kid=103432&bid#0486&dat=12164=
2
> > _______________________________________________
> > Numpy-discussion mailing list
> > Nump
I agree completely with your comments. But, as you can, the innermost part
of the loop has been removed in the code, and replaced with numpy slices.
It's hard for me to see how to compress the outer loop as well, since it
determines the ranges for the inner loop. Unless there is some fancy slice
notation, that allows you to loop over a triangular part of a matrix, ie.
x[i] =3D sum(A[i+1,i])
meaning x[i] =3D sum of elements in i'th row of A using only elements from
position i+1 up to n.
Of course, there is the possibility of hardcoding this in C and then make
it available as a Python module. But I don't want to do this before I am
sure there isn't a numpy way out this.
Let me know, if you have any suggestions.
// Mads
|
|
From: Zachary P. <zp...@st...> - 2006-02-21 17:18:13
|
Mads,
The game with numpy, just as it is with Matlab or any other =20
interpreted numeric environment, is to try push as much of the =20
looping down into the C code as you can. This is because, as you now =20
know, compiled C can loop much faster than interpreted python.
A simple example for averaging 1000 (x,y,z) points:
print data.shape
(1000, 3)
# bad: explicit for loop in python
avg =3D numpy.zeros(3, numpy.float_)
for i in data: avg +=3D i
avg /=3D 1000.0
# good: implicit for loop in C
avg =3D numpy.add.reduce(data, axis =3D 0)
avg /=3D 1000.0
In your case, instead of explicitly looping through each point, why =20
not do the calculations in parallel, operating on entire vectors of =20
points at one time? Then the looping is "pushed down" into compiled C =20=
code. Or if you're really lucky, it's pushed all the way down to the =20
vector math units on your cpu if you have a good BLAS or whatever =20
installed.
Zach Pincus
Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine
> 2. Here is the main loop for finding all possible pair distances,
> which corresponds to a loop over the upper triangular part of a
> square matrix
>
> # Loop over all particles
> for i in range(n-1):
> dx =3D x[i+1:] - x[i]
> dy =3D y[i+1:] - y[i]
>
> dx -=3D box*rint(dx/box)
> dy -=3D box*rint(dy/box)
>
> r2 =3D dx**2 + dy**2 # square of dist. between points
>
> where x and y contain the positions of the particles. A naive
> implementation in C is
>
>
> // loop over all particles
> for (int i=3D0; i<n-1; i++) {
> for (int j=3Di+1; j<n; j++) {
> dx =3D x[j] - x[i];
> dy =3D y[j] - y[i];
>
> dx -=3D box*rint(dx/box);
> dy -=3D box*rint(dy/box);
>
> r2 =3D dx*dx + dy*dy;
> }
> }
>
> For n =3D 2500 particles, i.e. 3123750 particle pairs, the C loop is
> app. 10 times faster than the Python/Numeric counterpart. This is of
> course not satisfactory.
>
> Are there any things I am doing completely wrong here, basic
> approaches completely misunderstood, misuses etc?
>
> Any suggestions, guidelines, hints are most welcome.
>
> Best regards,
>
> Mads Ipsen
>
>
> +---------------------------------+-------------------------+
> | Mads Ipsen | |
> | Dept. of Chemistry | phone: +45-35320220 |
> | H.C.=D8rsted Institute | fax: +45-35320322 |
> | Universitetsparken 5 | |
> | DK-2100 Copenhagen =D8, Denmark | mp...@os... |
> +---------------------------------+-------------------------+
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc. Do you grep through =20
> log files
> for problems? Stop! Download the new AJAX search engine that makes
> searching your log files as easy as surfing the web. DOWNLOAD =20
> SPLUNK!
> http://sel.as-us.falkag.net/sel?cmd=3Dlnk&kid=103432&bid#0486&dat=121642=
> _______________________________________________
> Numpy-discussion mailing list
> Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
|
|
From: Pau G. <pau...@gm...> - 2006-02-21 16:01:16
|
> the closest integer, i.e. for x =3D array([1.1, 1.8]) > > around(x) =3D [1.0, 2.0] > > whereas > > x.astype(int).astype(float) =3D [1.0, 1.0] > (x+0.5).astype(int).astype(float) =3D [1.0, 2.0] i hope it helps, pau |
|
From: Stefan v. d. W. <st...@su...> - 2006-02-21 15:55:44
|
Hi Gary
Thanks for your suggestions. I incorporated them.
St=E9fan
On Wed, Feb 22, 2006 at 01:24:24AM +1100, Gary Ruben wrote:
> 1. I'd put 'assumes from numpy import *' in the preamble.
> 2. Is it possible to change the formatting to make it more obvious what=
=20
> is input and what is output? I think it is better to show the input and=
=20
> output with a standard Python prompt a'la idle or possibly ipython.
> 3. I think it might be worth pointing out that
>=20
> img =3D array([(0,0,0), (1,0,0), (0,1,0), (0,0,1)], [('r',Float32),('g'=
,F
> loat32),('b',Float32)])
>=20
> is valid syntax that can be replaced by the 2-line version you present.=
=20
> 4. Can you explain dtype=3D(void,12)?
> 5. When the page's name is changed, a link should be put to it in the=20
> 'Getting Started and Tutorial' section of the Documentation page.
|
|
From: Sasha <nd...@ma...> - 2006-02-21 15:31:49
|
On 2/21/06, Mads Ipsen <mp...@os...> wrote: > Maybe I am wrong here, but around() and rint() is supposed to round to > the closest integer, i.e. for x =3D array([1.1, 1.8]) You are right. In the follow-up I've suggested to speed-up the case decimals=3D0 in around in around instead of adding another function. I think that would be a more "pythonic" solution. |
|
From: Mads I. <mp...@os...> - 2006-02-21 15:23:01
|
On Tue, 21 Feb 2006, Alexander Belopolsky wrote: > On 2/21/06, Mads Ipsen <mp...@os...> wrote: > > I suggest that rint() is added as a ufunc or is there any concerns > > here that I am not aware of? > > You might want to use astype(int). On my system it is much faster than around: > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)" > 10000 loops, best of 3: 176 usec per loop > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)" > 100000 loops, best of 3: 3.2 usec per loop > > the difference is too big to be explained by the fact that around > allocates twice as much memory for the result. In fact the following > equivalent of rint is still very fast: > > > python -m timeit -s "from numpy import array, around; Maybe I am wrong here, but around() and rint() is supposed to round to the closest integer, i.e. for x = array([1.1, 1.8]) around(x) = [1.0, 2.0] whereas x.astype(int).astype(float) = [1.0, 1.0] This particular property of around() as well as rint() is crucial for my application. // Mads |
|
From: Bruce S. <bso...@gm...> - 2006-02-21 15:15:20
|
Hi, In the current version, note that Y is scalar so replace the squaring (Y**2) with Y*Y as you do in the dohebb function. On my system without blas etc removing the squaring removes a few seconds (16.28 to 12.4). It did not seem to help factorizing Y. Also, eta and tau are constants so define them only once as scalars outside the loops and do the division outside the loop. It only saves about 0.2 seconds but these add up. The inner loop probably can be vectorized because it is just vector operations on a matrix. You are just computing over the ith dimension of X. I think that you could be able to find the matrix version on the net. Regards Bruce On 2/21/06, Brian Blais <bb...@br...> wrote: > Hello, > > I am trying to translate some Matlab/mex code to Python, for doing neural > simulations. This application is definitely computing-time limited, and = I need to > optimize at least one inner loop of the code, or perhaps even rethink the= algorithm. > The procedure is very simple, after initializing any variables: > > 1) select a random input vector, which I will call "x". right now I have= it as an > array, and I choose columns from that array randomly. in other cases, I = may need to > take an image, select a patch, and then make that a column vector. > > 2) calculate an output value, which is the dot product of the "x" and a w= eight > vector, "w", so > > y=3Ddot(x,w) > > 3) modify the weight vector based on a matrix equation, like: > > w=3Dw+ eta * (y*x - y**2*w) > ^ > | > +---- learning rate constant > > 4) repeat steps 1-3 many times > > I've organized it like: > > for e in 100: # outer loop > for i in 1000: # inner loop > (steps 1-3) > > display things. > > so that the bulk of the computation is in the inner loop, and is amenable= to > converting to a faster language. This is my issue: > > straight python, in the example posted below for 250000 inner-loop steps,= takes 20 > seconds for each outer-loop step. I tried Pyrex, which should work very = fast on such > a problem, takes about 8.5 seconds per outer-loop step. The same code as= a C-mex > file in matlab takes 1.5 seconds per outer-loop step. > > Given the huge difference between the Pyrex and the Mex, I feel that ther= e is > something I am doing wrong, because the C-code for both should run compar= ably. > Perhaps the approach is wrong? I'm willing to take any suggestions! I d= on't mind > coding some in C, but the Python API seemed a bit challenging to me. > > One note: I am using the Numeric package, not numpy, only because I want = to be able > to use the Enthought version for Windows. I develop on Linux, and haven'= t had a > chance to see if I can compile numpy using the Enthought Python for Windo= ws. > > If there is anything else anyone needs to know, I'll post it. I put the = main script, > and a dohebb.pyx code below. > > > thanks! > > Brian Blais > > -- > ----------------- > > bb...@br... > http://web.bryant.edu/~bblais > > > > > # Main script: > > from dohebb import * > import pylab as p > from Numeric import * > from RandomArray import * > import time > > x=3Drandom((100,1000)) # 1000 input vectors > > numpats=3Dx.shape[0] > w=3Drandom((numpats,1)); > > th=3Drandom((1,1)) > > params=3D{} > params['eta']=3D0.001; > params['tau']=3D100.0; > old_mx=3D0; > for e in range(100): > > rnd=3Drandint(0,numpats,250000) > t1=3Dtime.time() > if 0: # straight python > for i in range(len(rnd)): > pat=3Drnd[i] > xx=3Dreshape(x[:,pat],(1,-1)) > y=3Dmatrixmultiply(xx,w) > w=3Dw+params['eta']*(y*transpose(xx)-y**2*w); > th=3Dth+(1.0/params['tau'])*(y**2-th); > else: # pyrex > dohebb(params,w,th,x,rnd) > print time.time()-t1 > > > p.plot(w,'o-') > p.xlabel('weights') > p.show() > > > #=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > # dohebb.pyx > > cdef extern from "Numeric/arrayobject.h": > > struct PyArray_Descr: > int type_num, elsize > char type > > ctypedef class Numeric.ArrayType [object PyArrayObject]: > cdef char *data > cdef int nd > cdef int *dimensions, *strides > cdef object base > cdef PyArray_Descr *descr > cdef int flags > > > def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd): > > > cdef int num_iterations > cdef int num_inputs > cdef int offset > cdef double *wp,*xp,*thp > cdef int *rndp > cdef double eta,tau > > eta=3Dparams['eta'] # learning rate > tau=3Dparams['tau'] # used for variance estimate > > cdef double y > num_iterations=3Drnd.dimensions[0] > num_inputs=3Dw.dimensions[0] > > # get the pointers > wp=3D<double *>w.data > xp=3D<double *>X.data > rndp=3D<int *>rnd.data > thp=3D<double *>th.data > > for it from 0 <=3D it < num_iterations: > > offset=3Drndp[it]*num_inputs > > # calculate the output > y=3D0.0 > for i from 0 <=3D i < num_inputs: > y=3Dy+wp[i]*xp[i+offset] > > # change in the weights > for i from 0 <=3D i < num_inputs: > wp[i]=3Dwp[i]+eta*(y*xp[i+offset] - y*y*wp[i]) > > # estimate the variance > thp[0]=3Dthp[0]+(1.0/tau)*(y**2-thp[0]) > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log fi= les > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=3Dlnk&kid=3D103432&bid=3D230486&dat= =3D121642 > _______________________________________________ > Numpy-discussion mailing list > Num...@li... > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > |
|
From: Travis N. V. <tr...@en...> - 2006-02-21 15:11:42
|
Travis N. Vaught wrote: > > I've renamed it. Now the page is at: > > http://www.scipy.org/RecordArray > Doh! That should have been http://www.scipy.org/RecordArrays . |
|
From: Travis N. V. <tr...@en...> - 2006-02-21 15:05:10
|
Stefan van der Walt wrote: > I wrote a short tutorial on using record arrays, which can be found at > > http://www.scipy.org/ArrayRecords > > The page is named ArrayRecords instead of RecordArrays, so I'd be glad > if someone with priviledges could rename it. Also, please fix any > mistakes I might have made. > ... I've renamed it. Now the page is at: http://www.scipy.org/RecordArray Travis |
|
From: Sasha <nd...@ma...> - 2006-02-21 14:59:07
|
On the second thought, the difference between around and astype is not surprising because around operates in terms of decimals. Rather than adding rint, I would suggest to make a special case decimals=3D0 use C rint. > On 2/21/06, Mads Ipsen <mp...@os...> wrote: > > I suggest that rint() is added as a ufunc or is there any concerns > > here that I am not aware of? > > You might want to use astype(int). On my system it is much faster than a= round: > > > python -m timeit -s "from numpy import array, around; x =3D array([1.5]= *1000)" "around(x)" > 10000 loops, best of 3: 176 usec per loop > > python -m timeit -s "from numpy import array, around; x =3D array([1.5]= *1000)" "x.astype(int)" > 100000 loops, best of 3: 3.2 usec per loop > > the difference is too big to be explained by the fact that around > allocates twice as much memory for the result. In fact the following > equivalent of rint is still very fast: > > > python -m timeit -s "from numpy import array, around; x =3D array([1.5]= *1000)" "x.astype(int).astype(float)" > 100000 loops, best of 3: 6.48 usec per loop > |
|
From: Alexander B. <ale...@gm...> - 2006-02-21 14:51:31
|
On 2/21/06, Mads Ipsen <mp...@os...> wrote: > I suggest that rint() is added as a ufunc or is there any concerns > here that I am not aware of? You might want to use astype(int). On my system it is much faster than aro= und: > python -m timeit -s "from numpy import array, around; x =3D array([1.5]*1= 000)" "around(x)" 10000 loops, best of 3: 176 usec per loop > python -m timeit -s "from numpy import array, around; x =3D array([1.5]*1= 000)" "x.astype(int)" 100000 loops, best of 3: 3.2 usec per loop the difference is too big to be explained by the fact that around allocates twice as much memory for the result. In fact the following equivalent of rint is still very fast: > python -m timeit -s "from numpy import array, around; x =3D array([1.5]*1= 000)" "x.astype(int).astype(float)" 100000 loops, best of 3: 6.48 usec per loop |
|
From: Gary R. <gr...@bi...> - 2006-02-21 14:24:36
|
Thanks for this Stéfan,
Can I make some observations. I don't want to just change your
formatting. I think it would be good to have some discussion about the
formatting used in tutorials like this, because all should probably
follow a standard presentation style.
I like the usage summary at the end.
1. I'd put 'assumes from numpy import *' in the preamble.
2. Is it possible to change the formatting to make it more obvious what
is input and what is output? I think it is better to show the input and
output with a standard Python prompt a'la idle or possibly ipython.
A couple of things specific to your examples:
3. I think it might be worth pointing out that
img = array([(0,0,0), (1,0,0), (0,1,0), (0,0,1)], [('r',Float32),('g',F
loat32),('b',Float32)])
is valid syntax that can be replaced by the 2-line version you present.
Should the valid syntax for creating a record array be presented in EBNF
format?
4. Can you explain dtype=(void,12)?
5. When the page's name is changed, a link should be put to it in the
'Getting Started and Tutorial' section of the Documentation page.
What do you and others think?
Gary R.
Stefan van der Walt wrote:
> I wrote a short tutorial on using record arrays, which can be found at
>
> http://www.scipy.org/ArrayRecords
>
> The page is named ArrayRecords instead of RecordArrays, so I'd be glad
> if someone with priviledges could rename it. Also, please fix any
> mistakes I might have made.
>
> Regards
> Stéfan
|
|
From: Mads I. <mp...@os...> - 2006-02-21 14:17:15
|
Hey,
I am relatively new to python and Numeric, and am currently involved
in a project of developing molecular dynamics code written in python
and using Numeric for number crunching. During this, I've run into
some problems, that I hope I can get some assistance with here. My
main issues here are:
1. around() appears to be slow
2. C code appears to be much faster
1. One of the bottle necks in MD is the calculation of the distances
between all particle pairs.
In MD simulations with periodic boundary conditions, you have to
estimate the shortest distance between all the particle pairs in your
system. Fx. on a line of length box =3D 10, the distance dx between two
points x0 =3D 1 and x1 =3D 9, will be dx =3D -2 (and NOT dx =3D 9). One way=
to
do this in numpy is
dx =3D x1 - x0
dx -=3D around(dx/box)
My first observation here, is that around() seems to be very slow. So
I looked in umathmodule.c and implemented rint() from the C math
library and made my own custom Numeric module. This gives a speed-up
by a factor of app. 4 compared to around().
I suggest that rint() is added as a ufunc or is there any concerns
here that I am not aware of?
2. Here is the main loop for finding all possible pair distances,
which corresponds to a loop over the upper triangular part of a
square matrix
# Loop over all particles
for i in range(n-1):
dx =3D x[i+1:] - x[i]
dy =3D y[i+1:] - y[i]
dx -=3D box*rint(dx/box)
dy -=3D box*rint(dy/box)
r2 =3D dx**2 + dy**2 # square of dist. between points
where x and y contain the positions of the particles. A naive
implementation in C is
// loop over all particles
for (int i=3D0; i<n-1; i++) {
for (int j=3Di+1; j<n; j++) {
dx =3D x[j] - x[i];
dy =3D y[j] - y[i];
dx -=3D box*rint(dx/box);
dy -=3D box*rint(dy/box);
r2 =3D dx*dx + dy*dy;
}
}
For n =3D 2500 particles, i.e. 3123750 particle pairs, the C loop is
app. 10 times faster than the Python/Numeric counterpart. This is of
course not satisfactory.
Are there any things I am doing completely wrong here, basic
approaches completely misunderstood, misuses etc?
Any suggestions, guidelines, hints are most welcome.
Best regards,
Mads Ipsen
+---------------------------------+-------------------------+
| Mads Ipsen | |
| Dept. of Chemistry | phone: +45-35320220 |
| H.C.=D8rsted Institute | fax: +45-35320322 |
| Universitetsparken 5 | |
| DK-2100 Copenhagen =D8, Denmark | mp...@os... |
+---------------------------------+-------------------------+
|
|
From: Sven S. <sve...@gm...> - 2006-02-21 13:42:10
|
Hi, sometimes I'm still struggling with peculiarities of numpy-arrays vs. numpy-matrices; my latest story goes like this: I first slice out a column of a 2d-numpy-array (a = somearray[:,1]). I can just manage to understand the resulting shape ( == (112,) ). Then I slice a column from a numpy-matrix b = somematrix[:,1] and get the expected (112,1) shape. Then I do what I thought was the easiest thing in the world, I subtract the two vectors: c = a - b I was very surprised by the bug that showed up due to the fact that c.shape == (112,112) !! First conclusion: broadcasting is nice and everything, but here I somehow think that it shouldn't be like this. I like numpy, but this is frustrating. Next, I try to workaround by b.squeeze(). That seems to work, but why is b.squeeze().shape == (1, 112) instead of (112,)? Then I thought maybe b.flattened() does the job, but then I get an error (matrix has no attr flattened). Again, I'm baffled. Could someone please explain? I already own the numpy-book, otherwise I wouldn't even have thought of using those methods, but here it hasn't enlightened me. Second (preliminary) conclusion: I will paranoically use even more asmatrix()-conversions in my code to avoid dealing with those array-beasts ;-) and get column vectors I can trust... Is there a better general advice than to say: "numpy-matrices and numpy-arrays are best kept in separated worlds" ? Thanks for any insights, Sven |
|
From: Stefan v. d. W. <st...@su...> - 2006-02-21 12:28:12
|
I wrote a short tutorial on using record arrays, which can be found at http://www.scipy.org/ArrayRecords The page is named ArrayRecords instead of RecordArrays, so I'd be glad if someone with priviledges could rename it. Also, please fix any mistakes I might have made. Regards St=E9fan |
|
From: Brian B. <bb...@br...> - 2006-02-21 12:26:01
|
Hello,
I am trying to translate some Matlab/mex code to Python, for doing neural
simulations. This application is definitely computing-time limited, and I need to
optimize at least one inner loop of the code, or perhaps even rethink the algorithm.
The procedure is very simple, after initializing any variables:
1) select a random input vector, which I will call "x". right now I have it as an
array, and I choose columns from that array randomly. in other cases, I may need to
take an image, select a patch, and then make that a column vector.
2) calculate an output value, which is the dot product of the "x" and a weight
vector, "w", so
y=dot(x,w)
3) modify the weight vector based on a matrix equation, like:
w=w+ eta * (y*x - y**2*w)
^
|
+---- learning rate constant
4) repeat steps 1-3 many times
I've organized it like:
for e in 100: # outer loop
for i in 1000: # inner loop
(steps 1-3)
display things.
so that the bulk of the computation is in the inner loop, and is amenable to
converting to a faster language. This is my issue:
straight python, in the example posted below for 250000 inner-loop steps, takes 20
seconds for each outer-loop step. I tried Pyrex, which should work very fast on such
a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex
file in matlab takes 1.5 seconds per outer-loop step.
Given the huge difference between the Pyrex and the Mex, I feel that there is
something I am doing wrong, because the C-code for both should run comparably.
Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind
coding some in C, but the Python API seemed a bit challenging to me.
One note: I am using the Numeric package, not numpy, only because I want to be able
to use the Enthought version for Windows. I develop on Linux, and haven't had a
chance to see if I can compile numpy using the Enthought Python for Windows.
If there is anything else anyone needs to know, I'll post it. I put the main script,
and a dohebb.pyx code below.
thanks!
Brian Blais
--
-----------------
bb...@br...
http://web.bryant.edu/~bblais
# Main script:
from dohebb import *
import pylab as p
from Numeric import *
from RandomArray import *
import time
x=random((100,1000)) # 1000 input vectors
numpats=x.shape[0]
w=random((numpats,1));
th=random((1,1))
params={}
params['eta']=0.001;
params['tau']=100.0;
old_mx=0;
for e in range(100):
rnd=randint(0,numpats,250000)
t1=time.time()
if 0: # straight python
for i in range(len(rnd)):
pat=rnd[i]
xx=reshape(x[:,pat],(1,-1))
y=matrixmultiply(xx,w)
w=w+params['eta']*(y*transpose(xx)-y**2*w);
th=th+(1.0/params['tau'])*(y**2-th);
else: # pyrex
dohebb(params,w,th,x,rnd)
print time.time()-t1
p.plot(w,'o-')
p.xlabel('weights')
p.show()
#=============================================
# dohebb.pyx
cdef extern from "Numeric/arrayobject.h":
struct PyArray_Descr:
int type_num, elsize
char type
ctypedef class Numeric.ArrayType [object PyArrayObject]:
cdef char *data
cdef int nd
cdef int *dimensions, *strides
cdef object base
cdef PyArray_Descr *descr
cdef int flags
def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd):
cdef int num_iterations
cdef int num_inputs
cdef int offset
cdef double *wp,*xp,*thp
cdef int *rndp
cdef double eta,tau
eta=params['eta'] # learning rate
tau=params['tau'] # used for variance estimate
cdef double y
num_iterations=rnd.dimensions[0]
num_inputs=w.dimensions[0]
# get the pointers
wp=<double *>w.data
xp=<double *>X.data
rndp=<int *>rnd.data
thp=<double *>th.data
for it from 0 <= it < num_iterations:
offset=rndp[it]*num_inputs
# calculate the output
y=0.0
for i from 0 <= i < num_inputs:
y=y+wp[i]*xp[i+offset]
# change in the weights
for i from 0 <= i < num_inputs:
wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i])
# estimate the variance
thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0])
|
|
From: Pearu P. <pe...@sc...> - 2006-02-21 08:54:27
|
Hi,
Question: what is the recommended way to compare two array objects? And
when they are contained in a tuple or list or dictinary or etc.?
I ask because I found that arr1.__eq__(arr2) can return either bool or an
array of bools when shape(arr1)!=shape(arr2) or shape(arr1)==shape(arr2),
respectively:
>>> array([1,2])==array([1,0,0])
False
>>> array([1,2])==array([1,0])
array([True, False], dtype=bool)
I wonder if numpy users are happy with that? Shouldn't arr1==arr2 return
bool as well because current __eq__ behaviour is handled by equal()
function when the shapes are equal?
Note that if __eq__ would always return bool then the following codes
would work as I would expect:
>>> (1,array([1,2]))==(1,array([1,2]))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()
>>> # I would expect True, compare with
>>> (1,[1,2])==(1,[1,2])
True
>>> (1,array([1,2]))==(1,array([1,2,0]))
False
I started to write this message because
object1 == object2
returns boolean for (all?) Python builtin objects but as soon
as these objects contain arrays, the test will fail with an exception.
May be numpy needs equalobjs(obj1,obj2) that always returns boolean
and can handle comparing objects like {1:array([1,2])}, [3,[array([2,2])],
etc.
Pearu
|
|
From: Travis O. <oli...@ie...> - 2006-02-21 03:44:35
|
Tim Hochberg wrote: >> Hmm, ufuncs are passed a void* argument for passing info to them. Now, >> what that argument is defined when the ufunc is created, but maybe >> there's a way to piggy-back on it. >> >> > Yeah, I really felt like I was fighting the ufuncs when I was playing > with this. On the one hand, you really want to use the ufunc > machinery. On the other hand that forces you into using the same types > for both arguments. This is not true. Ufuncs can have different types for their arguments. Perhaps you meant something else? > Just to add a little more confusion to the mix. I did a little testing > to see how close pow(x,n) and x*x*... actually are. They are slightly > less close for small values of N and slightly closer for large values > of N than I would have expected. The upshot of this is that integer > powers between -2 and +4 all seem to vary by the same amount when > computed using pow(x,n) versus multiplies. I'm including the test code > at the end. Assuming that this result is not a fluke that expands the > noncontroversial set by at least 3 more values. That's starting to > strain the ufunc aproach, so perhaps optimizing in @TYP@_power is the > way to go after all. Or, more likely, adding @TYP@_int_power or maybe > @TYP@_fast_power (so as to be able to include some half integer > powers) and dispatching appropriately from array_power. > > The problem here, of course, is the overhead that PyArray_EnsureArray > runs into. I'm not sure if the ufuncs actually call that, but I was > using that to convert things to arrays at one point and I saw the > slowdown, so I suspect that the slowdown is in something > PyArray_EnsureArray calls if not in that routine itself. I'm afraid to > dig into that stuff though.. On the other hand, it would probably > speed up all kinds of stuff if that was sped up. EnsureArray simply has some short cuts and then calls PyArray_FromAny. PyArray_FromAny is the big array conversion code. It converts anything (it can) to an array. >> Too bad we couldn't make a function generator :-) [Well, we could using >> weave...]\ >> >> > Yaigh! That's actually an interesting approach that could use some attention.. -Travis |