From: Francesc A. <fa...@ca...> - 2006-01-16 12:43:23
|
Hi Travis,
On Sat, 14 Jan 2006 at 20:45 -0700, Travis Oliphant wrote:
> 1) Replacing attributes that are now gone:
>
> .dtypechar --> .dtype.char
> .dtypestr --> .dtype.str
> .dtypedescr --> .dtype
>
> 2) Changing old .dtype -> .dtype.type
>
> This is only necessary if you were using a.dtype as a *typeobject*
> as in
> issubclass(a.dtype, <some scalar type>)
>
> If you were using .dtype as a parameter to dtype= then that usage
> will still work
> great (in fact a little faster) because now .dtype returns a
> "descriptor object"
>
> 3) The dtypedescr constructor is now called dtype.
>
> This change should have gone into the 0.9.2 release, but things got too
> hectic with all the name changes. I will quickly release 0.9.4 with
> these changes unless I hear strong disagreements within the next few days.
Glad that you liked the suggestion :-).
BTW, if you are going to release 0.9.4 soon, do not forget to correct
this typo:
--- python.nobackup/numpy/trunk/numpy/core/numerictypes.py 2006-01-16 11:07:13.000000000 +0100
+++ /usr/lib/python2.4/site-packages/numpy/core/numerictypes.py 2006-01-16 11:59:28.000000000 +0100
@@ -167,7 +167,7 @@
typeDict[tmpstr] = typeobj
na_name = tmpstr
elif base == 'complex':
- na_num = '%s%d' % (base.capitalize(), bit/2)
+ na_name = '%s%d' % (base.capitalize(), bit/2)
elif base == 'bool':
na_name = base.capitalize()
typeDict[na_name] = typeobj
Thanks,
-- 
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
"-"
|
|
From: Francesc A. <fa...@ca...> - 2006-01-16 11:37:28
|
Hi,
I've downloaded a fresh SVN version of numpy (0.9.4.1914) and was a bit
surprised by this:
>>> numpy.reshape(numpy.array((22,)), (1,)*20)
array([[[[[[[[[[[[[[[[[[[[22]]]]]]]]]]]]]]]]]]]])
>>> numpy.reshape(numpy.array((22,)), (1,)*21)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.3/site-packages/numpy/core/oldnumeric.py", line 164, in reshape
return asarray(a).reshape(newshape)
ValueError: sequence too large; must be smaller than 20
Before, I think numpy supported up to 32 dimensions. Is there any reason
for this new limit? Just curious.
Cheers,
-- 
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
"-"
|
|
From: Arnd B. <arn...@we...> - 2006-01-16 07:24:04
|
On Mon, 16 Jan 2006, David M. Cooke wrote:

> On Jan 15, 2006, at 14:23 , Zachary Pincus wrote:
>
> >>> Or is this a policy change which puts the headers in the
> >>> site-packages directory?
> >>
> >> This is not an error. Use numpy.get_numpy_include() to retrieve the
> >> directory of numpy header files. See numpy.get_numpy_include.__doc__
> >> for more information.
> >
> > Thanks for the information. Unfortunately, the package I'm working
> > on can't be built with distutils,

It seems that distutils is not loved that much
(e.g. Wax no longer uses distutils
http://zephyrfalcon.org/weblog2/arch_e10_00870.html#e878 )

> > so it may wind up being something
> > of a contortion to call numpy.get_numpy_include() from the build
> > system I need to use.
>
> You don't say what type of build system it is. If it's autoconf or
> Makefile based, you could do something like
>
> NUMPY_HEADERS=$(python -c 'import numpy; print numpy.get_numpy_include()')
>
> > Understanding that the optimal, preferred method of finding the
> > include dir will always be numpy.get_numpy_include(), if I must
> > resort to a suboptimal method of guessing likely locations, what
> > will those locations be?
> >
> > Are they generally within the site-packages directory? Sometimes in
> > the python include directory? Will they be ever-changing?
>
> For now, they're in the site-packages directory, under
> numpy/core/include. That position might change (it used to be
> numpy/base/include, for instance), if we rename modules again. They
> probably won't migrate to the python include directory for the reason
> they're not in there now (non-root users can't install there).

If I do a `python setup.py install --prefix=$HOME/my_numpy`,
this leads to

  ~/my_numpy/bin/f2py
  ~/my_numpy/lib/python2.3/site-packages/numpy/*

Wouldn't then a

  ~/my_numpy/include/numpy/*

be reasonable?
Presumably I am overlooking something, and I know that this has
been discussed in much detail before, but still ...

Best, Arnd
|
|
From: David M. C. <co...@ph...> - 2006-01-16 07:07:04
|
On Jan 15, 2006, at 14:23 , Zachary Pincus wrote:

>>> Or is this a policy change which puts the headers in the
>>> site-packages directory?
>>
>> This is not an error. Use numpy.get_numpy_include() to retrieve the
>> directory of numpy header files. See numpy.get_numpy_include.__doc__
>> for more information.
>
> Thanks for the information. Unfortunately, the package I'm working
> on can't be built with distutils, so it may wind up being something
> of a contortion to call numpy.get_numpy_include() from the build
> system I need to use.

You don't say what type of build system it is. If it's autoconf or
Makefile based, you could do something like

NUMPY_HEADERS=$(python -c 'import numpy; print numpy.get_numpy_include()')

> Understanding that the optimal, preferred method of finding the
> include dir will always be numpy.get_numpy_include(), if I must
> resort to a suboptimal method of guessing likely locations, what
> will those locations be?
>
> Are they generally within the site-packages directory? Sometimes in
> the python include directory? Will they be ever-changing?

For now, they're in the site-packages directory, under
numpy/core/include. That position might change (it used to be
numpy/base/include, for instance), if we rename modules again. They
probably won't migrate to the python include directory for the reason
they're not in there now (non-root users can't install there).

-- 
|>|\/|<
/------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...
|
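For readers on a current NumPy, the header lookup recommended in this thread can be sketched as below. This is only an illustration under one assumption: it uses `numpy.get_include()`, which is the present-day name of the function (the thread's `numpy.get_numpy_include()` is the 2006 spelling). The script prints the directory so that a build system that does not use distutils can capture it.

```python
import os
import numpy

# numpy.get_include() is the current spelling; the thread uses the
# 2006-era name numpy.get_numpy_include().
include_dir = numpy.get_include()

# The C headers sit in a "numpy" subdirectory of that path, so code
# that does `#include <numpy/arrayobject.h>` needs -I<include_dir>.
header = os.path.join(include_dir, "numpy", "arrayobject.h")

print(include_dir)
print("arrayobject.h present:", os.path.exists(header))
```

Asking the installed package for its include directory, rather than guessing likely locations, keeps a build working if the headers move between releases, which is the concern raised in the message above.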
|
From: Sasha <nd...@ma...> - 2006-01-16 05:39:52
|
It looks like NumPy and Numeric handle invalid integer operations differently:

>>> from numpy import *
>>> import Numeric as n
>>> array(2)**40
2147483647
>>> array(2)/0
0

In Numeric both fail:

>>> n.array(2)**40
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ArithmeticError: Integer overflow in power.
>>> n.array(2)/0
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ZeroDivisionError: divide by zero

I don't know if the change was intentional, but if it was, I think it
should be advertised.

-- sasha
|
|
From: Zachary P. <zp...@st...> - 2006-01-15 19:24:00
|
>> Or is this a policy change which puts the headers in the
>> site-packages directory?
>
> This is not an error. Use numpy.get_numpy_include() to retrieve the
> directory of numpy header files. See numpy.get_numpy_include.__doc__
> for more information.

Thanks for the information. Unfortunately, the package I'm working on
can't be built with distutils, so it may wind up being something of a
contortion to call numpy.get_numpy_include() from the build system I
need to use.

Understanding that the optimal, preferred method of finding the include
dir will always be numpy.get_numpy_include(), if I must resort to a
suboptimal method of guessing likely locations, what will those
locations be?

Are they generally within the site-packages directory? Sometimes in the
python include directory? Will they be ever-changing?

I will endeavor to use the more proper methods in my system, but this
information would be useful as a fallback.

Thanks,
Zach
|
|
From: Alan G I. <ai...@am...> - 2006-01-15 13:13:19
|
numpy's use of `round_` rather than `round` feels like a blemish.
Was this done just to protect the built-in?
If so, is that a good enough reason?

fwiw,
Alan Isaac
|
|
From: <pe...@ce...> - 2006-01-15 12:23:19
|
On Sun, 15 Jan 2006, Zachary Pincus wrote:

> One question: where should I expect the arrayobject.h file to be
> installed?

numpy/arrayobject.h is installed to numpy.get_numpy_include().

> My old installation of Numeric placed the arrayobject.h (etc.) header
> files here:
> /Library/Frameworks/Python.framework/Versions/2.4/include/python2.4/Numeric/
>
> I'm on OS X, using the standard 'framework' build of python, which
> accounts for the particular install path. However, this is the proper
> include path for python-related C headers.
>
> When I installed numpy, arrayobject.h was placed here, in the
> site-packages directory:
> /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/core/include/numpy/arrayobject.h
>
> My question: is this an error (possibly specific to OS X) and the
> numpy headers should have been placed with the other C headers? Or is
> this a policy change which puts the headers in the site-packages
> directory?

This is not an error. Use numpy.get_numpy_include() to retrieve the
directory of numpy header files. See numpy.get_numpy_include.__doc__
for more information.

HTH,
Pearu
|
|
From: Zachary P. <zp...@st...> - 2006-01-15 08:23:07
|
Hello folks,

I'm quite excited about the new release of numpy -- thanks for all the
work.

One question: where should I expect the arrayobject.h file to be
installed?

My old installation of Numeric placed the arrayobject.h (etc.) header
files here:
/Library/Frameworks/Python.framework/Versions/2.4/include/python2.4/Numeric/

I'm on OS X, using the standard 'framework' build of python, which
accounts for the particular install path. However, this is the proper
include path for python-related C headers.

When I installed numpy, arrayobject.h was placed here, in the
site-packages directory:
/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/core/include/numpy/arrayobject.h

My question: is this an error (possibly specific to OS X) and the numpy
headers should have been placed with the other C headers? Or is this a
policy change which puts the headers in the site-packages directory?

I'm writing some code which I hope to be compatible with Numeric,
numarray, and numpy; being able to reliably find arrayobject.h is pretty
important for this. Thus, it would be good to know where, in general,
this file will be installed by numpy.

Thanks,

Zach Pincus

Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine
|
|
From: Travis O. <oli...@ie...> - 2006-01-15 05:58:01
|
Travis Oliphant wrote:
> Paulo J. S. Silva wrote:
>
>> Numpy:
>>
>> In [27]:i = time.clock(); bench(A,b); time.clock() - i
>> Out[27]:10.610000000000014
>>
>>
>> Why is numpy so slow??????
>>
>>
>>
>
> I think the problem here is that using the properties here to take
> advantage of the nice matrix math stuff is slower than just computing
> the dot product in the fastest possible way with raw arrays. I've
> been concerned about this for awhile. The benchmark below makes my
> point.
>
> While a matrix is a nice thing, I think it will always be slower....
> It might be possible to speed it up and I'm open to suggestions...
Indeed it was possible to speed things up as Paulo pointed out, by using
the correct blas calls for the real size of the array.
With some relatively simple modifications, I was able to get significant
speed-ups in time using matrices:
import timeit
t1 = timeit.Timer('c = b.T*A; d=c*b','from numpy import rand,mat; A = mat(rand(1000,1000));b = mat(rand(1000,1))')
t2 = timeit.Timer('c = dot(b,A); d=dot(b,c)','from numpy import rand, dot; A = rand(1000,1000);b = rand(1000)')
>>> t1.timeit(100)
1.4369449615478516
>>> t2.timeit(100)
1.2983191013336182
Now, that looks more like a 10% overhead for the matrix class.
-Travis
|
|
From: Travis O. <oli...@ie...> - 2006-01-15 03:46:40
|
There was some cruft left over from the change to making data-type
descriptors real Python objects. This left lots of .dtype related
attributes on the array object --- too many as Francesc Altet graciously
pointed out.
In the latest SVN, I've cleaned things up (thanks to a nice patch from
Francesc to get it started). Basically, there is now only one
attribute on the array object dealing with the data-type (arr.dtype).
This attribute returns the data-type descriptor object for the array.
This object itself has the attributes .char, .str, and .type (among
others).
I think this will lead to less confusion long term. The cruft was due
to the fact that my understanding of the data-type descriptor came in
December while I was seriously looking at the records module.
This will have some backward-compatibility issues (we are still pre-1.0
and early enough that I hope this is not too difficult to deal with).
The compatibility issues with numpy-0.9.2 that I can see are:
1) Replacing attributes that are now gone:
.dtypechar --> .dtype.char
.dtypestr --> .dtype.str
.dtypedescr --> .dtype
2) Changing old .dtype -> .dtype.type
This is only necessary if you were using a.dtype as a *typeobject*,
as in
issubclass(a.dtype, <some scalar type>)
If you were using .dtype as a parameter to dtype= then that usage
will still work great (in fact a little faster) because now .dtype
returns a "descriptor object".
3) The dtypedescr constructor is now called dtype.
This change should have gone into the 0.9.2 release, but things got too
hectic with all the name changes. I will quickly release 0.9.4 with
these changes unless I hear strong disagreements within the next few days.
-Travis
P.S.
SciPy SVN has been updated and fixed with the changes.
Numeric compatibility now implies that .typecode() --> .dtype.char
although if .typecode() was used as an argument to a function, then
.dtype will very likely work.
-Travis
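
The renames listed in this message can be illustrated with a short sketch. It assumes a current NumPy imported as `np`, where the new-style spellings are the ones that survive; the old attribute names appear only in comments, and the variable names are made up for the example.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])      # float64 by default

# Gone attribute          -> replacement named in the message
type_char = a.dtype.char           # was a.dtypechar
type_str = a.dtype.str             # was a.dtypestr
descr = a.dtype                    # was a.dtypedescr

# Code that used the old a.dtype as a type object now needs .dtype.type
is_float = issubclass(a.dtype.type, np.floating)

# Passing the descriptor as the dtype= argument keeps working
b = np.zeros(3, dtype=a.dtype)

print(type_char, type_str, descr, is_float, b.dtype == a.dtype)
```

Only code that relied on the old `.dtype` being a scalar type object needs the extra `.type`; code that passed it on as a `dtype=` argument is unaffected.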
|
|
From: Paulo J. S. S. <pjs...@im...> - 2006-01-14 13:51:40
|
On Fri, 2006-01-13 at 21:40 -0700, Travis Oliphant wrote:
> Paulo J. S. Silva wrote:
>
> >Numpy:
> >
> >In [27]:i = time.clock(); bench(A,b); time.clock() - i
> >Out[27]:10.610000000000014
> >
> >
> >Why is numpy so slow??????
> >
> >
> >
>
> I think the problem here is that using the properties here to take
> advantage of the nice matrix math stuff is slower than just computing
> the dot product in the fastest possible way with raw arrays. I've been
> concerned about this for awhile. The benchmark below makes my point.
>
> While a matrix is a nice thing, I think it will always be slower.... It
> might be possible to speed it up and I'm open to suggestions...
>
> To see what I mean, try this....
>
Travis,
I think I got the gotcha!
The problem is in the dot function. I have only skimmed the
_dotblas code, but I assume it tries to find the right blas function based
on the ranks of the arguments. If both arguments are matrices then both are
rank two and the matrix-multiply blas function is called. This is "very"
suboptimal if some of the matrix objects actually represent a vector
or a scalar.
The same problem would appear with pure arrays if one insists on using a
column vector:
In [3]:t1 = timeit.Timer('c = dot(A,b)','from numpy import rand, dot; A = rand(1000,1000);b = rand(1000)')
In [4]:t2 = timeit.Timer('c = A*b','from numpy import rand, dot, mat; A = mat(rand(1000,1000));b = mat(rand(1000,1))')
In [5]:t3 = timeit.Timer('c = dot(A,b)','from numpy import rand, dot, transpose; A = rand(1000,1000);b = rand(1000,1)')
In [6]:t1.timeit(100)
Out[6]:0.69448995590209961
In [7]:t2.timeit(100)
Out[7]:1.1080958843231201
You see? The third test with pure arrays is as slow as the matrix based
test.
The problem is even more dramatic if you are trying to compute an inner
product:
In [13]:t1 = timeit.Timer('c = dot(b,b)','from numpy import rand, dot; b = rand(1000)')
In [14]:t2 = timeit.Timer('c = b.T*b','from numpy import rand, mat; b = mat(rand(1000,1))')
In [15]:t3 = timeit.Timer('c = dot(transpose(b),b)','from numpy import rand, dot, transpose; b = rand(1000,1)')
In [16]:t1.timeit(10000)
Out[16]:0.053219079971313477
In [17]:t2.timeit(10000)
Out[17]:0.65550899505615234
In [18]:t3.timeit(10000)
Out[18]:0.62446498870849609
Note that this is a very serious drawback for matrix objects, as the
most usual operation in numerical linear algebra algorithms,
matrix-vector multiplication, is always suboptimal.
The solution may be a "smarter" analysis in _dotblas.c that not only
looks at the rank but also "collapses" dimensions with only one
element to decide which blas function to call. Note that this approach
may be used to solve the problem with 1x1 matrices if properly coded.
Some special care has to be taken to identify outer products too.
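
The "collapse dimensions of length one" idea can be sketched in pure Python. This is a hypothetical illustration, not what `_dotblas.c` does: the function name `choose_blas_kind` is invented for the example, and the returned labels are the generic BLAS routine families (`scal`, `dot`, `ger`, `gemv`, `gemm`) rather than calls into any library.

```python
def choose_blas_kind(shape_a, shape_b):
    """Hypothetical helper: pick a BLAS routine family for a 2-D product a*b.

    Shapes are (rows, cols) tuples. Returns one of "scal", "dot", "ger",
    "gemv" or "gemm".
    """
    (m, k), (k2, n) = shape_a, shape_b
    if k != k2:
        raise ValueError("objects are not aligned")

    # Outer product: column (m, 1) times row (1, n). This has to be
    # recognised before collapsing, or it would look like two vectors.
    if k == 1 and m > 1 and n > 1:
        return "ger"

    # Collapse axes of length one to get the "real" rank of each operand.
    rank_a = sum(1 for size in shape_a if size != 1)
    rank_b = sum(1 for size in shape_b if size != 1)

    if rank_a == 0 or rank_b == 0:
        return "scal"   # a 1x1 operand only scales the other one
    if rank_a == 1 and rank_b == 1:
        return "dot"    # row times column: inner product
    if rank_a == 2 and rank_b == 2:
        return "gemm"   # genuine matrix-matrix product
    return "gemv"       # matrix times vector, or vector times matrix


print(choose_blas_kind((1000, 1000), (1000, 1)))  # gemv
print(choose_blas_kind((1, 1000), (1000, 1)))     # dot
print(choose_blas_kind((1000, 1), (1, 1000)))     # ger
print(choose_blas_kind((1, 1), (1, 1000)))        # scal
```

The outer-product test comes first because a column times a row collapses to two rank-1 operands, which would otherwise be mistaken for an inner product.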
I may try to look at this myself if you want. But certainly only in
three weeks. I am preparing for an exam to confirm (make permanent) my
position at the university that will take place in the last days of
January.
Best,
Paulo
|
|
From: Travis O. <oli...@ie...> - 2006-01-14 04:40:28
|
Paulo J. S. Silva wrote:
>Numpy:
>
>In [27]:i = time.clock(); bench(A,b); time.clock() - i
>Out[27]:10.610000000000014
>
>
>Why is numpy so slow??????
>
>
>
I think the problem here is that using the properties here to take
advantage of the nice matrix math stuff is slower than just computing
the dot product in the fastest possible way with raw arrays. I've been
concerned about this for awhile. The benchmark below makes my point.
While a matrix is a nice thing, I think it will always be slower.... It
might be possible to speed it up and I'm open to suggestions...
To see what I mean, try this....
import timeit
t1 = timeit.Timer('c = b.T*A; d=c*b','from numpy import rand,mat; A = mat(rand(1000,1000));b = mat(rand(1000,1))')
t2 = timeit.Timer('c = dot(b,A); d=dot(b,c)','from numpy import rand, dot; A = rand(1000,1000);b = rand(1000)')
>>> t1.timeit(100)
6.0398328304290771
>>> t2.timeit(100)
1.2430641651153564
So, using raw arrays and dot product is 5x faster in this case.....
-Travis
|
|
From: Andrew S. <str...@as...> - 2006-01-14 03:16:56
|
Python eggs also support runtime version selection. It's possible we may
need to backport egg support to old scipy, but it already works great with
new numpy/scipy. I'd suggest this over re-implementing another wheel.
> Travis Oliphant wrote:
>
>> In order to install new and old scipy you will definitely need to
>> install one of them to another location besides site-packages (probably
>> new scipy).
>> Then you will need to make sure your sys.path is set up properly to find
>> the scipy you are interested in using for that session.
>
> It sounds like SciPy could use a versioning scheme much like wxPython's:
>
> import wxversion
> wxversion.select("2.6")
> import wx
>
> See:
>
> http://wiki.wxpython.org/index.cgi/MultiVersionInstalls
>
>
> In fact, you could probably just grab the wxPython code and tweak it a
> little for SciPy.
>
> This was debated a lot on the wxPython lists before being implemented.
> After all you could "Just" write a few start-up scripts that manipulate
> PYTHONPATH, or re-name some directories, or put in a few sym links,
> or, or, or... Also, ideally wxPython major versions are compatible, etc,
> etc.
>
> However, when all was said and Dunn, this is a very nice system that
> works the same way on all platforms, and doesn't get in the way of
> anything. New versions get installed as the default, so if you never use
> wxversion, you never know it's there. If you do, then you can test new
> versions without having to break any old, running utilities, etc.
>
> Also, the infrastructure is in place for future major version changes,
> etc.
>
> -Chris
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> NOAA/OR&R/HAZMAT (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> Chr...@no...
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc. Do you grep through log
> files
> for problems? Stop! Download the new AJAX search engine that makes
> searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
> http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
> _______________________________________________
> Numpy-discussion mailing list
> Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>
|
|
From: Perry G. <pe...@st...> - 2006-01-13 19:38:01
|
On Jan 13, 2006, at 2:07 PM, Russel Howe wrote:

> In the session below, I expected the for loop and the index array to
> have the same behavior. Is this behavior by design? Is there some
> other way to get the behavior of the for loop? The loop is too slow
> for my application ( len(ar1) == 18000).
> Russel

This sort of usage of index arrays is always going to be a bit
confusing, and this is a common example of that. Any time you use
repeated indices for index assignment, you are not going to get what
you would naively expect. It's useful to think of what is going on in a
little more detail. Your use of index arrays results in the elements
you selected generating a 10 element array, which is added to the
random elements. Initially it is a 10 element array with all zero
elements, and after the addition it equals the random array elements.
Then the index assignment takes place. First, the first element of the
summed array is assigned to index 0, then the second element of the
summed array is also assigned to index 0, and that is the problem. The
summing is done before the assignment. Generally the last index of a
repeated set is what is assigned as the final value.

It is possible to do what you want without a for loop, but perhaps not
as fast as it would be in C. One way to do it is to sort the indices in
increasing order, generate the corresponding selected value array and
then use accumulated sums to derive the sums corresponding to each
index. It's a bit complicated, but can be much faster than a for loop.
See example 3.7.4 to see the details of how this is done in our
tutorial: http://www.scipy.org/wikis/topical_software/Tutorial

Maybe someone has a more elegant, faster or clever way to do this that
I've overlooked. I've seen this come up enough that it may be useful to
provide a special function to make this easier to do.

Perry Greenfield

> Python 2.4.2 (#1, Nov 29 2005, 08:43:33)
> [GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from numarray import *
> >>> import numarray.random_array as ra
> >>> print libnumarray.__version__
> 1.5.0
> >>> ar1=ra.random(10)
> >>> ar2=zeros(5, type=Float32)
> >>> ind=array([0,0,1,1,2,2,3,3,4,4])
> >>> ar2[ind]+=ar1
> >>> ar2
> array([ 0.09791247, 0.26159889, 0.89386773, 0.32572687,
> 0.86001897], type=Float32)
> >>> ar1
> array([ 0.49895534, 0.09791247, 0.424059 , 0.26159889, 0.29791802,
> 0.89386773, 0.44290054, 0.32572687, 0.53337622,
> 0.86001897])
> >>> ar2*=0.0
> >>> for x in xrange(len(ind)):
> ... ar2[ind[x]]+=ar1[x]
> ...
> >>> ar2
> array([ 0.5968678 , 0.68565786, 1.19178581, 0.76862741,
> 1.39339519], type=Float32)
> >>>
|
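The behaviour described in this reply, and the sort-then-accumulate workaround, can be sketched with a current NumPy. This is an illustration under assumptions rather than the tutorial's example 3.7.4: the thread uses numarray, whereas `np.add.at` and `np.bincount` below are NumPy functions that are not mentioned in the thread, and the small `values` and `ind` arrays are made up for the example.

```python
import numpy as np

values = np.array([0.5, 0.25, 1.0, 2.0, 3.0, 4.0])
ind = np.array([0, 0, 1, 1, 2, 2])

# Index-array "+=": the sum is formed first and then assigned, so each
# repeated slot keeps a single contribution instead of the total.
buffered = np.zeros(3)
buffered[ind] += values

# Unbuffered accumulation: every repeated index contributes.
summed = np.zeros(3)
np.add.at(summed, ind, values)

# The same totals via weighted bin counting.
binned = np.bincount(ind, weights=values, minlength=3)

# The approach described in the reply: sort by index, take running
# sums, and difference them at the end of each run of equal indices.
order = np.argsort(ind, kind="stable")
sorted_ind = ind[order]
running = np.cumsum(values[order])
run_end = np.flatnonzero(np.r_[sorted_ind[1:] != sorted_ind[:-1], True])
by_cumsum = np.zeros(3)
by_cumsum[sorted_ind[run_end]] = np.diff(np.r_[0.0, running[run_end]])

print(buffered, summed, binned, by_cumsum)
```

The last three results agree with what the explicit for loop would produce; the first does not, which is the surprise reported in the quoted session.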
|
From: Paulo J. S. S. <pjs...@im...> - 2006-01-13 19:17:03
|
> I would be interested to see how a raw-array solution compares.
> There
> is going to be some overhead of using a subclass, because of the
> attribute lookups that occur on all array creations and because the
> subclass is written in Python that could be on the order of 10%.
>
> Quantifying the subclass slow-down would be useful... Also, which
> BLAS
> are you using?
I am using matlab 6.5.0 release 13 on an Athlon tbird 1.2Ghz.
Blas with numpy is ATLAS optimized for 3dnow.
I'm getting very interesting results. First the time for a QR
factorization of a 1000x1000 random matrix:
Matlab:
>> t = cputime; R = houseqr(A); cputime - t
ans =
67.1000
Numpy:
In [9]:i = time.clock(); R = houseqr.houseqr(A); time.clock() - i
Out[9]:79.009999999999991
If I code numpy naively, making an extra matrix slice, the time rises to:
In [14]:i = time.clock(); R = houseqr.houseqr(A); time.clock() - i
Out[14]:114.34999999999999
(See the code below).
I then decided to compare the blas, and here things get funny:
Matrix multiplication:
Matlab:
>> t = cputime; A*A; cputime - t
ans =
2.0600
Numpy:
In [32]:i = time.clock(); C=A*A; time.clock() - i
Out[32]:1.3600000000000136
It looks like numpy (ATLAS) is much better....
However, in the QR code the most important part is a matrix times vector
multiplication and an outer product. If I benchmark these operations I
get:
Matlab:
>> t = cputime; bench(A,b); cputime - t
ans =
5.7500
Numpy:
In [27]:i = time.clock(); bench(A,b); time.clock() - i
Out[27]:10.610000000000014
Why is numpy so slow??????
Paulo
code
--- houseqr.py ---
from numpy import *

def norm(x):
    return sqrt(x.T*x).item()

# "sinal" is Portuguese for "sign".
def sinal(x):
    if x < 0.0: return -1.0
    else: return 1.0

def houseqr(A):
    m, n = A.shape
    R = A.copy()
    bigE1 = matrix(zeros((m, 1), float))
    bigE1[0] = 1.0
    for j in range(n):
        x = R[j:,j]
        v = sinal(x[0])*norm(x)*bigE1[:m-j] + x
        v = v / norm(v)
        # Slower version.
        #R[j:,j:] -= 2.0*v*(v.T*R[j:,j:])
        # Faster version: avoids the extra slicing.
        upRight = R[j:,j:]
        upRight -= 2.0*v*(v.T*upRight)
    return R
--- houseqr.m ---
function [A] = houseqr(A)
[m, n] = size(A);
bigE1 = zeros(m,1);
bigE1(1) = 1.0;
for j = 1:n,
    x = A(j:m,j);
    v = sinal(x(1))*norm(x)*bigE1(1:m-j+1) + x;
    v = v / (norm(v));
    upRight = A(j:m,j:n);
    A(j:m,j:n) = upRight - 2*v*(v'*upRight);
end
--- bench.py ---
from numpy import *

def bench(A, b):
    for i in range(100):
        c = b.T*A
        d = b*c
--- bench.m ---
function bench(A, b)
for i=1:100,
    c = b'*A;
    d = b*c;
end
|
|
From: Christopher B. <Chr...@no...> - 2006-01-13 19:14:27
|
Travis Oliphant wrote:
> In order to install new and old scipy you will definitely need to
> install one of them to another location besides site-packages (probably
> new scipy).
> Then you will need to make sure your sys.path is set up properly to find
> the scipy you are interested in using for that session.
It sounds like SciPy could use a versioning scheme much like wxPython's:
import wxversion
wxversion.select("2.6")
import wx
See:
http://wiki.wxpython.org/index.cgi/MultiVersionInstalls
In fact, you could probably just grab the wxPython code and tweak it a
little for SciPy.
This was debated a lot on the wxPython lists before being implemented.
After all you could "Just" write a few start-up scripts that manipulate
PYTHONPATH, or re-name some directories, or put in a few sym links,
or, or, or... Also, ideally wxPython major versions are compatible, etc,
etc.
However, when all was said and Dunn, this is a very nice system that
works the same way on all platforms, and doesn't get in the way of
anything. New versions get installed as the default, so if you never use
wxversion, you never know it's there. If you do, then you can test new
versions without having to break any old, running utilities, etc.
Also, the infrastructure is in place for future major version changes, etc.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chr...@no...
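
The general mechanism behind this kind of version selection, putting one versioned install directory at the front of `sys.path` before the first import, can be sketched as follows. This is a hypothetical illustration and not the wxversion implementation: `select_version`, the `<package>-<version>` directory layout and the throwaway `demo_pkg` package are all invented for the example.

```python
import importlib
import os
import sys
import tempfile


def select_version(package, version, base_dir):
    """Hypothetical helper: put <base_dir>/<package>-<version> first on sys.path."""
    if package in sys.modules:
        raise RuntimeError("%s is already imported" % package)
    path = os.path.join(base_dir, "%s-%s" % (package, version))
    if not os.path.isdir(path):
        raise ValueError("no such install: %s" % path)
    sys.path.insert(0, path)
    return path


# Build two fake installs of a throwaway package to select between.
base = tempfile.mkdtemp()
for ver in ("0.1", "0.2"):
    pkg_dir = os.path.join(base, "demo_pkg-%s" % ver, "demo_pkg")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
        fh.write("__version__ = %r\n" % ver)

# The selection has to happen before the package is first imported.
select_version("demo_pkg", "0.1", base)
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")
print(demo_pkg.__version__)
```

Refusing to switch once the package is in `sys.modules` reflects the constraint that only one version can be active per session.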
|
|
From: Alan I. <ai...@am...> - 2006-01-13 19:08:28
|
On Fri, 13 Jan 2006, Travis Oliphant wrote:

> 3) Return scalars instead of 1x1 matrices inside of __array_finalize__
> (where the magic of ensuring matrices are rank-2 arrays is actually done).

Just for multiplication, or also for addition etc?

Alan Isaac
|
|
From: Russel H. <ru...@ap...> - 2006-01-13 19:07:41
|
In the session below, I expected the for loop and the index array to
have the same behavior. Is this behavior by design? Is there some
other way to get the behavior of the for loop? The loop is too slow
for my application ( len(ar1) == 18000).
Russel
Python 2.4.2 (#1, Nov 29 2005, 08:43:33)
[GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from numarray import *
>>> import numarray.random_array as ra
>>> print libnumarray.__version__
1.5.0
>>> ar1=ra.random(10)
>>> ar2=zeros(5, type=Float32)
>>> ind=array([0,0,1,1,2,2,3,3,4,4])
>>> ar2[ind]+=ar1
>>> ar2
array([ 0.09791247, 0.26159889, 0.89386773, 0.32572687,
0.86001897], type=Float32)
>>> ar1
array([ 0.49895534, 0.09791247, 0.424059 , 0.26159889, 0.29791802,
0.89386773, 0.44290054, 0.32572687, 0.53337622,
0.86001897])
>>> ar2*=0.0
>>> for x in xrange(len(ind)):
... ar2[ind[x]]+=ar1[x]
...
>>> ar2
array([ 0.5968678 , 0.68565786, 1.19178581, 0.76862741,
1.39339519], type=Float32)
>>>
|
|
From: Alan I. <ai...@am...> - 2006-01-13 19:07:16
|
On Fri, 13 Jan 2006, "Paulo J. S. Silva" wrote:
> as dot already makes a scalar out the multiplication of
> two rank-1 arrays (in which case it computes the inner
> product), I thought that this behavior could be extended
> to matrix objects.
Well, 'dot' for matrices is just a synonym for
'matrixmultiply', which it cannot be (in the same sense) for
arrays. But I grant that I find it odd to enforce
conformability for multiplication and not for addition.
(See below.) I will also grant that GAUSS and Matlab
behave as you wish, which might reflect a natural
convenience and might reflect their impoverished types.
Finally, I grant that I have not been able to quickly think
up a use case where I want a 1x1 matrix for anything except
error checking.
Just to be clear, you do not want to get rid of 1x1
matrices, you just want to get rid of them as the result of
*multiplication*, right? So [[1]]*[[2]]=2 but
[[1]]+[[2]]=[[3]]. Right?
And you would, I presume, find a 'scalar' function to
be too clumsy.
Cheers,
Alan Isaac
PS Conformability details:
>>> t
matrix([[2]])
>>> z
matrix([[0, 0, 0],
[0, 1, 2],
[0, 2, 4]])
>>> t+z
matrix([[2, 2, 2],
[2, 3, 4],
[2, 4, 6]])
>>> t*z
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "C:\Python24\Lib\site-packages\numpy\core\defmatrix.py", line 128, in __mul__
return N.dot(self, other)
ValueError: objects are not aligned
>>> t.A*z.A
array([[0, 0, 0],
[0, 2, 4],
[0, 4, 8]])
>>>
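
The same asymmetry can be sketched with plain arrays in current NumPy. This is only an illustration, using `np.array` in place of the matrix class, so `*` here is element-wise and `np.dot` stands in for matrix multiplication.

```python
import numpy as np

t = np.array([[2]])
z = np.array([[0, 0, 0],
              [0, 1, 2],
              [0, 2, 4]])

# Addition broadcasts the 1x1 operand over the 3x3 one.
added = t + z

# Element-wise multiplication broadcasts in the same way.
elementwise = t * z

# A true matrix product of shapes (1, 1) and (3, 3) is not conformable.
try:
    np.dot(t, z)
    dot_failed = False
except ValueError:
    dot_failed = True
```

Addition and element-wise multiplication both succeed through broadcasting, while the matrix product is rejected, which is the conformability mismatch described above.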
|
|
From: Travis O. <oli...@ee...> - 2006-01-13 17:37:45
|
Sebastian Haase wrote:

>Hi,
>Following up: There was never any response to Francesc's proposal !
>I thought it sounded pretty good - as he argued: Still a good (late but
>acceptable) time to clean things up !
>(I like just the fact that it removes the "ugly" doubling of having two:
>arr.dtype and arr.dtypecode )

I think this proposal came during busy times and was not able to be looked
at seriously. Times are still busy and so it is difficult to know what to
do with it.

I think there is validity to what he is saying. The dtypedescr was only
added in December while the dtype was there in March, so the reason for it
is historical.

I would not mind changing it so that .dtype actually returned the
type-descriptor object. This would actually make things easier. It's only
historical that it's not that way.

One issue is that .dtypechar is a simple replacement for .typecode() but
.dtype.char would involve two attribute lookups which may not be a good
thing. But, this might not be a big deal because they should probably be
using .dtype anyway.

>Is this still on the table !?

I'm willing to look at it, especially since I like the concept of the
dtypedescr much better.

>>In my struggle for getting consistent behaviours with data types, I've
>>ended with a new proposal for treating them. The basic thing is that I
>>suggest to deprecate .dtype as being a first-class attribute and
>>replace it instead by the descriptor type container, which I find
>>quite more useful for end users.

I think this is true... I was just nervous to change it. But, prior to
a 1.0 release I think we still could, if we do it quickly...

>>The current .dtype type will be still
>>accessible (mainly for developers) but buried in .dtype.type.
>>
>>Briefly stated:
>>
>>current             proposed
>>=======             ========
>>.dtypedescr     --> moved into .dtype
>>.dtype          --> moved into .dtype.type
>>.dtype.dtypestr --> moved into .dtype.str
>>                    new .dtype.name

I actually like this proposal a lot as I think it gives proper place to
the data-type descriptors. I say we do it, very soon, and put out another
release quickly.

-Travis
|
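
For reference, the proposed attribute layout can be sketched against current NumPy, where `.dtype` is the descriptor object. This is a minimal illustration; the array and variable names are made up for the example.

```python
import numpy as np

a = np.arange(10, dtype=np.int32)

descr = a.dtype              # descriptor object (the old .dtypedescr)
scalar_type = a.dtype.type   # scalar type object (the old .dtype)
type_string = a.dtype.str    # byte order, kind and size, e.g. '<i4' on little-endian
type_name = a.dtype.name     # the newly proposed readable name

# The descriptor can be passed straight to a dtype= argument.
b = np.zeros(3, dtype=a.dtype)

# Checks against a type object now go through .type.
is_integer = issubclass(a.dtype.type, np.integer)
```

The two-step lookup mentioned above shows up as `a.dtype.type` and `a.dtype.str`; code that only passes the value on as `dtype=` needs no change.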
|
From: Sebastian H. <ha...@ms...> - 2006-01-13 17:19:45
|
Hi,
Following up: There was never any response to Francesc's proposal !
I thought it sounded pretty good - as he argued: Still a good (late but
acceptable) time to clean things up !
(I like just the fact that it removes the "ugly" doubling of having two:
arr.dtype and arr.dtypecode )
Is this still on the table !?
- Sebastian Haase
On Thursday 05 January 2006 13:40, Francesc Altet wrote:
> Hi,
>
> In my struggle for getting consistent behaviours with data types, I've
> ended with a new proposal for treating them. The basic thing is that I
> suggest to deprecate .dtype as being a first-class attribute and
> replace it instead by the descriptor type container, which I find
> quite more useful for end users. The current .dtype type will be still
> accessible (mainly for developers) but buried in .dtype.type.
>
> Briefly stated:
>
> current             proposed
> =======             ========
> .dtypedescr     --> moved into .dtype
> .dtype          --> moved into .dtype.type
> .dtype.dtypestr --> moved into .dtype.str
>                     new .dtype.name
>
> What is achieved with that? Well, not much, except ease of use and
> type comparison correctness. For example, with the next setup:
> >>> import numpy
> >>> a=numpy.arange(10,dtype='i')
> >>> b=numpy.arange(10,dtype='l')
>
> we have currently:
> >>> a.dtype
> <type 'int32_arrtype'>
> >>> a.dtypedescr
> dtypedescr('<i4')
> >>> a.dtypedescr.dtypestr
> '<i4'
> >>> a.dtype.__name__[:-8]
> 'int32'
> >>> a.dtype == b.dtype
> False
>
> With the new proposal, we would have:
> >>> a.dtype.type
> <type 'int32_arrtype'>
> >>> a.dtype
> dtype('<i4')
> >>> a.dtype.str
> '<i4'
> >>> a.dtype.name
> 'int32'
> >>> a.dtype == b.dtype
> True
>
> The advantages of the new proposal are:
>
> - No more .dtype and .dtypedescr lying together, just current
> .dtypedescr renamed to .dtype. I think that current .dtype does not
> provide more useful information than current .dtypedescr, and giving
> it a shorter name than .dtypedescr seems to indicate that it is more
> useful to users (and in my opinion, it isn't).
>
> - Current .dtype is still accessible, but specifying an extra name in
> the path: .dtype.type (can be changed into .dtype.type_ or
> whatever). This should be useful mainly for developers.
>
> - Added a useful dtype(descr).name so that one can quickly access
> the type name.
>
> - Comparison between data types works as it should now (without having
> to create a metaclass for PyType_Type).
>
> Drawbacks:
>
> - Backward incompatible change. However, provided the advantages are
> desirable, I think it is better changing now than later.
>
> - I don't especially like the string representation for the new .dtype
> class. For example, I'd find dtype('Int32') much better than
> dtype('<i4'). However, this would represent more changes in the
> code, but they can be made later on (much less disruptive than the
> proposed change).
>
> - Some other issues that I'm not aware of.
>
>
> I'm attaching the patch for latest SVN. Once applied (please, pay
> attention to the "XXX" signs in patched code), it passes all tests.
> However, there may remain some gotchas (especially those cases that are
> not checked in current tests). In case you are considering this change
> to check in, please, tell me and I will revise much more carefully the
> patch. If not, never mind, it has been a good learning experience
> anyway.
>
> Uh, sorry for proposing this sort of thing in the hours previous to a
> public release of numpy.
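
The comparison point in the proposal can be checked against current NumPy with a short sketch. It assumes nothing beyond the `'i'` and `'l'` type codes used in the quoted setup; whether the two descriptors compare equal depends on the platform, because C `int` and C `long` have the same size on some systems and not on others.

```python
import numpy as np

a = np.arange(10, dtype='i')   # C int
b = np.arange(10, dtype='l')   # C long, same size as int only on some platforms

# True only where int and long share a size (as on the poster's machine).
same_size = a.dtype.itemsize == b.dtype.itemsize

# Descriptors compare by equivalence of the described data, not by identity.
descriptors_equal = a.dtype == b.dtype

# The underlying scalar type objects can still be distinct classes.
types_identical = a.dtype.type is b.dtype.type

print(same_size, descriptors_equal, types_identical)
```

On a platform where both codes are 32 bits wide this should reproduce the `True` from the proposed session; elsewhere the descriptors differ simply because the data really is laid out differently.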
|
|
From: Travis O. <oli...@ee...> - 2006-01-13 17:09:11
|
Paulo J. S. Silva wrote:

>Obs: Actually I found the "problem" when implementing a QR decomposition
>based on Householder reflections and comparing it to a Matlab code.
>Numpy is only 10% slower than Matlab in this code. Man, I'll love to
>give my numerical linear algebra course this year using Python instead
>of Matlab/Octave.

I would be interested to see how a raw-array solution compares. There is
going to be some overhead from using a subclass, because of the attribute
lookups that occur on all array creations and because the subclass is
written in Python; that could be on the order of 10%. Quantifying the
subclass slow-down would be useful...

Also, which BLAS are you using?

-Travis
|
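
One rough way to quantify the subclass slow-down is sketched below. It is only an illustration: `PlainSubclass` is a hypothetical do-nothing subclass standing in for the matrix class, and the array size and repeat counts are arbitrary choices, so the ratio it prints says nothing definitive about the QR code discussed above.

```python
import timeit
import numpy as np


class PlainSubclass(np.ndarray):
    """Hypothetical ndarray subclass with a Python-level finalizer."""

    def __array_finalize__(self, obj):
        # Called on every array creation; here it deliberately does nothing.
        pass


raw = np.arange(100.0)
sub = raw.view(PlainSubclass)


def time_add(arr, repeats=5, number=5000):
    """Best-of-N time in seconds for a single arr + arr."""
    timings = timeit.repeat(lambda: arr + arr, repeat=repeats, number=number)
    return min(timings) / number


raw_time = time_add(raw)
sub_time = time_add(sub)
print(f"raw: {raw_time:.2e} s  subclass: {sub_time:.2e} s  "
      f"ratio: {sub_time / raw_time:.2f}")
```

Small arrays make the per-call overhead most visible; with large arrays the arithmetic itself dominates and the ratio should approach one.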
|
From: Travis O. <oli...@ee...> - 2006-01-13 17:06:22
|
Paulo J. S. Silva wrote:

>Ops... You are right. The example is not good. Look at this though (now I
>am actually copying my ipython session, instead of "remembering" it):
>
>--- Session copy here ---
>
>In [5]:x = matrix(arange(10.)).T
>In [6]:x = matrix(arange(3.)).T
>In [7]:A = matrix([[1.,2,3],[4,5,6],[7,8,9]])
>In [8]:b = x.T*x*A
>---------------------------------------------------------------------------
>exceptions.ValueError                     Traceback (most
>recent call last)
>
>/home/pjssilva/<console>
>
>/usr/local/lib/python2.4/site-packages/numpy/core/defmatrix.py in
>__mul__(self, other)
>    126             return N.multiply(self, other)
>    127         else:
>--> 128             return N.dot(self, other)
>    129
>    130     def __rmul__(self, other):
>
>ValueError: matrices are not aligned
>
>--- End of copy ---
>
>You see, the inner product can not be used to multiply by a matrix, which
>is very odd in linear algebra. As the matrix class is supposed to
>represent the linear algebra object I see two options:
>
>1) Change the __mul__, __rmul__, __imul__ to deal with 1x1 matrices as
>scalars.
>
>2) Change dot to convert 1x1 matrix to scalar at return.

or 3) Return scalars instead of 1x1 matrices inside of __array_finalize__
(where the magic of ensuring matrices are rank-2 arrays is actually done).
|
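
The effect of option 1 can be sketched with plain 2-D arrays. This is a hedged illustration only: `mul` is a hypothetical helper, not part of NumPy and not the matrix class's actual `__mul__`; it simply treats any 1x1 operand as a scalar before falling back to the matrix product.

```python
import numpy as np


def mul(a, b):
    """Matrix product that treats 1x1 operands as scalars (hypothetical helper)."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    if a.shape == (1, 1):
        return a[0, 0] * b      # scale b by the single entry of a
    if b.shape == (1, 1):
        return a * b[0, 0]      # scale a by the single entry of b
    return np.dot(a, b)         # ordinary conformable matrix product


x = np.arange(3.0).reshape(3, 1)            # column vector, as in the session
A = np.array([[1.0, 2, 3], [4, 5, 6], [7, 8, 9]])

inner = mul(x.T, x)     # 1x1 result holding the inner product
b = mul(inner, A)       # no alignment error: the 1x1 acts as a scalar
```

With this rule the expression from the session evaluates to the inner product times `A`; the cost is that a genuine shape mismatch involving a 1x1 operand is no longer reported as an error.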