From: Sebastian B. <seb...@gm...> - 2006-06-17 03:06:35
Thanks! Avoiding the inner loop is MUCH faster (roughly 20-300 times faster than the original). Nevertheless, I don't think I can use hypot, as it only works for two dimensions. The general problem I have is:

    A = random( [C, K] )
    B = random( [N, K] )

    C ~ 1-10
    N ~ large (thousands, millions... i.e. my dataset)
    K ~ 2-100 (the dimensions of my problem, i.e. not fixed a priori)

I adapted your proposed version to this for K dimensions:

    def d4():
        d = zeros([C, N], dtype=float)
        for i in range(C):
            xy = A[i] - B
            d[i] = sqrt( sum(xy**2, axis=1) )
        return d

Maybe there's another alternative to d4?

Thanks again,
Sebastian.

> def d_2():
>     d = zeros([4, 10000], dtype=float)
>     for i in range(4):
>         xy = A[i] - B
>         d[i] = xy[:,0]**2 + xy[:,1]**2
>     return d
>
> This is something like 250 times as fast as the naive Python solution;
> another five times faster than the fastest distance-computing version
> that I could come up with (using hypot).
>
> -tim
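A fully vectorized alternative (a sketch against current NumPy, not code from the thread) removes the remaining Python loop by broadcasting A against B; the names A, B, C, N, K follow Sebastian's setup:

    import numpy

    C, N, K = 4, 1000, 10
    A = numpy.random.random([C, K])
    B = numpy.random.random([N, K])

    def d_broadcast(A, B):
        # A viewed as (C, 1, K) minus B viewed as (1, N, K)
        # broadcasts to a (C, N, K) array of coordinate differences.
        diff = A[:, numpy.newaxis, :] - B[numpy.newaxis, :, :]
        return numpy.sqrt((diff ** 2).sum(axis=-1))

    d = d_broadcast(A, B)   # shape (C, N)

Note that this builds a (C, N, K) temporary, so for very large N the per-row loop in d4 may be the better trade of memory for speed.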
From: Andrew S. <str...@as...> - 2006-06-17 01:11:28
I noticed in your note labeled 'June 16, 2006' that you refer to the "desc" field. However, in the struct description above, there is only a field named "descr".

Also, I suggest that you update the information in the comments of the descr field of the structure description to state that inter.descr is a reference to a tuple equal to ("PyArrayInterface Version #", new_tuple_with_array_interface). What is currently there seems out of date given the new information.

Finally, in the comment section describing this field, I strongly suggest noting that this field is present *if and only if* the ARR_HAS_DESCR flag is set. It will be clearer if that's stated there rather than in the text underneath.

Is the "#" in the string meant to be replaced with "3"? If so, why not write 3?

Also, in your note, you should explain whether "dummy" (renamed from "version") should still be checked as a sanity check or whether it should now be ignored. I think we could call the field "two" and keep the sanity check for backwards compatibility. I agree it is confusing to have two different version numbers in the same struct, so I don't mind the official name of the field being something other than "version". But if we keep it as a required sanity check (in which case it probably shouldn't be named "dummy"), the whole thing will remain backwards compatible with all current code.

Anyhow, I'm very excited about this array interface, and I await the outcome of the Summer of Code project on the 'micro-array' implementation based on it!

Cheers!
Andrew

Travis Oliphant wrote:

> I just updated the array interface page to emphasize that we now have
> version 3. NumPy still supports objects that expose (the C side of)
> version 2 of the array interface, though.
>
> The new interface is basically the same except (mostly) for aesthetics.
> The differences are listed at the bottom of
>
> http://numeric.scipy.org/array_interface.html
>
> There is talk of ctypes supporting the new interface, which is a worthy
> development. Please encourage that if you can.
>
> Please voice concerns now if you have any.
>
> -Travis
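For reference, a sketch of the version-3 PyArrayInterface struct under discussion, showing the fields this thread refers to (the array interface page quoted above is the authoritative definition; the comments here paraphrase it):

    typedef struct {
        int two;              /* contains the integer 2 as a sanity check */
        int nd;               /* number of dimensions */
        char typekind;        /* kind in array-interface typestring terms */
        int itemsize;         /* size of each element in bytes */
        int flags;            /* how the data buffer should be interpreted */
        Py_intptr_t *shape;   /* length-nd array of shape information */
        Py_intptr_t *strides; /* length-nd array of stride information */
        void *data;           /* pointer to the first element of the array */
        PyObject *descr;      /* present iff the ARR_HAS_DESCR flag is set */
    } PyArrayInterface;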
From: Fernando P. <fpe...@gm...> - 2006-06-16 23:54:26
On 6/16/06, Travis Oliphant <oli...@ie...> wrote:

> There is talk of ctypes supporting the new interface, which is a worthy
> development. Please encourage that if you can.

That would certainly be excellent, especially given that ctypes is slated to be an official part of Python 2.5. I think it would greatly improve the interoperability landscape for Python if the out-of-the-box toolset had proper access to numpy arrays.

Cheers,
f
From: Travis O. <oli...@ie...> - 2006-06-16 23:46:44
I just updated the array interface page to emphasize that we now have version 3. NumPy still supports objects that expose (the C side of) version 2 of the array interface, though.

The new interface is basically the same except (mostly) for aesthetics. The differences are listed at the bottom of

http://numeric.scipy.org/array_interface.html

There is talk of ctypes supporting the new interface, which is a worthy development. Please encourage that if you can.

Please voice concerns now if you have any.

-Travis
From: Jon C. <jc...@ke...> - 2006-06-16 22:38:00
Sorry, I forgot to mention that I'm working on a Solaris system and installed it in /usr/local/gcc3xbuilt instead of /usr/local.

Thanks.

JC
From: Jon C. <jc...@ke...> - 2006-06-16 22:36:06
Hi folks!

I'd like to install numpy and remove Numeric. Are there instructions for removing Numeric-24.1?

Thanks.

JC
From: Erin S. <eri...@gm...> - 2006-06-16 22:18:25
The initial bounces actually say, and I quote:

    Technical details of temporary failure:
    TEMP_FAILURE: SMTP Error (state 8): 550-"rejected because your SMTP
    server, 66.249.92.170, is in the Spamcop RBL. 550 See
    http://www.spamcop.net/bl.shtml for more information."

On 6/16/06, Robert Kern <rob...@gm...> wrote:

> Erin Sheldon wrote:
> > Hi everyone -
> >
> > (this is my fourth try in the last 24 hours to post this.
> > Apparently, the gmail smtp server is in the blacklist!!
> > this is bad).
>
> I doubt it, since that's where my email goes through. SourceForge is
> frequently slow, so please have patience if your mail does not show up.
> I can see your 3rd try now. Possibly the others will be showing up, too.
>
> --
> Robert Kern
From: Andrew S. <str...@as...> - 2006-06-16 21:47:33
Erin Sheldon wrote:

> Anyway - recarrays have convenience attributes such that fields may be
> accessed through "." in addition to the "field()" method. These
> attributes are designed for read only; one cannot alter the data
> through them. Yet they are writeable:
>
> >>> tr = numpy.recarray(10, formats='i4,f8,f8', names='id,ra,dec')
> >>> tr.field('ra')[:] = 0.0
> >>> tr.ra
> array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
> >>> tr.ra = 3
> >>> tr.ra
> 3
> >>> tr.field('ra')
> array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
>
> I feel this should raise an exception, just as with trying to write to
> the "size" attribute. Any thoughts?

I have not used recarrays much, so take this with the appropriate measure of salt.

I'd prefer to drop the entire pseudo-attribute thing completely before it gets entrenched. (Perhaps it's too late.) I've used a similar system in PyTables, and although it is convenient in the short term and for interactive use, there are corner cases that result in long-term headaches. I think you point out one such issue for recarrays. There will be more. For example:

In [1]: import numpy
In [2]: tr = numpy.recarray(10, formats='i4,f8,f8', names='id,ra,dec')
In [3]: tr.field('ra')[:] = 0.0
In [4]: tr.ra
Out[4]: array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
In [5]: del tr.ra
---------------------------------------------------------------------------
exceptions.AttributeError    Traceback (most recent call last)
/home/astraw/<console>
AttributeError: 'recarray' object has no attribute 'ra'

The above seems completely counterintuitive -- an attribute error for something I just accessed? Yes, I know what's going on, but it certainly makes life more confusing than it needs to be, IMO.

Another issue is that it is possible to have field names that are not valid Python identifier strings.
From: Travis O. <oli...@ie...> - 2006-06-16 21:44:38
Thomas Heller wrote:

> Robert Kern wrote:
> > Francesc Altet wrote:
> > > On Friday 9 June 2006 11:54, Albert Strasheim wrote:
> > > > Just out of curiosity:
> > > >
> > > > In [1]: x = N.array([])
> > > >
> > > > In [2]: x.__array_data__
> > > > Out[2]: ('0x01C23EE0', False)
> > > >
> > > > Is there a reason why the __array_data__ tuple stores the address
> > > > as a hex string? I would guess that this representation of the
> > > > address isn't the most useful one for most applications.
> > >
> > > Good point. I hit this before and forgot to send a message about it.
> > > I agree that an integer would be better. Although, now that I think
> > > about it, I suppose the issue is the difference in representation of
> > > longs on 32-bit and 64-bit platforms, isn't it?
> >
> > Like how Win64 uses 32-bit longs and 64-bit pointers. And then there's
> > signedness. Please don't use Python ints to encode pointers. Holding
> > arbitrary pointers is the job of CObjects.
>
> (Sorry, I'm late in reading this thread. I didn't know there were so
> many numeric groups.)
>
> Python has functions to convert pointers to int/long and vice versa:
> PyInt_FromVoidPtr() and PyInt_AsVoidPtr(). ctypes uses them; ctypes also
> represents addresses as ints/longs.

The function calls are PyLong_FromVoidPtr() and PyLong_AsVoidPtr(), though, right? I'm happy representing pointers as Python integers (Python long integers on curious platforms like Win64).

-Travis
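A minimal sketch of the conversions Travis names; PyLong_FromVoidPtr() and PyLong_AsVoidPtr() are real C-API calls, but the two wrapper functions here are illustrative only:

    #include <Python.h>

    /* Expose a buffer address as a Python integer (an int or a long,
       whichever the platform's pointer size requires), instead of the
       hex-string representation complained about above. */
    static PyObject *
    address_as_pyint(void *ptr)
    {
        return PyLong_FromVoidPtr(ptr);
    }

    /* Inverse conversion; raises OverflowError if the value does not
       fit in a pointer. */
    static void *
    pyint_as_address(PyObject *obj)
    {
        return PyLong_AsVoidPtr(obj);
    }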
From: Robert K. <rob...@gm...> - 2006-06-16 21:43:12
Robert Kern wrote:

> Erin Sheldon wrote:
> > Hi everyone -
> >
> > (this is my fourth try in the last 24 hours to post this.
> > Apparently, the gmail smtp server is in the blacklist!!
> > this is bad).
>
> I doubt it, since that's where my email goes through.

And of course that's utterly bogus, since I usually use GMane. Apologies. However, *this* is a real email to numpy-discussion.

--
Robert Kern
From: Robert K. <rob...@gm...> - 2006-06-16 21:33:55
Erin Sheldon wrote:

> Hi everyone -
>
> (this is my fourth try in the last 24 hours to post this.
> Apparently, the gmail smtp server is in the blacklist!!
> this is bad).

I doubt it, since that's where my email goes through. SourceForge is frequently slow, so please have patience if your mail does not show up. I can see your 3rd try now. Possibly the others will be showing up, too.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
From: Erin S. <eri...@gm...> - 2006-06-16 21:25:20
Hi everyone -

(this is my third try in the last 24 hours to post this. For some reason it hasn't been making it through.)

Recarrays have convenience attributes such that fields may be accessed through "." in addition to the "field()" method. These attributes are designed for read only; one cannot alter the data through them. Yet they are writeable:

>>> tr = numpy.recarray(10, formats='i4,f8,f8', names='id,ra,dec')
>>> tr.field('ra')[:] = 0.0
>>> tr.ra
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
>>> tr.ra = 3
>>> tr.ra
3
>>> tr.field('ra')
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

I feel this should raise an exception, just as with trying to write to the "size" attribute. Any thoughts?

Erin
From: Erin S. <esh...@ki...> - 2006-06-16 21:12:37
Hi everyone -

(this is my fourth try in the last 24 hours to post this. Apparently, the gmail smtp server is in the blacklist!! this is bad).

Anyway - recarrays have convenience attributes such that fields may be accessed through "." in addition to the "field()" method. These attributes are designed for read only; one cannot alter the data through them. Yet they are writeable:

>>> tr = numpy.recarray(10, formats='i4,f8,f8', names='id,ra,dec')
>>> tr.field('ra')[:] = 0.0
>>> tr.ra
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
>>> tr.ra = 3
>>> tr.ra
3
>>> tr.field('ra')
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

I feel this should raise an exception, just as with trying to write to the "size" attribute. Any thoughts?

Erin
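A sketch of one way to get the exception Erin asks for -- a subclass that refuses to let attribute assignment silently shadow a field (StrictRecarray is illustrative, not an actual numpy class):

    import numpy

    class StrictRecarray(numpy.recarray):
        """Make field pseudo-attributes read-only, as proposed above."""
        def __setattr__(self, name, value):
            if self.dtype.names is not None and name in self.dtype.names:
                # refuse to create an instance attribute hiding the field
                raise AttributeError(
                    "field %r is read-only via attribute access; "
                    "use .field(%r)[:] = ... to modify its data"
                    % (name, name))
            numpy.recarray.__setattr__(self, name, value)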
From: Thomas H. <th...@py...> - 2006-06-16 19:50:39
Travis Oliphant wrote:

> Thanks for the continuing discussion on the array interface.
>
> I'm thinking about this right now, because I just spent several hours
> trying to figure out if it is possible to add additional
> "object-behavior" pointers to a type by creating a metatype that
> sub-types the Python PyType_Type (this is the object that has all the
> function pointers to implement mapping behavior, buffer behavior,
> etc.). I found some emails from 2002 where Guido indicates that it is
> not possible to sub-type the PyType_Type object and add new function
> pointers at the end without major rewriting of Python.

Yes, but I remember an email from Christian Tismer saying that it *is* possible, although I've never tried it.

What I do in ctypes is to replace the type object's (the subclass of PyType_Type) dictionary with a subclass of PyDict_Type (in ctypes it is named StgDictObject -- "storage dict object", a very poor name, I know) that has additional structure fields describing the C data type it represents.

Thomas
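A sketch of the pattern Thomas describes -- hanging extra C-level metadata off a type through a dict subclass instead of extending PyType_Type; the field names below are assumptions for illustration, not ctypes' actual StgDictObject layout:

    #include <Python.h>

    /* A dict subclass installed as a type object's tp_dict.  Because it
       is still a real dict, Python-level attribute lookup keeps working,
       while C code can downcast to reach the extra fields. */
    typedef struct {
        PyDictObject dict;   /* base: behaves as an ordinary dict */
        Py_ssize_t size;     /* size of the described C data type */
        Py_ssize_t align;    /* its alignment requirement */
        /* ... further layout and conversion info ... */
    } StorageDictObject;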
From: Francesc A. <fa...@ca...> - 2006-06-16 19:36:24
On Friday 16 June 2006 21:25, Thomas Heller wrote:

> Robert Kern wrote:
> > Like how Win64 uses 32-bit longs and 64-bit pointers. And then there's
> > signedness. Please don't use Python ints to encode pointers. Holding
> > arbitrary pointers is the job of CObjects.
>
> (Sorry, I'm late in reading this thread. I didn't know there were so
> many numeric groups.)
>
> Python has functions to convert pointers to int/long and vice versa:
> PyInt_FromVoidPtr() and PyInt_AsVoidPtr(). ctypes uses them; ctypes also
> represents addresses as ints/longs.

Very interesting. So, may I suggest using this capability to represent addresses? I think this would simplify things (especially, it would avoid the ascii/pointer conversions, which are ugly to my mind).

Cheers,

--
Francesc Altet    http://www.carabos.com/
Cárabos Coop. V.  Enjoy Data
From: Thomas H. <th...@py...> - 2006-06-16 19:27:20
Robert Kern wrote:

> Francesc Altet wrote:
> > On Friday 9 June 2006 11:54, Albert Strasheim wrote:
> > > Just out of curiosity:
> > >
> > > In [1]: x = N.array([])
> > >
> > > In [2]: x.__array_data__
> > > Out[2]: ('0x01C23EE0', False)
> > >
> > > Is there a reason why the __array_data__ tuple stores the address as
> > > a hex string? I would guess that this representation of the address
> > > isn't the most useful one for most applications.
> >
> > Good point. I hit this before and forgot to send a message about it.
> > I agree that an integer would be better. Although, now that I think
> > about it, I suppose the issue is the difference in representation of
> > longs on 32-bit and 64-bit platforms, isn't it?
>
> Like how Win64 uses 32-bit longs and 64-bit pointers. And then there's
> signedness. Please don't use Python ints to encode pointers. Holding
> arbitrary pointers is the job of CObjects.

(Sorry, I'm late in reading this thread. I didn't know there were so many numeric groups.)

Python has functions to convert pointers to int/long and vice versa: PyInt_FromVoidPtr() and PyInt_AsVoidPtr(). ctypes uses them; ctypes also represents addresses as ints/longs.

Thomas
From: Albert S. <fu...@gm...> - 2006-06-16 18:04:44
Hey Glen,

http://www.scipy.org/Cookbook/C_Extensions covers most of the boilerplate you need to get started with extension modules.

Regards,
Albert

> -----Original Message-----
> From: num...@li... On Behalf Of Glen W. Mabey
> Sent: 16 June 2006 18:24
> To: num...@li...
> Subject: [Numpy-discussion] Segfault with simplest operation on
> extension module using numpy
>
> Hello,
>
> I am writing a python extension module to create an interface to some C
> code, and am using a numpy array as the object type for transferring
> data back and forth.
From: Matthieu P. <pe...@sh...> - 2006-06-16 18:01:53
Hi,

I need to handle strings shaped by a numpy array whose data belong to a C structure. There are several possible answers to this problem:

1) Use a numpy array of strings (PyArray_STRING), and so a (char *) object in C. It works as is, but you need to define a maximum size for your strings, because your set of strings is contiguous in memory.

2) Use a numpy array of objects (PyArray_OBJECT), and wrap each "C string" with a Python object, using PyStringObject for example. Then our problem is that there are as many wrappers as data elements, and I believe data can't be shared when you create PyStringObjects from (char *) -- via PyString_AsStringAndSize, for example.

Now, I will expose a third way, which allows you to use non-size-limited strings (as in solution 1) and not create wrappers before you really need them (on demand/access).

First, for convenience, we will use the C type (char **) to build an array of string pointers (as was suggested in solution 2). Now, the game is to make it work with the numpy API, and use it in Python through a Python array. Basically, I want a very similar behaviour to arrays of PyObject, where data are not contiguous, only their addresses are. So, the idea is to create a new array descr based on PyArray_OBJECT and change its getitem/setitem functions to deal with my own data.

I expected numpy to work with this convenient array descr, but it fails because PyArray_Scalar (arrayobject.c) doesn't call the descriptor's getitem function (in the PyArray_OBJECT case) but instead calls two lines which have been copied/pasted from the OBJECT_getitem function.

Here my small patch is: replace (arrayobject.c:983-984):

    Py_INCREF(*((PyObject **)data));
    return *((PyObject **)data);

by:

    return descr->f->getitem(data, base);

I played a lot with my new numpy array after this change and noticed that a lot of uses work:

>>> a = myArray()
array([["plop", "blups"]], dtype=object)
>>> print a
[["plop", "blups"]]
>>> a[0, 0] = "youpiiii"
>>> print a
[["youpiiii", "blups"]]
>>> s = a[0, 0]
>>> print s
"youpiiii"
>>> b = a[:]   # data is shared with 'a' (similar behaviour to arrays of objects)
>>>
>>> numpy.zeros(1, dtype=a.dtype)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: fields with object members not yet supported.
>>> numpy.array(a)
segmentation fault

Finally, I found a forgotten check in multiarraymodule.c (the _array_fromobject function); after the "finish" label (line 4661), add:

    if (!ret) {
        Py_INCREF(Py_None);
        return Py_None;
    }

After this change, I obtained (when I was not in interactive mode):

# numpy.array(a)
Exception exceptions.TypeError: 'fields with object members not yet supported.' in 'garbage collection' ignored
Fatal Python error: unexpected exception during garbage collection
Abandon

But strangely, when I was in interactive mode, one time it fails and raises an exception (good behaviour), and the next time it only returns None:

>>> numpy.array(myArray())
TypeError: fields with object members not yet supported.
>>> a = numpy.array(myArray()); print a
None

A bug remains (I will explore it later), but it is better than before.

This mail shows how to map (char **) onto a numpy array, but it's easy to use the same idea to handle any type (your_object **). I'll be pleased to discuss any comments on the proposed solution, or any others you can find.

--
Matthieu Perrot                    Tel: +33 1 69 86 78 21
CEA - SHFJ                         Fax: +33 1 69 86 77 86
4, place du General Leclerc
91401 Orsay Cedex France
From: Tim H. <tim...@co...> - 2006-06-16 17:49:07
Christopher Barker wrote:

> Bruce Southey wrote:
>
> > Please run the exact same code in Matlab that you are running in
> > NumPy. Many of Matlab's functions are very highly optimized, since
> > they are provided as binary functions. I think that you are running
> > into this, so you are not doing the correct comparison.
>
> He is doing the correct comparison: if Matlab has some built-in
> compiled utility functions that numpy doesn't -- it really is faster!
>
> It looks like others' suggestions show that well-written numpy code is
> plenty fast, however.
>
> One more suggestion I don't think I've seen: numpy provides a built-in
> compiled utility function, hypot():
>
> >>> x = N.arange(5)
> >>> y = N.arange(5)
> >>> N.hypot(x,y)
> array([ 0.        ,  1.41421356,  2.82842712,  4.24264069,  5.65685425])
> >>> N.sqrt(x**2 + y**2)
> array([ 0.        ,  1.41421356,  2.82842712,  4.24264069,  5.65685425])
>
> Timings:
>
> >>> timeit.Timer('N.sqrt(x**2 + y**2)', 'import numpy as N; x = N.arange(5000); y = N.arange(5000)').timeit(100)
> 0.49785208702087402
> >>> timeit.Timer('N.hypot(x,y)', 'import numpy as N; x = N.arange(5000); y = N.arange(5000)').timeit(100)
> 0.081479072570800781
>
> A factor of 6 improvement.

Here's another thing to note: much of the time distance**2 works as well as distance (for instance, if you are looking for the nearest point). If you're in that situation, computing the square of the distance is much cheaper:

def d_2():
    d = zeros([4, 10000], dtype=float)
    for i in range(4):
        xy = A[i] - B
        d[i] = xy[:,0]**2 + xy[:,1]**2
    return d

This is something like 250 times as fast as the naive Python solution; another five times faster than the fastest distance-computing version that I could come up with (using hypot).

-tim
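A sketch of the nearest-point case Tim mentions, where sqrt can be skipped entirely because the minimum of the squared distances picks the same point (A and B as in the thread; the function name is illustrative):

    import numpy

    A = numpy.random.random([4, 2])      # query points
    B = numpy.random.random([10000, 2])  # dataset

    def nearest_indices(A, B):
        # For each row of A, the index into B of the closest point.
        # sqrt is monotonic, so comparing squared distances suffices.
        nearest = numpy.empty(len(A), dtype=int)
        for i in range(len(A)):
            xy = A[i] - B
            nearest[i] = ((xy ** 2).sum(axis=1)).argmin()
        return nearest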
From: Christopher B. <Chr...@no...> - 2006-06-16 17:05:42
Bruce Southey wrote:

> Please run the exact same code in Matlab that you are running in NumPy.
> Many of Matlab's functions are very highly optimized, since they are
> provided as binary functions. I think that you are running into this,
> so you are not doing the correct comparison.

He is doing the correct comparison: if Matlab has some built-in compiled utility functions that numpy doesn't -- it really is faster!

It looks like others' suggestions show that well-written numpy code is plenty fast, however.

One more suggestion I don't think I've seen: numpy provides a built-in compiled utility function, hypot():

>>> x = N.arange(5)
>>> y = N.arange(5)
>>> N.hypot(x,y)
array([ 0.        ,  1.41421356,  2.82842712,  4.24264069,  5.65685425])
>>> N.sqrt(x**2 + y**2)
array([ 0.        ,  1.41421356,  2.82842712,  4.24264069,  5.65685425])

Timings:

>>> timeit.Timer('N.sqrt(x**2 + y**2)', 'import numpy as N; x = N.arange(5000); y = N.arange(5000)').timeit(100)
0.49785208702087402
>>> timeit.Timer('N.hypot(x,y)', 'import numpy as N; x = N.arange(5000); y = N.arange(5000)').timeit(100)
0.081479072570800781

A factor of 6 improvement.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT           (206) 526-6959 voice
7600 Sand Point Way NE     (206) 526-6329 fax
Seattle, WA 98115          (206) 526-6317 main reception
Chr...@no...
From: Robert K. <rob...@gm...> - 2006-06-16 16:45:36
Glen W. Mabey wrote:

> That is, when I run:
>
>     import DFALG
>     DFALG.bsvmdf( 3 )
>
> after compiling the below code, it always segfaults, regardless of the
> type of the argument given. Just as a sanity check (it's been a little
> while since I have written an extension module for Python), I changed
> the line containing PyArray_Check() to one that calls PyInt_Check(),
> which does perform exactly how I would expect it to.
>
> Is there something I'm missing?

Yes!

> #include <Python.h>
> #include <arrayobject.h>

This should be "numpy/arrayobject.h" for consistency with every other numpy-using extension.

> PyMODINIT_FUNC
> initDFALG(void)
> {
>     (void) Py_InitModule("DFALG", DFALGMethods);
> }

You need to call import_array() in this function.

--
Robert Kern
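Putting Robert's two fixes together, the initializer would look something like this (a sketch; import_array() populates numpy's C-API function table, and omitting it is a classic cause of segfaults on the first PyArray_* call):

    #include <Python.h>
    #include <numpy/arrayobject.h>

    PyMODINIT_FUNC
    initDFALG(void)
    {
        (void) Py_InitModule("DFALG", DFALGMethods);
        /* Required before any PyArray_* call: without this, the
           C-API table is NULL and PyArray_Check() dereferences it. */
        import_array();
    }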
From: Glen W. M. <Gle...@sw...> - 2006-06-16 16:24:00
Hello,

I am writing a python extension module to create an interface to some C code, and am using a numpy array as the object type for transferring data back and forth.

Using either the numpy svn from yesterday, or 0.9.6 or 0.9.8, with or without an optimized ATLAS installation, I get a segfault at what should be the most straightforward of all operations: PyArray_Check() on the input argument. That is, when I run:

    import DFALG
    DFALG.bsvmdf( 3 )

after compiling the below code, it always segfaults, regardless of the type of the argument given. Just as a sanity check (it's been a little while since I have written an extension module for Python), I changed the line containing PyArray_Check() to one that calls PyInt_Check(), which does perform exactly how I would expect it to.

Is there something I'm missing?

Thank you!
Glen Mabey

#include <Python.h>
#include <arrayobject.h>

static PyObject *DFALG_bsvmdf(PyObject *self, PyObject *args);

static PyMethodDef DFALGMethods[] = {
    {"bsvmdf", DFALG_bsvmdf, METH_VARARGS,
     "This should be a docstring, really."},
    {NULL, NULL, 0, NULL}   /* Sentinel */
};

PyMODINIT_FUNC
initDFALG(void)
{
    (void) Py_InitModule("DFALG", DFALGMethods);
}

static PyObject *
DFALG_bsvmdf(PyObject *self, PyObject *args)
{
    PyObject *inputarray;

    if ( !PyArg_ParseTuple( args, "O", &inputarray ) )
        return NULL;

    if ( PyArray_Check( inputarray ) ) {
    /* if ( PyInt_Check( inputarray ) ) { */
        printf( "DFALG_bsvmdf() was passed a PyArray.\n" );
    } else {
        printf( "DFALG_bsvmdf() was NOT passed a PyArray.\n" );
    }

    return Py_BuildValue( "ss", "Thing 1", "Thing 2" );
}
From: Tim H. <tim...@co...> - 2006-06-16 16:23:28
Sasha wrote:

> On 6/16/06, Sven Schreiber <sve...@gm...> wrote:
>
> > ...
> > Abbreviations will emerge anyway; the question is merely: will numpy
> > provide/recommend them (in addition to having long names, maybe), or
> > will it have to be done by somebody else, possibly resulting in many
> > different sets of abbreviations for the same purpose.
>
> This is a valid point. In my experience, ad hoc abbreviations are more
> popular among scientists who are not used to writing large programs.
> They use numpy either interactively or write short throw-away scripts
> that are rarely reused. Programmers who write reusable code almost
> universally hate ad hoc abbreviations. (There are exceptions:
> <http://www.kuro5hin.org/story/2002/8/30/175531/763>.)
>
> If numpy is going to compete with MATLAB, we should not ignore the
> non-programmer user base. I like the idea of providing recommended
> abbreviations. There is a precedent for doing that: GNU command-line
> utilities provide long/short alternatives for most options. Long
> options are recommended for use in scripts, while short ones are
> indispensable at the command line.

Unless the abbreviations are obvious, adding a second set of names will make it more difficult to read others' code. In particular, it will make it harder to answer questions on the newsgroup, particularly since I suspect that most of the more experienced users will end up using long names while the new users coming from MATLAB or whatever will use the shorter names.

> I would like to suggest the following guidelines:
>
> 1. Numpy should never invent abbreviations, but may reuse
> abbreviations used in the art.

Let me add a couple of cents here. There are widespread terms of the art, and there are terms of art that are specific to a certain field. At the top level, I would like to see only widespread terms of the art. Thus, 'cos', 'sin', 'exp', etc. are perfectly fine. However, something like 'dft' is not so good. Perversely, I consider 'fft' a widespread term of the art, but the more general 'dft' is somehow not. These narrower terms would be perfectly fine if segregated into appropriate packages. For example, I would consider it more sensible to have the current package 'dft' renamed to 'fourier' and the routine 'fft' renamed to 'dft' (since that's what it is). As another example, linear_algebra.svd is perfectly clear, but numpy.svd would be opaque.

> 2. When alternative names are made available, there should be one
> simple rule for reducing the long name to the short one. For example,
> use of acronyms may provide one such rule:
> singular_value_decomposition -> svd.

Svd is already a term of the art, I believe, so linalg.svd seems like a fine name for singular_value_decomposition.

> Unfortunately, that would mean linear_least_squares -> lls, not ols,
> and conflict with rule #1 (rename lstsq -> ordinary_least_squares?).

Before you consider this, I suggest that you google 'linear algebra lls' and 'linear algebra ols'. The results may surprise you... While you're at it, google 'linear algebra svd'.

> The second guideline may be hard to follow, but it is very important.
> Without a rule like this, there will be confusion about whether
> linear_least_squares and lstsq are the same or just "similar".

Can I just reiterate a hearty "blech!" for having two sets of names. The horizontal-space argument is mostly bogus, in my opinion -- functions that tend to be used in complicated expressions already have short, widely used abbreviations that we can steal. The typing argument is also mostly bogus: a decent editor will do tab completion (I use a pretty much no-frills editor, SciTE, and even it does tab completion), and there's IPython if you want tab completion in interactive mode.

-tim
From: Alan G I. <ai...@am...> - 2006-06-16 15:29:47
On Fri, 16 Jun 2006, Sven Schreiber apparently wrote:

> Abbreviations will emerge anyway; the question is merely: will numpy
> provide/recommend them (in addition to having long names, maybe), or
> will it have to be done by somebody else, possibly resulting in many
> different sets of abbreviations for the same purpose.

Agreed.

Cheers,
Alan Isaac
From: Bruce S. <bso...@gm...> - 2006-06-16 14:20:45
Hi,

Please run the exact same code in Matlab that you are running in NumPy. Many of Matlab's functions are very highly optimized, since they are provided as binary functions. I think that you are running into this, so you are not doing the correct comparison.

So the ways around it are to write an extension in C or Fortran, use Psyco etc. if possible, and vectorize your algorithm to remove the loops (especially the inner one).

Bruce

On 6/14/06, Sebastian Beca <seb...@gm...> wrote:

> Hi,
> I'm working with NumPy/SciPy on some algorithms and I've run into some
> important speed differences with respect to Matlab 7. I've narrowed the
> main speed problem down to the operation of finding the euclidean
> distance between two matrices that share one dimension rank (dist in
> Matlab):
>
> Python:
>
>     def dtest():
>         A = random( [4,2] )
>         B = random( [1000,2] )
>
>         d = zeros([4, 1000], dtype='f')
>         for i in range(4):
>             for j in range(1000):
>                 d[i, j] = sqrt( sum( (A[i] - B[j])**2 ) )
>         return d
>
> Matlab:
>
>     A = rand( [4,2] )
>     B = rand( [1000,2] )
>     d = dist(A, B')
>
> Running both of these 100 times, I've found the Python version to run
> between 10-20 times slower. My question is whether there is a faster
> way to do this. Perhaps I'm not using the correct functions/structures?
> Or is this as good as it gets?
>
> Thanks in advance,
>
> Sebastian Beca
> Department of Computer Science Engineering
> University of Chile
>
> PS: I'm using NumPy 0.9.8, SciPy 0.4.8. I also understand I have ATLAS,
> BLAS and LAPACK all installed, but I haven't confirmed that.