numpy-discussion Mailing List for Numerical Python (Page 282)

A package for scientific computing with Python

Brought to you by: charris208, jarrodmillman, kern, rgommers, teoliphant

numpy-discussion — Discussion list for all users of Numerical Python


Re: [Numpy-discussion] [Fwd: compression in storage of Numeric/numarray objects]

From: <sk...@po...> - 2005-09-11 15:25:27

    Joost> is it possible to use compression while storing
    Joost> numarray/Numeric objects?

Try the gzip or bz2 modules.  Both have file-like objects that transparently
(de)compress data as it is read or written.

    Joost> Ideally the compression also works with memmory mapped arrays.

Dunno, but probably not.  You'll have to experiment.

Skip
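As a rough illustration of the file-like interface described above, here is a minimal sketch in present-day Python 3. It uses the standard-library `array` module as a stand-in for a Numeric/numarray array (an assumption, made so the example runs without either package), and the file names are made up.

```python
import bz2
import gzip
import os
import tempfile
from array import array

# Stand-in for a numeric array: 10000 C doubles (80000 bytes uncompressed).
values = array("d", (i * 0.5 for i in range(10000)))

tmpdir = tempfile.mkdtemp()
sizes = {}
for name, opener in (("gz", gzip.open), ("bz2", bz2.open)):
    path = os.path.join(tmpdir, "data." + name)  # hypothetical file name
    # Writing: the file object compresses transparently.
    with opener(path, "wb") as fh:
        fh.write(values.tobytes())
    sizes[name] = os.path.getsize(path)
    # Reading: the file object decompresses transparently.
    with opener(path, "rb") as fh:
        restored = array("d")
        restored.frombytes(fh.read())
    assert restored == values

print(sizes)  # compressed sizes in bytes, per format
```

Both modules expose the same `open()`-style interface, so switching the compression format is a one-line change.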

Re: [Numpy-discussion] [Fwd: compression in storage of Numeric/numarray objects]

From: Warren F. <fo...@sl...> - 2005-09-09 20:55:36

You may be able to avoid the tostring() overhead by using tofile():

s.tofile(gzip.open('compressed.dat', 'wb'))

You are probably SOL on the mmapping, though.

w

On Fri, 9 Sep 2005, Joost van Evert wrote:

> On Fri, 2005-09-09 at 15:06 -0500, John Hunter wrote:
> > >>>>> "Joost" == Joost van Evert <ph...@gm...> writes:
> >
> >     Joost> is it possible to use compression while storing
> >     Joost> numarray/Numeric objects?
> >
> >
> > Sure
> >
> >     In [35]: s = rand(10000)
> >
> >     In [36]: file('uncompressed.dat', 'wb').write(s.tostring())
> >
> >     In [37]: ls -l uncompressed.dat
> >     -rw-r--r--  1 jdhunter jdhunter 80000 2005-09-09 15:04 uncompressed.dat
> >
> >     In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())
> >
> >     In [39]: ls -l compressed.dat
> >     -rw-r--r--  1 jdhunter jdhunter 41393 2005-09-09 15:04 compressed.dat
> >
> Thanks, this helps me, but I think not enough, because the arrays I work
> on are sometimes >1Gb(Correlation matrices). The tostring method would
> explode the size, and result in a lot of swapping. Ideally the
> compression also works with memmory mapped arrays.
>
> Greets,
>
> Joost
>
>
>
> -------------------------------------------------------
> SF.Net email is Sponsored by the Better Software Conference & EXPO
> September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
> Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
> Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
> _______________________________________________
> Numpy-discussion mailing list
> Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>

Re: [Numpy-discussion] [Fwd: compression in storage of Numeric/numarray objects]

From: Perry G. <pe...@st...> - 2005-09-09 20:52:13

On Sep 9, 2005, at 4:41 PM, Joost van Evert wrote:

> On Fri, 2005-09-09 at 15:06 -0500, John Hunter wrote:
>>>>>>> "Joost" == Joost van Evert <ph...@gm...> writes:
>>
>>     Joost> is it possible to use compression while storing
>>     Joost> numarray/Numeric objects?
>>
>>
>> Sure
>>
>>     In [35]: s = rand(10000)
>>
>>     In [36]: file('uncompressed.dat', 'wb').write(s.tostring())
>>
>>     In [37]: ls -l uncompressed.dat
>>     -rw-r--r--  1 jdhunter jdhunter 80000 2005-09-09 15:04 uncompressed.dat
>>
>>     In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())
>>
>>     In [39]: ls -l compressed.dat
>>     -rw-r--r--  1 jdhunter jdhunter 41393 2005-09-09 15:04 compressed.dat
>>
> Thanks, this helps me, but I think not enough, because the arrays I work
> on are sometimes >1Gb(Correlation matrices). The tostring method would
> explode the size, and result in a lot of swapping. Ideally the
> compression also works with memmory mapped arrays.
>
Well, it seems to me that you are asking for quite a lot if you expect 
it to work with memory-mapped arrays that are compressed (I'm assuming 
you mean that individual values are decompressed on the fly as they are 
needed). This is something that we gave some thought to a few years 
ago, but it seemed that supporting such capabilities was far too 
complicated, at least for now. Besides, some operations are bound to 
blow up (e.g., take on a compressed array).

But I'm still not sure what you are trying to do and what you would 
like to see happen underneath. An example would do a lot to explain 
what your needs are.

Thanks, Perry Greenfield

Re: [Numpy-discussion] [Fwd: compression in storage of Numeric/numarray objects]

From: Joost v. E. <ph...@gm...> - 2005-09-09 20:28:58

On Fri, 2005-09-09 at 15:06 -0500, John Hunter wrote:
> >>>>> "Joost" == Joost van Evert <ph...@gm...> writes:
> 
>     Joost> is it possible to use compression while storing
>     Joost> numarray/Numeric objects?
> 
> 
> Sure
> 
>     In [35]: s = rand(10000)
> 
>     In [36]: file('uncompressed.dat', 'wb').write(s.tostring())
> 
>     In [37]: ls -l uncompressed.dat
>     -rw-r--r--  1 jdhunter jdhunter 80000 2005-09-09 15:04 uncompressed.dat
> 
>     In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())
> 
>     In [39]: ls -l compressed.dat
>     -rw-r--r--  1 jdhunter jdhunter 41393 2005-09-09 15:04 compressed.dat
> 
Thanks, this helps me, but I think not enough, because the arrays I work
on are sometimes >1 GB (correlation matrices). The tostring method would
explode the size and result in a lot of swapping. Ideally the
compression would also work with memory-mapped arrays.

Greets,

Joost
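The extra copy made by tostring() can be avoided by streaming the array's buffer to the compressor in chunks. The sketch below is written in present-day Python 3 and uses the standard-library `array` module in place of a Numeric/numarray array (an assumption); the helper names `write_compressed` and `read_compressed` are hypothetical. It only addresses the temporary full-size string, not memory mapping of compressed data.

```python
import gzip
import os
import tempfile
from array import array

def write_compressed(arr, path, chunk_bytes=1 << 20):
    """Stream arr's buffer into a gzip file, chunk_bytes bytes at a time."""
    view = memoryview(arr).cast("B")  # byte view of the buffer, no copy
    with gzip.open(path, "wb") as fh:
        for start in range(0, len(view), chunk_bytes):
            fh.write(view[start:start + chunk_bytes])

def read_compressed(path, typecode, chunk_bytes=1 << 20):
    """Read the file back chunk by chunk into a new array."""
    out = array(typecode)
    leftover = b""
    with gzip.open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_bytes)
            if not chunk:
                break
            data = leftover + chunk
            # Only whole items can be appended; carry the remainder over.
            usable = len(data) - len(data) % out.itemsize
            out.frombytes(data[:usable])
            leftover = data[usable:]
    return out

path = os.path.join(tempfile.mkdtemp(), "matrix.dat.gz")  # hypothetical name
matrix = array("d", (float(i % 100) for i in range(200000)))
write_compressed(matrix, path)
assert read_compressed(path, "d") == matrix
```

Peak extra memory on the write side is then roughly one chunk rather than a second copy of the whole array.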

Re: [Numpy-discussion] [Fwd: compression in storage of Numeric/numarray objects]

From: John H. <jdh...@ac...> - 2005-09-09 20:07:48

>>>>> "Joost" == Joost van Evert <ph...@gm...> writes:

    Joost> is it possible to use compression while storing
    Joost> numarray/Numeric objects?


Sure

    In [35]: s = rand(10000)

    In [36]: file('uncompressed.dat', 'wb').write(s.tostring())

    In [37]: ls -l uncompressed.dat
    -rw-r--r--  1 jdhunter jdhunter 80000 2005-09-09 15:04 uncompressed.dat

    In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())

    In [39]: ls -l compressed.dat
    -rw-r--r--  1 jdhunter jdhunter 41393 2005-09-09 15:04 compressed.dat

The compression ratio for more regular data will be better.

JDH
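The point about regular data can be checked with a small experiment. This is a sketch in present-day Python 3, assuming the standard-library `array` module as a stand-in for a Numeric array and `zlib` (the same deflate algorithm gzip uses) as the compressor; the exact ratios will differ from the session above.

```python
import random
import zlib
from array import array

random.seed(0)  # fixed seed so the run is repeatable
n = 10000

noisy = array("d", (random.random() for _ in range(n)))   # like rand(10000)
regular = array("d", (float(i % 10) for i in range(n)))   # highly repetitive

def ratio(arr):
    """Compressed size divided by raw size (smaller is better)."""
    raw = arr.tobytes()
    return len(zlib.compress(raw)) / len(raw)

print("noisy   %.3f" % ratio(noisy))
print("regular %.3f" % ratio(regular))
```

Random mantissa bits carry little redundancy for the compressor to exploit, whereas a repeating pattern collapses to a small fraction of its raw size.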

[Numpy-discussion] [Fwd: compression in storage of Numeric/numarray objects]

From: Joost v. E. <ph...@gm...> - 2005-09-09 20:01:22

Re: [Numpy-discussion] Linux to Windows porting question

From: Daniel S. <she...@un...> - 2005-09-07 20:40:25

The question was answered yesterday, and that was the
answer. Thanks.

On Wed, 07 Sep 2005 15:09:04 +1200
  Greg Ewing <gre...@ca...> wrote:
> Daniel Sheltraw wrote:
> 
>>     blk = fromstring(f_fid.read(BLOCK_LEN), num_type).byteswapped().astype(Float32).tostring()
>> 
>> The error I get is:
>> 
>>     ValueError: string size must be a multiple of element size
> 
> Did you open the file in binary mode?
> 
> -- 
> Greg Ewing, Computer Science Dept, +--------------------------------------+
> University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
> Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
> gre...@ca...	   +--------------------------------------+

Re: [Numpy-discussion] Linux to Windows porting question

From: Greg E. <gre...@ca...> - 2005-09-07 03:09:13

Daniel Sheltraw wrote:

>     blk = fromstring(f_fid.read(BLOCK_LEN), num_type).byteswapped().astype(Float32).tostring()
> 
> The error I get is:
> 
>     ValueError: string size must be a multiple of element size

Did you open the file in binary mode?

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
gre...@ca...	   +--------------------------------------+

Re: [Numpy-discussion] Linux to Windows porting question

From: Robert K. <rk...@uc...> - 2005-09-06 18:49:17

Daniel Sheltraw wrote:
> Hello NumPy Listees
> 
> I am trying to port some code to Windows that works fine under Linux.
> The offending line is:
> 
>     blk = fromstring(f_fid.read(BLOCK_LEN), num_type).byteswapped().astype(Float32).tostring()
> 
> The error I get is:
> 
>     ValueError: string size must be a multiple of element size
> 
> Does anyone have an idea where the problem might be? BLOCK_LEN is
> specified in bytes and num_type is Int32.

Is f_fid opened in binary mode?

  f_fid = open(filename, 'rb')

It should be.

-- 
Robert Kern
rk...@uc...

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter
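Why binary mode matters here: on Windows, a file opened in text mode has each "\r\n" byte pair translated to "\n" on reading (and, in the Python 2 of this era, a Ctrl-Z byte is treated as end of file), so a read can return a byte count that is no longer a multiple of the 4-byte element size. The sketch below simulates that translation in present-day Python 3 with the standard-library `array` module standing in for Numeric's fromstring (an assumption); it does not reproduce the original file.

```python
import struct
from array import array

# Two 32-bit little-endian integers; the first happens to contain the
# byte pair 0x0D 0x0A ("\r\n").
raw = struct.pack("<2i", 0x0A0D, 1)

# What a Windows text-mode read would hand back (simulated): "\r\n" -> "\n".
mangled = raw.replace(b"\r\n", b"\n")
print(len(raw), len(mangled))   # 8 7

ints = array("i")               # assumes a 4-byte C int (common platforms)
ints.frombytes(raw)             # fine: 8 bytes is a multiple of 4

error_message = None
try:
    array("i").frombytes(mangled)   # 7 bytes is not a multiple of 4
except ValueError as exc:
    error_message = str(exc)
print("ValueError:", error_message)
```

Opening the file with 'rb' disables the translation, so the bytes arrive exactly as they are stored on disk.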

[Numpy-discussion] Linux to Windows porting question

From: Daniel S. <she...@un...> - 2005-09-06 18:44:30

Hello NumPy Listees

I am trying to port some code to Windows that works fine under Linux.
The offending line is:

     blk = fromstring(f_fid.read(BLOCK_LEN), num_type).byteswapped().astype(Float32).tostring()

The error I get is:

     ValueError: string size must be a multiple of element size

Does anyone have an idea where the problem might be? BLOCK_LEN is
specified in bytes and num_type is Int32.

Thanks,
Daniel

[Numpy-discussion] Re: [SciPy-user] Re: [SciPy-dev] scipy core (Numeric3) win32 binaries to play with

From: Xavier G. <gn...@ob...> - 2005-09-05 09:01:21

Alan G Isaac wrote:

>On Fri, 02 Sep 2005, Travis Oliphant apparently wrote:
>
>>http://numeric.scipy.org/files/scipy_core-0.4.0.win32-py2.4.exe
>
>So far so good.
>
>Thanks!
>Alan Isaac
>
>_______________________________________________
>SciPy-user mailing list
>Sci...@sc...
>http://www.scipy.net/mailman/listinfo/scipy-user

Hi,

That's great news! :)
Where are the sources corresponding to this Windows release (I would 
like to test them under Linux asap)?
Is there any beta version documentation?

Thanks.
Xavier.

[Numpy-discussion] Re: [SciPy-dev] scipy core (Numeric3) win32 binaries to play with

From: Alan G I. <ai...@am...> - 2005-09-03 00:35:39

On Fri, 02 Sep 2005, Travis Oliphant apparently wrote:
> http://numeric.scipy.org/files/scipy_core-0.4.0.win32-py2.4.exe

So far so good.

Thanks!
Alan Isaac

[Numpy-discussion] scipy core (Numeric3) win32 binaries to play with

From: Travis O. <oli...@ee...> - 2005-09-02 23:54:07

  <http://www.scipy.org/download/misc/folder_contents>
If anybody has just been waiting for a windows binary to try out the new 
Numeric (scipy.base) you can download this.

from scipy.base import *   (replaces from Numeric import *)


The installer is here:

http://numeric.scipy.org/files/scipy_core-0.4.0.win32-py2.4.exe



Re: [Numpy-discussion] portability issue

From: <co...@ph...> - 2005-08-31 18:15:38

<pbt...@fr...> writes:

> hi !
>
> i try to transfer a pickle which contains numeric array, from a 64-bits
> system to a 32-bits system. it seems to fail due to bad (or lack of)
> conversion... more precisely, here is what i do on the 64-bits system :
>
> import Numeric,cPickle
> a=Numeric.array([1,2,3])
> f=open('test.pickle64','w')
> cPickle.dump(a,f)
> f.close()
>
> and here is what i try to do on the 32-bits system :
>
> import Numeric,cPickle
> f=open('test.pickle64','r')
> a=cPickle.load(f)
> f.close()
>
> and here is the log of the load :
>
>     a=cPickle.load(f)
>   File "/usr/lib/python2.3/site-packages/Numeric/Numeric.py", line 539, in
> array_constructor
>     x.shape = shape
> ValueError: ('total size of new array must be unchanged', <function
> array_constructor at 0x40a1002c>, ((3,), 'l',
> '\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00',True))
>
>
> Is there something to do to solve this difficulty ?

Specify the integer type with the number of bits.
Numeric.array([1,2,3]) will create an array with a typecode of 'l'
(Numeric.Int), which is the type that can hold Python ints (= C
longs). On your 64-bit system, it's a 64-bit integer; on the 32-bit
system, it's a 32-bit integer. So, on the 32-bit system, when reading the
pickle, it sees an array of type 'l', but there is too much data to
fill the array it expects.

The solution is to explicitly create your array using a typecode that
gives the size of the integer. Either:

a = Numeric.array([1,2,3], Numeric.Int32)

or

a = Numeric.array([1,2,3], Numeric.Int64)

I haven't checked this, but I would think that using Int32 is better
if all your numbers will fit in that. Using 64-bit integers would mean
the 32-bit machine would have to use 'long long' types to do its math,
which would be slower, while using 32-bit integers would mean the
64-bit machine would use 'int', which would still be fast for it.

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...
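The size mismatch can be seen by decoding the data string from the traceback both ways. This is a sketch using the standard-library `struct` module in present-day Python 3 rather than Numeric itself, and it assumes little-endian byte order (as on the x86 machines involved).

```python
import struct

# The 24-byte data string from the traceback in the original report.
payload = (b"\x01\x00\x00\x00\x00\x00\x00\x00"
           b"\x02\x00\x00\x00\x00\x00\x00\x00"
           b"\x03\x00\x00\x00\x00\x00\x00\x00")

# Read as 64-bit integers (what the writing machine meant): 3 items.
as_int64 = struct.unpack("<3q", payload)

# Read as 32-bit integers (what typecode 'l' means on the reading machine):
# 6 items, which cannot be reshaped to the pickled shape (3,).
as_int32 = struct.unpack("<6i", payload)

print(as_int64)   # (1, 2, 3)
print(as_int32)   # (1, 0, 2, 0, 3, 0)

# The native C long has a platform-dependent size; explicit sizes do not.
print(struct.calcsize("l"), struct.calcsize("<i"), struct.calcsize("<q"))
```

An explicitly sized typecode such as Int32 or Int64 plays the same role as the `<i` and `<q` formats here: the item size is fixed no matter which machine reads the data.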

[Numpy-discussion] portability issue

From: <pbt...@fr...> - 2005-08-31 07:07:25

hi !

i try to transfer a pickle which contains numeric array, from a 64-bits
system to a 32-bits system. it seems to fail due to bad (or lack of)
conversion... more precisely, here is what i do on the 64-bits system :

import Numeric,cPickle
a=Numeric.array([1,2,3])
f=open('test.pickle64','w')
cPickle.dump(a,f)
f.close()

and here is what i try to do on the 32-bits system :

import Numeric,cPickle
f=open('test.pickle64','r')
a=cPickle.load(f)
f.close()

and here is the log of the load :

    a=cPickle.load(f)
  File "/usr/lib/python2.3/site-packages/Numeric/Numeric.py", line 539, in
array_constructor
    x.shape = shape
ValueError: ('total size of new array must be unchanged', <function
array_constructor at 0x40a1002c>, ((3,), 'l',
'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00',True))


Is there something to do to solve this difficulty ?

thanks

PB

Re: [Numpy-discussion] Matching Nueric3/numarray namig conentions.

From: Colin J. W. <cj...@sy...> - 2005-08-30 12:20:51

Travis Oliphant wrote:

> Nadav Horesh wrote:
>
>> Just started to play with Numeric3, looks as a significant usability
>> improvement but....
>> Same functions/classes are named differently in numarray and Numeric3,
>> for instance typecodes.
>>  
>>
> This is true for only a few cases.  Mostly the names are compatible, but
> some of the naming conventions needed changing...
> For example:
>
> We have used type for the name of the data type in a numeric array.  But,
> this can be confusing because type refers to the kind of Python object 
> and all arrays are the same kind of python object.  In addition, it is 
> natural to use the type= keyword in array constructors, but this then 
> blocks the use of that builtin for the function it is used with.  Of 
> course typecode was previously chosen by Numeric, but now the types 
> are not codes (they are really type objects).  Thus,  I have been 
> calling type (dtype) in the new scipy.base.  The alternative is to 
> keep the name type (eliminate the use of typecode, and rename python's 
> type function to pytype within scipy).
>
[error] - this should have read:
These changes make sense (1) replacing type by dtype (dType?) and (2)
replacing typecode by dType an instance of a Numeric types class.

It would be good if, as suggested by  Nadav,  the first change could be
made to numarray.  He indicates that the naming of the new Numeric types 
classes is different from that used by numarray.  Is it necessary to 
change this?

> It could easily be changed if that is a real problem.    Because of 
> the signficantly different usage of types in the new system, it is 
> helpful to have a different name (dtype).    But, I could be persuaded 
> to use the word type and rename Python's type to pytype.

This, I suggest, would be a step back.

Is there any plan to make Win32 binary version available for testing?
Past efforts to compile have failed.

Colin W,



Re: [Numpy-discussion] Matching Nueric3/numarray namig conentions.

From: Colin J. W. <cj...@sy...> - 2005-08-30 12:10:39

Travis Oliphant wrote:

> Nadav Horesh wrote:
>
>> Just started to play with Numeric3, looks as a significant usability
>> improvement but....
>> Same functions/classes are named differently in numarray and Numeric3,
>> for instance typecodes.
>>  
>>
> This is true for only a few cases.  Mostly the names are compatible, but
> some of the naming conventions needed changing...
> For example:
>
> We have used type for the name of the data type in a numeric array.  But,
> this can be confusing because type refers to the kind of Python object 
> and all arrays are the same kind of python object.  In addition, it is 
> natural to use the type= keyword in array constructors, but this then 
> blocks the use of that builtin for the function it is used with.  Of 
> course typecode was previously chosen by Numeric, but now the types 
> are not codes (they are really type objects).  Thus,  I have been 
> calling type (dtype) in the new scipy.base.  The alternative is to 
> keep the name type (eliminate the use of typecode, and rename python's 
> type function to pytype within scipy).
>
These changes make sense (1) replacing type by dtype (dType?) and (2) 
replacing typecode by dType instances.

It would be good if, as suggested by  Nadav,  the first change could be 
made to numarray.

> It could easily be changed if that is a real problem.    Because of 
> the signficantly different usage of types in the new system, it is 
> helpful to have a different name (dtype).    But, I could be persuaded 
> to use the word type and rename Python's type to pytype.

This, I suggest, would be a step back.

Is there any plan to make Win32 binary version available for testing?  
Past efforts to compile have failed.

Colin W,

Re: [Numpy-discussion] Matching Nueric3/numarray namig conentions.

From: Nadav H. <Na...@Vi...> - 2005-08-30 07:59:59

I am not picky about which name to use. It would be the same to me
if Jay Miller added support for the dtype keyword and switched Int32
to int32 (or vice versa). In this case you both agree that types should
be classes (although Numeric3 types == type is better) and not strings.
Once there is agreement on the functions, methods and keywords (for
instance, should the arange function have a shape keyword?), the exact
choice of names should be an easy issue to overcome.

  Nadav.

Travis Oliphant wrote:

> Nadav Horesh wrote:
>
>> Just started to play with Numeric3, looks as a significant usability
>> improvement but....
>> Same functions/classes are named differently in numarray and Numeric3,
>> for instance typecodes.
>>  
>>
> This is true for only a few cases.  Mostly the names are compatible, but
> some of the naming conventions needed changing...
> For example:
>
> We have used type for the name of the data type in a numeric array.  But,
> this can be confusing because type refers to the kind of Python object
> and all arrays are the same kind of python object.  In addition, it is
> natural to use the type= keyword in array constructors, but this then
> blocks the use of that builtin for the function it is used with.  Of
> course typecode was previously chosen by Numeric, but now the types
> are not codes (they are really type objects).  Thus,  I have been
> calling type (dtype) in the new scipy.base.  The alternative is to
> keep the name type (eliminate the use of typecode, and rename python's
> type function to pytype within scipy).
>
> It could easily be changed if that is a real problem.    Because of
> the signficantly different usage of types in the new system, it is
> helpful to have a different name (dtype).    But, I could be persuaded
> to use the word type and rename Python's type to pytype.
>
> -Travis
>
>
>
>

Re: [Numpy-discussion] Matching Nueric3/numarray namig conentions.

From: Travis O. <oli...@ee...> - 2005-08-30 05:58:06

Nadav Horesh wrote:

>Just started to play with Numeric3, looks as a significant usability
>improvement but....
>Same functions/classes are named differently in numarray and Numeric3,
>for instance typecodes.
>  
>
This is true for only a few cases.  Mostly the names are compatible, but
some of the naming conventions needed changing... 

For example:

We have used type for the name of the data type in a numeric array.  But,
this can be confusing because type refers to the kind of Python object 
and all arrays are the same kind of python object.  In addition, it is 
natural to use the type= keyword in array constructors, but this then 
blocks the use of that builtin for the function it is used with.  Of 
course typecode was previously chosen by Numeric, but now the types are 
not codes (they are really type objects).  Thus,  I have been calling 
type (dtype) in the new scipy.base.  The alternative is to keep the name 
type (eliminate the use of typecode, and rename python's type function 
to pytype within scipy).

It could easily be changed if that is a real problem.    Because of the 
signficantly different usage of types in the new system, it is helpful 
to have a different name (dtype).    But, I could be persuaded to use 
the word type and rename Python's type to pytype.

-Travis
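The remark that a type= keyword "blocks the use of that builtin" can be shown in a few lines of plain Python. Both functions below are hypothetical and are not part of Numeric, numarray or scipy.base; they only contrast the two keyword names.

```python
def make_array_type_kw(values, type=None):
    """Hypothetical constructor using `type` as the keyword name."""
    # Here `type` is the argument (None by default), so the builtin
    # type() cannot be called under its usual name.
    try:
        kind = type(values).__name__
    except TypeError:
        kind = "builtin type() is shadowed"
    return kind

def make_array_dtype_kw(values, dtype=None):
    """Hypothetical constructor using `dtype` as the keyword name."""
    # The builtin is still available under its usual name.
    return type(values).__name__

print(make_array_type_kw([1, 2, 3]))    # builtin type() is shadowed
print(make_array_dtype_kw([1, 2, 3]))   # list
```

Renaming the keyword to dtype sidesteps the clash without touching Python's own type function.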

[Numpy-discussion] Matching Nueric3/numarray namig conentions.

From: Nadav H. <Na...@Vi...> - 2005-08-30 05:45:34

Just started to play with Numeric3; it looks like a significant usability
improvement, but...
The same functions/classes are named differently in numarray and Numeric3,
for instance typecodes.

I think that agreeing on the same names for identical functions/classes
would make users' lives easier, both for porting and for switching back
ends. I believe that it may help unify the two projects.

  Nadav.

[Numpy-discussion] bug in numarray?

From: Nicolas G. <gr...@as...> - 2005-08-29 21:53:30

Attachments: readcol2.py cross_name.py cross_name2.py

Hi,

I think there is a problem with numarray (not sure).

I'm trying to correlate two different files to find the same objects in
both. To do this I wrote some ugly software, and I'm using
readcol2.py to read each file into a numarray array, a numarray string
array, or a list.

cross_name.py does the cross-correlation when I'm using the
numarray string format. I'm using three parameters in different columns
and I compare all of them with something like:

numarray.all(a[i,:] == b[j,:])

I saw that my script is very, very slow or, to be more precise, becomes
slow. It seems OK at the beginning, but little by little it slows
down by a huge amount. I let it run all weekend and it found ~40 000
objects in common (both files are ~200000 lines...) in two days.
I changed the software to use Python lists, and in a few minutes
I had ~20 000 objects found in common. So I think there is a big
problem, probably: 1) in my script, perhaps 2) in numarray, or 3) in both.


I hope I have explained the problem clearly ...


N.

ps: I print output from the script cross_name.py to see the slowdown
visually. It appeared to become slow at around 700 objects in
common, but the decline is gradual.
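Independently of where the slowdown comes from, the nested row-by-row comparison can be replaced by a hash lookup, which takes one pass over each file. The sketch below is plain Python and is not the attached script; the helper name `cross_match` and the sample rows are made up, and it assumes the selected columns must match exactly as strings.

```python
def cross_match(rows_a, rows_b):
    """Return (i, j) index pairs where rows_a[i] equals rows_b[j].

    Each row is a sequence of strings (the selected columns). Each row of
    rows_b is matched at most once, in file order.
    """
    # Map each distinct key to the list of row indices in b that carry it.
    index_b = {}
    for j, row in enumerate(rows_b):
        index_b.setdefault(tuple(row), []).append(j)

    pairs = []
    for i, row in enumerate(rows_a):
        candidates = index_b.get(tuple(row))
        if candidates:
            pairs.append((i, candidates.pop(0)))
    return pairs

a = [["10.5", "-3.2", "17"], ["11.0", "4.1", "18"], ["12.3", "0.0", "19"]]
b = [["12.3", "0.0", "19"], ["99.9", "9.9", "20"], ["10.5", "-3.2", "17"]]
print(cross_match(a, b))   # [(0, 2), (2, 0)]
```

The returned index pairs can then be used to write out the matching lines from each file, and the inputs do not need to be sorted.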

[Numpy-discussion] bug in numarray?

From: Humufr <hu...@ya...> - 2005-08-29 19:17:00

Hi,

I think there is a problem with numarray (not sure).

I'm trying to correlate two different files to find the same objects in
both. To do this I wrote some ugly software, and I'm using
readcol2.py to read each file into a numarray array, a numarray string
array, or a list.

cross_name.py does the cross-correlation when I'm using the
numarray string format. I'm using three parameters in different columns
and I compare all of them with something like:

numarray.all(a[i,:] == b[j,:])

I saw that my script is very, very slow or, to be more precise, becomes
slow. It seems OK at the beginning, but little by little it slows
down by a huge amount. I let it run all weekend and it found ~40 000
objects in common (both files are ~200000 lines...) in two days.
I changed the software to use Python lists, and in a few minutes
I had ~20 000 objects found in common. So I think there is a big
problem, probably: 1) in my script, perhaps 2) in numarray, or 3) in both.


I hope I have explained the problem clearly ...


N.

ps: I print output from the script cross_name.py to see the slowdown
visually. It appeared to become slow at around 700 objects in
common, but the decline is gradual.
pps: I attach the different files I used. cross_name.py is the script
with the problem.


-------------------------------------
#readcol2.py
-------------------------------------
def readcol(fname,comments='%',columns=None,delimiter=None,dep=0,arraytype='list'):
     """
     Load ASCII data from fname into an array and return the array.

     The data must be regular: the same number of values in every row.

     Input:

     - fname : the name of the file to read

     Optional input:

     - comments : the character that starts a comment;
                  the default is the matlab character '%'.
     - columns : list or tuple of the (1-based) columns to use.
     - delimiter : a string that delimits the columns
     - dep : an integer giving the number of leading lines to skip
             (useful to skip the description lines)
     - arraytype : a string giving the kind of array to return:
                   numeric array ('numeric'), character array
                   ('numstring') or list ('list'). The default is 'list'.

     matfile data is not currently supported, but see
     Nigel Wade's matfile ftp://ion.le.ac.uk/matfile/matfile.tar.gz

     Example usage:

     x,y = transpose(readcol('test.dat'))  # data in two columns

     X = readcol('test.dat')    # a matrix of data

     x = readcol('test.dat')    # a single column of data

     x = readcol('test.dat','#') # use '#' as the comment character

     Initial function from pylab, adapted for my needs.

     """
     from numarray import array,transpose


     fh = file(fname)

     X = []
     numCols = None
     nline = 0
     if columns is None:
         for line in fh:
             nline += 1
             if dep is not None and nline <= dep: continue
             line = line[:line.find(comments)].strip()
             if not len(line): continue
             if arraytype=='numeric':
                 row = [float(val) for val in line.split(delimiter)]
             else:
                 row = [val.strip() for val in line.split(delimiter)]
             thisLen = len(row)
             if numCols is not None and thisLen != numCols:
                 raise ValueError('All rows must have the same number of columns')
             X.append(row)
     else:
         for line in fh:
             nline +=1
             if dep is not None and nline <= dep: continue
             line = line[:line.find(comments)].strip()
             if not len(line): continue
             row = line.split(delimiter)
             if arraytype=='numeric':
                 row = [float(row[i-1]) for i in columns]
             elif arraytype=='numstring':
                 row = [row[i-1].strip() for i in columns]
             else:
                 row = [row[i-1].strip() for i in columns]
             thisLen = len(row)

             if numCols is not None and thisLen != numCols:
                 raise ValueError('All rows must have the same number of columns')
             X.append(row)

     if arraytype=='numeric':
         X = array(X)
         r,c = X.shape
         if r==1 or c==1:
             X.shape = max([r,c]),
     elif arraytype == 'numstring':
         import numarray.strings               # problem if Numeric + pylab
         X = numarray.strings.array(X)
         r,c = X.shape
         if r==1 or c==1:
             X.shape = max([r,c]),

     return X


----------------------------------------------------------------
#cross_name.py

----------------------------------------------------------------

#!/usr/bin/env python

'''
    Software to cross-correlate two files. To use it you have to fill in a
    params file that contains the information on the files you want to
    correlate. The information must have the format:
       namefile = list of columns ; delimiter

    example:
       file1 = 1,2,3 ;
       file2 = 20,19,21 ; ,

    no delimiter = blank
'''

# There is a big efficiency problem: the software takes far too long
# with big files like SDSS.
# I have to find where the problem is.

import sys
import numarray
import string

#read the params file
params = {}
for line in file(sys.argv[1],'rU'):
     line = line.strip()         # delete the end of line (\n on unix)
     if not len(line): continue  # if the line is empty, skip to the next line
     if line.startswith('#'): continue # skip comment lines (# is the comment character)
     tup = line.split('=',1)     # split the line at the first '=' sign
     # create a list that contains the columns we want to use
     columns = [int(i) for i in tup[1].strip().split(';')[0].strip().split(',')]
     # check the delimiter of the data file (generally space or comma)
     delimiter = tup[1].strip().split(';')[1].strip()
     if not len(delimiter): delimiter = None
     params[tup[0].strip()] = { 'columns' : columns, 'delimiter' : delimiter}

# Read the data files (only the columns asked for in the params file)
debut_data = 1
data = []
for namefile in params.iterkeys():
     import readcol2  #import the function to read the files
     #data.append(readcol2.readcol(namefile,columns=params[namefile]['columns'],comments='#',delimiter=params[namefile]['delimiter'],dep=1,arraytype='character'))
     params[namefile]['data'] = readcol2.readcol(namefile,columns=params[namefile]['columns'],comments='#',delimiter=params[namefile]['delimiter'],dep=debut_data,arraytype='character')


# Read the data files again to get all the lines!
# Question: since params is a dictionary, are we sure that the files are in
# the same order? Check it!
if len(params.keys()) == 2:
     namefile,data,delimiter = [],[],[]
     for keys in params.iterkeys():
         namefile.append(keys)
         data.append(params[keys]['data'])
         delim = params[keys]['delimiter']
         if delim != None:
             delimiter.append(params[keys]['delimiter'])
         else:
             delimiter.append('   ')
     #res_a = []
     #res_b = []

     f1_ini = file(namefile[0]).readlines()[debut_data:]
     f2_ini = file(namefile[1]).readlines()[debut_data:]

     #f1_ini = [line for line in file(namefile[0])][debut_data:]
     #f2_ini = [line for line in file(namefile[1])][debut_data:]

     f1=open('cross'+namefile[0],'w')
     f2=open('cross'+namefile[1],'w')
     f3=open('pastecross'+namefile[0]+namefile[1],'w')

     b_i = 0
     for a_i in range(data[0].shape[0]):
         for b_i in range(b_i,data[1].shape[0]):
             if numarray.all(data[0][a_i,:] == data[1][b_i,:]):
                 f1.write(f1_ini[a_i])
                 f2.write(f2_ini[b_i])
                 f3.write(f1_ini[a_i].strip()+delimiter[0]+' '+string.replace(f2_ini[b_i],delimiter[1],delimiter[0]))
                 del f2_ini[b_i]
                 break
                 #res_a.append(a_i)
                 #res_b.append(b_i)
     f1.close()
     f2.close()
     f3.close()
else:
     print "too many files: only two are allowed for the moment"



#save the results in 3 files: 2 with the common objects from each file.
# one with a paste of the lines of the 2 initial files.
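
The params-file format described in the docstring above (namefile = list of columns ; delimiter) can be parsed one line at a time. The following is only a minimal sketch under that description: parse_params_line is a hypothetical helper name, not part of the posted script, and unlike the script it tolerates a missing ';' instead of raising an error.

```python
def parse_params_line(line):
    """Parse one 'namefile = columns ; delimiter' line.

    Returns (namefile, columns, delimiter), or None for blank and
    comment lines. Hypothetical helper, written for illustration only.
    """
    line = line.strip()
    if not line or line.startswith('#'):
        return None
    name, spec = line.split('=', 1)
    # Everything before ';' is the column list, everything after it
    # is the delimiter of the data file.
    column_part, _, delimiter_part = spec.partition(';')
    columns = [int(field) for field in column_part.split(',')]
    # An empty delimiter means "blank", i.e. whitespace-separated data.
    delimiter = delimiter_part.strip() or None
    return name.strip(), columns, delimiter


first = parse_params_line("file1 = 1,2,3 ;")
second = parse_params_line("file2 = 20,19,21 ; ,")
```

With the two example lines from the docstring, first carries a delimiter of None (blank-separated) and second carries ','.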

-----------------------------------------------------------------------

#cross_name2.py

---------------------------------------------------------------------
#!/usr/bin/env python

'''
    Software to cross-correlate two files. To use it, you have to fill in a
    params file that contains information about the files you want to
    correlate.
    The information must have the format:
       namefile = list of columns ; delimiter

    example:
       file1 = 1,2,3 ;
       file2 = 20,19,21 ; ,

    no delimiter = blank
'''

# There is a big efficiency problem: the software is far too slow
# with big files like SDSS.
# I have to find where the problem is.

import sys
import numarray
import string

#read the params file
params = {}
for line in file(sys.argv[1],'rU'):
     line = line.strip()         # remove the end of line (\n on unix)
     if not len(line): continue  # if the line is empty, skip to the next line
     if line.startswith('#'): continue # skip comment lines (# marks a comment)
     tup = line.split('=',1)     # split the line; the delimiter is the = sign
     # create a list containing the columns we want to use
     columns = [int(i) for i in tup[1].strip().split(';')[0].strip().split(',')]
     # get the delimiter of the data file (generally space or comma)
     delimiter = tup[1].strip().split(';')[1].strip()
     if not len(delimiter): delimiter = None
     params[tup[0].strip()] = { 'columns' : columns, 'delimiter' : delimiter}

# Read the data files (only the columns requested in the params file)
debut_data = 1
data = []
for namefile in params.iterkeys():
     import readcol2  # import the function to read the files
     #data.append(readcol2.readcol(namefile,columns=params[namefile]['columns'],comments='#',delimiter=params[namefile]['delimiter'],dep=1,arraytype='character'))
     params[namefile]['data'] = readcol2.readcol(
         namefile, columns=params[namefile]['columns'], comments='#',
         delimiter=params[namefile]['delimiter'], dep=debut_data,
         arraytype='list')


# Read the data files again to get all the lines!
# Question: since params is a dictionary, are we sure that the files are in
# the same order? Check it!
if len(params.keys()) == 2:
     namefile,data,delimiter = [],[],[]
     for keys in params.iterkeys():
         namefile.append(keys)
         data.append(params[keys]['data'])
         delim = params[keys]['delimiter']
         if delim != None:
             delimiter.append(params[keys]['delimiter'])
         else:
             delimiter.append('   ')
     #res_a = []
     #res_b = []

     f1_ini = file(namefile[0]).readlines()[debut_data:]
     f2_ini = file(namefile[1]).readlines()[debut_data:]

     #f1_ini = [line for line in file(namefile[0])][debut_data:]
     #f2_ini = [line for line in file(namefile[1])][debut_data:]

     f1=open('cross'+namefile[0],'w')
     f2=open('cross'+namefile[1],'w')
     f3=open('pastecross'+namefile[0]+namefile[1],'w')

#     i=0
#     for a_i in range(len(data[0])):
#     	#print data[0][a_i,:]
#     	for b_i in range(len(data[1])):
# 		if data[0][a_i] == data[1][b_i]:
# 			print data[0][a_i],data[1][b_i]
# 			i+=1
# 			print i
# 			break
     b_i=0
     for a_i in range(len(data[0])):
         for b_i in range(b_i,len(data[1])):
             if data[0][a_i] == data[1][b_i]:
                 f1.write(f1_ini[a_i])
                 f2.write(f2_ini[b_i])
                 f3.write(f1_ini[a_i].strip()+delimiter[0]+' '+string.replace(f2_ini[b_i],delimiter[1],delimiter[0]))
                 del f2_ini[b_i]
                 break
                 #res_a.append(a_i)
                 #res_b.append(b_i)
     f1.close()
     f2.close()
     f3.close()
else:
     print "too many files: only two are allowed for the moment"



#save the results in 3 files: 2 with the common objects from each file.
# one with a paste of the lines of the 2 initial files.
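
The efficiency problem mentioned in the script comments is likely tied to the nested loop, which compares rows of the two files pair by pair. One common alternative is to index the second file by key in a dictionary and then look up each row of the first file. The sketch below is not the poster's algorithm: cross_match is a hypothetical name, it assumes the selected columns of each row are available as a sequence of strings, and it does not rely on the two files being in the same order.

```python
from collections import defaultdict, deque


def cross_match(keys_a, keys_b):
    """Pair up rows of two key lists that compare equal.

    Returns a list of (index_a, index_b) tuples. Each row of keys_b is
    matched at most once, earliest unused occurrence first.
    """
    # Index the second file once: key -> queue of row numbers with that key.
    positions = defaultdict(deque)
    for index_b, key in enumerate(keys_b):
        positions[tuple(key)].append(index_b)

    matches = []
    for index_a, key in enumerate(keys_a):
        # .get() avoids creating empty entries for keys that never match.
        candidates = positions.get(tuple(key))
        if candidates:
            matches.append((index_a, candidates.popleft()))
    return matches


# Made-up rows, for illustration only.
rows_a = [['10', 'a'], ['11', 'b'], ['12', 'c']]
rows_b = [['12', 'c'], ['10', 'a'], ['99', 'z']]
pairs = cross_match(rows_a, rows_b)
```

The returned index pairs can then be used to pick the matching lines from the two lists of raw lines when writing the output files. Each file is traversed once, so the work grows with the sum of the file lengths rather than their product.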

Re: [Numpy-discussion] is this a bug?

From: Todd M. <jm...@st...> - 2005-08-26 19:14:19

On Fri, 2005-08-26 at 13:24, Stefan Kuzminski wrote:
> >>> from numarray import *
> >>> x = ones(22400,Float)
> >>> print add.reduce(x)
> 22400.0
> >>> print add.reduce(x!=0)
> -128
> >>> print add.reduce((x!=0).astype(Int))
> 22400
> 
> it seems like the boolean result of the expression ( middle try )
> causes a problem?

This issue has been discussed before and the general consensus was that
this (somewhat treacherous) behavior should not change.

For array totals (reducing on all axes at once),  numarray has a sum()
method which by default does do a type promotion to the "max type of
kind", so integers -> Int64,  floats -> Float64, and complexes ->
Complex64 prior to the reduction.

Regards,
Todd
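
The -128 in the quoted session is what an 8-bit signed accumulator would produce: 22400 = 87 * 256 + 128, and 128 wraps around to -128. The sketch below simulates this in plain Python. That numarray accumulates the Bool reduction in a signed 8-bit type is an assumption consistent with the output above, not something stated in this thread, and the helper names are hypothetical.

```python
def wrap_int8(value):
    """Fold an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


def add_reduce_int8(values):
    """Sum values while keeping the running total in 8 bits."""
    total = 0
    for value in values:
        total = wrap_int8(total + value)
    return total


mask = [1] * 22400                     # stands in for (x != 0)
narrow_total = add_reduce_int8(mask)   # accumulate without promotion
wide_total = sum(mask)                 # accumulate in a wide integer
```

Here narrow_total comes out as -128 and wide_total as 22400, mirroring the difference between reducing the boolean array directly and reducing after promotion to a wider integer type.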

[Numpy-discussion] is this a bug?

From: Stefan K. <pon...@ya...> - 2005-08-26 17:25:07

>>> from numarray import *
>>> x = ones(22400,Float)
>>> print add.reduce(x)
22400.0
>>> print add.reduce(x!=0)
-128
>>> print add.reduce((x!=0).astype(Int))
22400

it seems like the boolean result of the expression ( middle try )
causes a problem?

thanks,
Stefan Kuzminski


		

258 messages has been excluded from this view by a project administrator.
