|
From: <sk...@po...> - 2005-09-11 15:25:27
|
Joost> is it possible to use compression while storing
Joost> numarray/Numeric objects?
Try the gzip or bz2 modules. Both have file-like objects that transparently
(de)compress data as it is read or written.
Joost> Ideally the compression also works with memmory mapped arrays.
Dunno, but probably not. You'll have to experiment.
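A minimal sketch of the file-like approach, assuming modern Python 3 and using the standard-library array type as a stand-in for a numarray/Numeric array. The helper names write_compressed and read_compressed are made up for illustration. Writing slice by slice avoids building one huge string for the whole array:

```python
import gzip
from array import array


def write_compressed(values, path, chunk_items=65536):
    """Write an array of doubles to a gzip file one slice at a time."""
    with gzip.open(path, "wb") as fh:
        for start in range(0, len(values), chunk_items):
            # Only one slice is converted to bytes at any moment.
            fh.write(values[start:start + chunk_items].tobytes())


def read_compressed(path, typecode="d"):
    """Read the whole gzip file back into an array of the given typecode."""
    values = array(typecode)
    with gzip.open(path, "rb") as fh:
        values.frombytes(fh.read())
    return values
```

The reader above still loads everything at once; reading in fixed-size pieces with fh.read(n) would bound memory on that side too.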
Skip
|
|
From: Warren F. <fo...@sl...> - 2005-09-09 20:55:36
|
You may be able to avoid the tostring() overhead by using tofile():
s.tofile(gzip.open('compressed.dat', 'wb'))
You are probably SOL on the mmapping, though.
w
On Fri, 9 Sep 2005, Joost van Evert wrote:
> On Fri, 2005-09-09 at 15:06 -0500, John Hunter wrote:
> > >>>>> "Joost" == Joost van Evert <ph...@gm...> writes:
> >
> > Joost> is it possible to use compression while storing
> > Joost> numarray/Numeric objects?
> >
> >
> > Sure
> >
> > In [35]: s = rand(10000)
> >
> > In [36]: file('uncompressed.dat', 'wb').write(s.tostring())
> >
> > In [37]: ls -l uncompressed.dat
> > -rw-r--r-- 1 jdhunter jdhunter 80000 2005-09-09 15:04 uncompressed.dat
> >
> > In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())
> >
> > In [39]: ls -l compressed.dat
> > -rw-r--r-- 1 jdhunter jdhunter 41393 2005-09-09 15:04 compressed.dat
> >
> Thanks, this helps me, but I think not enough, because the arrays I work
> on are sometimes >1Gb(Correlation matrices). The tostring method would
> explode the size, and result in a lot of swapping. Ideally the
> compression also works with memmory mapped arrays.
>
> Greets,
>
> Joost
>
>
>
> -------------------------------------------------------
> SF.Net email is Sponsored by the Better Software Conference & EXPO
> September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
> Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
> Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
> _______________________________________________
> Numpy-discussion mailing list
> Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>
|
|
From: Perry G. <pe...@st...> - 2005-09-09 20:52:13
|
On Sep 9, 2005, at 4:41 PM, Joost van Evert wrote:
> On Fri, 2005-09-09 at 15:06 -0500, John Hunter wrote:
>>>>>>> "Joost" == Joost van Evert <ph...@gm...> writes:
>>
>> Joost> is it possible to use compression while storing
>> Joost> numarray/Numeric objects?
>>
>>
>> Sure
>>
>> In [35]: s = rand(10000)
>>
>> In [36]: file('uncompressed.dat', 'wb').write(s.tostring())
>>
>> In [37]: ls -l uncompressed.dat
>> -rw-r--r-- 1 jdhunter jdhunter 80000 2005-09-09 15:04
>> uncompressed.dat
>>
>> In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())
>>
>> In [39]: ls -l compressed.dat
>> -rw-r--r-- 1 jdhunter jdhunter 41393 2005-09-09 15:04
>> compressed.dat
>>
> Thanks, this helps me, but I think not enough, because the arrays I
> work
> on are sometimes >1Gb(Correlation matrices). The tostring method would
> explode the size, and result in a lot of swapping. Ideally the
> compression also works with memmory mapped arrays.
>
Well, it seems to me that you are asking for quite a lot if you expect
it to work with memory-mapped arrays that are compressed (I'm assuming
you mean that individual values are decompressed on the fly as they are
needed). This is something that we gave some thought to a few years
ago, but it seemed that supporting such capabilities was far too
complicated, at least for now. Besides some operations are bound to
blow up (e.g., take on a compressed array).
But I'm still not sure what you are trying to do and what you would
like to see happen underneath. An example would do a lot to explain
what your needs are.
Thanks, Perry Greenfield
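To make the "decompressed on the fly" idea concrete, here is a rough sketch, not anything numarray provides: the data is cut into blocks that are compressed independently, so reading one value only decompresses the block that holds it. The class name BlockCompressedDoubles and the block size are assumptions for illustration, it lives in memory rather than in a memory-mapped file, and it is read-only, which hints at why general array operations on such a layout get complicated.

```python
import struct
import zlib


class BlockCompressedDoubles:
    """Read-only sequence of doubles stored as independently compressed blocks."""

    def __init__(self, values, block_items=1024):
        self.length = len(values)
        self.block_items = block_items
        self.blocks = []
        for start in range(0, self.length, block_items):
            chunk = values[start:start + block_items]
            raw = struct.pack("<%dd" % len(chunk), *chunk)
            self.blocks.append(zlib.compress(raw))

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        block_no, offset = divmod(index, self.block_items)
        # Decompress only the block that contains the requested value.
        raw = zlib.decompress(self.blocks[block_no])
        return struct.unpack_from("<d", raw, offset * 8)[0]
```

Every lookup pays for decompressing a whole block, so the block size trades compression ratio against access cost.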
|
|
From: Joost v. E. <ph...@gm...> - 2005-09-09 20:28:58
|
On Fri, 2005-09-09 at 15:06 -0500, John Hunter wrote:
> >>>>> "Joost" == Joost van Evert <ph...@gm...> writes:
>
> Joost> is it possible to use compression while storing
> Joost> numarray/Numeric objects?
>
>
> Sure
>
> In [35]: s = rand(10000)
>
> In [36]: file('uncompressed.dat', 'wb').write(s.tostring())
>
> In [37]: ls -l uncompressed.dat
> -rw-r--r-- 1 jdhunter jdhunter 80000 2005-09-09 15:04 uncompressed.dat
>
> In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())
>
> In [39]: ls -l compressed.dat
> -rw-r--r-- 1 jdhunter jdhunter 41393 2005-09-09 15:04 compressed.dat
>
Thanks, this helps me, but I think not enough, because the arrays I work
on are sometimes >1 GB (correlation matrices). The tostring method would
explode the size, and result in a lot of swapping. Ideally the
compression also works with memory-mapped arrays.
Greets,
Joost
|
|
From: John H. <jdh...@ac...> - 2005-09-09 20:07:48
|
>>>>> "Joost" == Joost van Evert <ph...@gm...> writes:
Joost> is it possible to use compression while storing
Joost> numarray/Numeric objects?
Sure
In [35]: s = rand(10000)
In [36]: file('uncompressed.dat', 'wb').write(s.tostring())
In [37]: ls -l uncompressed.dat
-rw-r--r-- 1 jdhunter jdhunter 80000 2005-09-09 15:04 uncompressed.dat
In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())
In [39]: ls -l compressed.dat
-rw-r--r-- 1 jdhunter jdhunter 41393 2005-09-09 15:04 compressed.dat
Compression ratio for more regular data will be better.
JDH
|
|
From: Joost v. E. <ph...@gm...> - 2005-09-09 20:01:22
|
|
From: Daniel S. <she...@un...> - 2005-09-07 20:40:25
|
The question was answered yesterday and that was the answer. Thanks

On Wed, 07 Sep 2005 15:09:04 +1200 Greg Ewing <gre...@ca...> wrote:
> Daniel Sheltraw wrote:
>
>> blk = fromstring(f_fid.read(BLOCK_LEN),
>> num_type).byteswapped().astype(Float32).tostring()
>>
>> The error I get is:
>>
>> ValueError: string size must be a multiple of element size
>
> Did you open the file in binary mode?
>
> --
> Greg Ewing, Computer Science Dept, University of Canterbury,
> Christchurch, New Zealand
> A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc.
> gre...@ca...
|
|
From: Greg E. <gre...@ca...> - 2005-09-07 03:09:13
|
Daniel Sheltraw wrote:
> blk = fromstring(f_fid.read(BLOCK_LEN),
> num_type).byteswapped().astype(Float32).tostring()
>
> The error I get is:
>
> ValueError: string size must be a multiple of element size

Did you open the file in binary mode?

--
Greg Ewing, Computer Science Dept, University of Canterbury,
Christchurch, New Zealand
A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc.
gre...@ca...
|
|
From: Robert K. <rk...@uc...> - 2005-09-06 18:49:17
|
Daniel Sheltraw wrote:
> Hello NumPy Listees
>
> I am trying to port some code to Windows that works fine under Linux.
> The offending line is:
>
> blk = fromstring(f_fid.read(BLOCK_LEN),
> num_type).byteswapped().astype(Float32).tostring()
>
> The error I get is:
>
> ValueError: string size must be a multiple of element size
>
> Does anyone have an idea where the problem might be? BLOCK_LEN is
> specified in bytes and num_type is Int32.

Is f_fid opened in binary mode?

f_fid = open(filename, 'rb')

It should be.

--
Robert Kern
rk...@uc...

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
|
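Why binary mode matters here: on Windows, a file opened in text mode has its line endings translated, and Python 2 also stops at a Ctrl-Z byte, so read(BLOCK_LEN) can return a byte count that is no longer a multiple of 4. The sketch below is a standard-library stand-in for the fromstring/byteswapped/astype chain, written for modern Python 3. The function name and the file_byteorder parameter are assumptions for illustration, and the size check makes the failure explicit:

```python
import struct


def read_int32_block_as_float32(fh, block_len, file_byteorder=">"):
    """Read block_len bytes of 32-bit ints and return native float32 bytes."""
    raw = fh.read(block_len)
    if len(raw) % 4 != 0:
        # A short or translated read is the usual sign of text mode.
        raise ValueError("read %d bytes, not a multiple of 4" % len(raw))
    count = len(raw) // 4
    ints = struct.unpack("%s%di" % (file_byteorder, count), raw)
    return struct.pack("=%df" % count, *(float(v) for v in ints))
```

The file object passed in should come from open(filename, 'rb'), as suggested above.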
|
From: Daniel S. <she...@un...> - 2005-09-06 18:44:30
|
Hello NumPy Listees
I am trying to port some code to Windows that works fine
under Linux. The offending line
is:
blk = fromstring(f_fid.read(BLOCK_LEN),
num_type).byteswapped().astype(Float32).tostring()
The error I get is:
ValueError: string size must be a multiple of element
size
Does anyone have an idea where the problem might be?
BLOCK_LEN is specified in bytes
and num_type is Int32.
Thanks,
Daniel
|
|
From: Xavier G. <gn...@ob...> - 2005-09-05 09:01:21
|
Alan G Isaac wrote:
>On Fri, 02 Sep 2005, Travis Oliphant apparently wrote:
>
>>http://numeric.scipy.org/files/scipy_core-0.4.0.win32-py2.4.exe
>
>So far so good.
>
>Thanks!
>Alan Isaac
>
>_______________________________________________
>SciPy-user mailing list
>Sci...@sc...
>http://www.scipy.net/mailman/listinfo/scipy-user

Hi,

That's great news! :)
Where are the sources corresponding with this windows release (I would like to test that under linux asap)?
Is there any beta version documentation?

Thanks.
Xavier.
|
|
From: Alan G I. <ai...@am...> - 2005-09-03 00:35:39
|
On Fri, 02 Sep 2005, Travis Oliphant apparently wrote:
> http://numeric.scipy.org/files/scipy_core-0.4.0.win32-py2.4.exe

So far so good.

Thanks!
Alan Isaac
|
|
From: Travis O. <oli...@ee...> - 2005-09-02 23:54:07
|
If anybody has just been waiting for a windows binary to try out the new Numeric (scipy.base) you can download this.

from scipy.base import * (replaces from Numeric import *)

The installer is here:

http://numeric.scipy.org/files/scipy_core-0.4.0.win32-py2.4.exe

<http://www.scipy.org/download/misc/folder_contents>
|
|
From: <co...@ph...> - 2005-08-31 18:15:38
|
<pbt...@fr...> writes:
> hi !
>
> i try to transfer a pickle which contains numeric array, from a 64-bits
> system to a 32-bits system. it seems to fail due to bad (or lack of)
> conversion... more precisely, here is what i do on the 64-bits system :
>
> import Numeric,cPickle
> a=Numeric.array([1,2,3])
> f=open('test.pickle64','w')
> cPickle.dump(a,f)
> f.close()
>
> and here is what i try to do on the 32-bits system :
>
> import Numeric,cPickle
> f=open('test.pickle64','r')
> a=cPickle.load(f)
> f.close()
>
> and here is the log of the load :
>
> a=cPickle.load(f)
> File "/usr/lib/python2.3/site-packages/Numeric/Numeric.py", line 539, in
> array_constructor
> x.shape = shape
> ValueError: ('total size of new array must be unchanged', <function
> array_constructor at 0x40a1002c>, ((3,), 'l',
> '\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00',True))
>
>
> Is there something to do to solve this difficulty ?
Specify the integer type with the number of bits.
Numeric.array([1,2,3]) will create an array with a typecode of 'l'
(Numeric.Int), which is the type that can hold Python ints (= C
longs). On your 64-bit system, it's a 64-bit integer; on the 32-bit
system, it's a 32-bit integer. So, on the 32-bit system, when reading
the pickle, it sees an array of type 'l', but there is too much data
to fill the array it expects.
The solution is to explicitly create your array using a typecode that
gives the size of the integer. Either:
a = Numeric.array([1,2,3], Numeric.Int32)
or
a = Numeric.array([1,2,3], Numeric.Int64)
I haven't checked this, but I would think that using Int32 is better
if all your numbers will fit in that. Using 64-bit integers would mean
the 32-bit machine would have to use 'long long' types to do its math,
which would be slower, while using 32-bit integers would mean the
64-bit machine would use 'int', which would still be fast for it.
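The size mismatch can be seen without Numeric at all. This small sketch uses the standard-library struct module on the 24 bytes shown in the traceback (little-endian is assumed from the byte pattern): read as 64-bit integers they give the three expected values, while read as 32-bit integers they give six, which cannot fit a shape of (3,).

```python
import struct

# The raw data string from the traceback above: 24 bytes.
payload = (b"\x01\x00\x00\x00\x00\x00\x00\x00"
           b"\x02\x00\x00\x00\x00\x00\x00\x00"
           b"\x03\x00\x00\x00\x00\x00\x00\x00")

# What the 64-bit writer meant: three 8-byte integers.
as_int64 = struct.unpack("<3q", payload)

# What a 32-bit reader sees for typecode 'l': six 4-byte integers.
as_int32 = struct.unpack("<6i", payload)

print(as_int64)  # (1, 2, 3)
print(as_int32)  # (1, 0, 2, 0, 3, 0)
```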
--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...
|
|
From: <pbt...@fr...> - 2005-08-31 07:07:25
|
hi !
i try to transfer a pickle which contains numeric array, from a 64-bits
system to a 32-bits system. it seems to fail due to bad (or lack of)
conversion... more precisely, here is what i do on the 64-bits system :
import Numeric,cPickle
a=Numeric.array([1,2,3])
f=open('test.pickle64','w')
cPickle.dump(a,f)
f.close()
and here is what i try to do on the 32-bits system :
import Numeric,cPickle
f=open('test.pickle64','r')
a=cPickle.load(f)
f.close()
and here is the log of the load :
a=cPickle.load(f)
File "/usr/lib/python2.3/site-packages/Numeric/Numeric.py", line 539, in
array_constructor
x.shape = shape
ValueError: ('total size of new array must be unchanged', <function
array_constructor at 0x40a1002c>, ((3,), 'l',
'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00',True))
Is there something to do to solve this difficulty ?
thanks
PB
|
|
From: Colin J. W. <cj...@sy...> - 2005-08-30 12:20:51
|
Travis Oliphant wrote:
> Nadav Horesh wrote:
>
>> Just started to play with Numeric3, looks as a significant usability
>> improvement but....
>> Same functions/classes are named differently in numarray and Numeric3,
>> for instance typecodes.
>
> This is true for only a few cases. Mostly the names are compatible, but
> some of the naming conventions needed changing...
> For example:
>
> We have used type for the name of the data type in a numeric array. But,
> this can be confusing because type refers to the kind of Python object
> and all arrays are the same kind of python object. In addition, it is
> natural to use the type= keyword in array constructors, but this then
> blocks the use of that builtin for the function it is used with. Of
> course typecode was previously chosen by Numeric, but now the types
> are not codes (they are really type objects). Thus, I have been
> calling type (dtype) in the new scipy.base. The alternative is to
> keep the name type (eliminate the use of typecode, and rename python's
> type function to pytype within scipy).

[error] - this should have read:

These changes make sense (1) replacing type by dtype (dType?) and (2) replacing typecode by dType, an instance of a Numeric types class.

It would be good if, as suggested by Nadav, the first change could be made to numarray. He indicates that the naming of the new Numeric types classes is different from that used by numarray. Is it necessary to change this?

> It could easily be changed if that is a real problem. Because of
> the signficantly different usage of types in the new system, it is
> helpful to have a different name (dtype). But, I could be persuaded
> to use the word type and rename Python's type to pytype.

This, I suggest, would be a step back.

Is there any plan to make Win32 binary version available for testing? Past efforts to compile have failed.

Colin W,
|
|
From: Colin J. W. <cj...@sy...> - 2005-08-30 12:10:39
|
Travis Oliphant wrote:
> Nadav Horesh wrote:
>
>> Just started to play with Numeric3, looks as a significant usability
>> improvement but....
>> Same functions/classes are named differently in numarray and Numeric3,
>> for instance typecodes.
>
> This is true for only a few cases. Mostly the names are compatible, but
> some of the naming conventions needed changing...
> For example:
>
> We have used type for the name of the data type in a numeric array. But,
> this can be confusing because type refers to the kind of Python object
> and all arrays are the same kind of python object. In addition, it is
> natural to use the type= keyword in array constructors, but this then
> blocks the use of that builtin for the function it is used with. Of
> course typecode was previously chosen by Numeric, but now the types
> are not codes (they are really type objects). Thus, I have been
> calling type (dtype) in the new scipy.base. The alternative is to
> keep the name type (eliminate the use of typecode, and rename python's
> type function to pytype within scipy).

These changes make sense (1) replacing type by dtype (dType?) and (2) replacing typecode by dType instances.

It would be good if, as suggested by Nadav, the first change could be made to numarray.

> It could easily be changed if that is a real problem. Because of
> the signficantly different usage of types in the new system, it is
> helpful to have a different name (dtype). But, I could be persuaded
> to use the word type and rename Python's type to pytype.

This, I suggest, would be a step back.

Is there any plan to make Win32 binary version available for testing? Past efforts to compile have failed.

Colin W,
|
|
From: Nadav H. <Na...@Vi...> - 2005-08-30 07:59:59
|
I am not picky about which name to use. It would be the same for me if Jay Miller would add support for the dtype keyword, and switch Int32 for int32 (or vice versa). In this case you both agree that types should be classes (although Numeric3 types == type is better) and not strings. Once there is an agreement on the functions, methods and keywords (for instance, should the arange function have a shape keyword), the exact choice of names should be an easy issue to overcome.

Nadav.

Travis Oliphant wrote:
> Nadav Horesh wrote:
>
>> Just started to play with Numeric3, looks as a significant usability
>> improvement but....
>> Same functions/classes are named differently in numarray and Numeric3,
>> for instance typecodes.
>
> This is true for only a few cases. Mostly the names are compatible, but
> some of the naming conventions needed changing...
> For example:
>
> We have used type for the name of the data type in a numeric array. But,
> this can be confusing because type refers to the kind of Python object
> and all arrays are the same kind of python object. In addition, it is
> natural to use the type= keyword in array constructors, but this then
> blocks the use of that builtin for the function it is used with. Of
> course typecode was previously chosen by Numeric, but now the types
> are not codes (they are really type objects). Thus, I have been
> calling type (dtype) in the new scipy.base. The alternative is to
> keep the name type (eliminate the use of typecode, and rename python's
> type function to pytype within scipy).
>
> It could easily be changed if that is a real problem. Because of
> the signficantly different usage of types in the new system, it is
> helpful to have a different name (dtype). But, I could be persuaded
> to use the word type and rename Python's type to pytype.
>
> -Travis
|
|
From: Travis O. <oli...@ee...> - 2005-08-30 05:58:06
|
Nadav Horesh wrote:
>Just started to play with Numeric3, looks as a significant usability
>improvement but....
>Same functions/classes are named differently in numarray and Numeric3,
>for instance typecodes.

This is true for only a few cases. Mostly the names are compatible, but some of the naming conventions needed changing...

For example:

We have used type for the name of the data type in a numeric array. But, this can be confusing because type refers to the kind of Python object and all arrays are the same kind of python object. In addition, it is natural to use the type= keyword in array constructors, but this then blocks the use of that builtin for the function it is used with. Of course typecode was previously chosen by Numeric, but now the types are not codes (they are really type objects). Thus, I have been calling type (dtype) in the new scipy.base. The alternative is to keep the name type (eliminate the use of typecode, and rename python's type function to pytype within scipy).

It could easily be changed if that is a real problem. Because of the signficantly different usage of types in the new system, it is helpful to have a different name (dtype). But, I could be persuaded to use the word type and rename Python's type to pytype.

-Travis
|
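The point about type= blocking the builtin can be shown in plain Python. In this sketch the two function names are hypothetical and have nothing to do with the real scipy.base constructors. A parameter called type hides the builtin type() for the whole function body, while a parameter called dtype leaves it usable:

```python
def make_typed_list(values, type=float):
    # Inside this function `type` is the parameter, not the builtin:
    # type(v) converts v, it does not report its Python type.
    return [type(v) for v in values]


def make_typed_list_dtype(values, dtype=float):
    # With a `dtype` parameter the builtin type() is still available.
    container_kind = type(values).__name__
    return container_kind, [dtype(v) for v in values]


print(make_typed_list([1, 2]))                    # [1.0, 2.0]
print(make_typed_list_dtype((1, 2), dtype=int))   # ('tuple', [1, 2])
```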
|
From: Nadav H. <Na...@Vi...> - 2005-08-30 05:45:34
|
Just started to play with Numeric3; it looks like a significant usability improvement, but.... The same functions/classes are named differently in numarray and Numeric3, for instance typecodes. I think that agreeing on the same names for identical functions/classes would make users' lives easier, both for porting and for alternating between back ends. I believe that it may help unify the two projects.

Nadav.
|
|
From: Humufr <hu...@ya...> - 2005-08-29 19:17:00
|
Hi,
I think there is a problem with numarray (not sure).
I'm trying to correlate two different files to find the objects common
to both. To do this I wrote some ugly software, and I'm using
readcol2.py to read each file into a numarray array, a numarray string
array or a list.
cross_name.py does the cross-correlation when I'm using the
numarray string format. I'm using three parameters in different columns
and I compare all of them with something like:
numarray.all(a[i,:] == b[j,:])
I saw that my script is very slow or, to be more precise, becomes
slow. It seems OK at the beginning, but little by little it slows
down by a huge amount. I let it run all weekend and it found ~40,000
objects in common (both files are ~200,000 lines) in two days.
I changed the software to use Python lists, and in a few minutes
I had ~20,000 objects found in common. So I think there is a big
problem, probably: 1) in my script, perhaps 2) in numarray or 3) in both.
I hope I have explained the problem clearly ...
N.
ps: I print output from cross_name.py to watch the slowdown; it
becomes slow at around 700 objects in common, but the decline is
gradual.
pps: I attach the different files I used. cross_name.py is the script
with the problem.
-------------------------------------
#readcol2.py
-------------------------------------
def readcol(fname,comments='%',columns=None,delimiter=None,dep=0,arraytype='list'):
    """
    Load ASCII data from fname into an array and return the array.

    The data must be regular, same number of values in every row

    fname can be a filename or a file handle.

    Input:

    - fname : the name of the file to read

    Optional input:

    - comments : a string giving the character that starts a comment;
                 the default is the matlab character '%'.
    - columns : list or tuple that contains the columns to use.
    - delimiter : a string to delimit the columns
    - dep : an integer to indicate from which line you want to begin
            to use the file (useful to skip the description lines)
    - arraytype : a string to indicate which kind of array you want to
                  have: numeric array (numeric) or character array
                  (numstring) or list (list). By default the list
                  mode is used

    matfile data is not currently supported, but see
    Nigel Wade's matfile ftp://ion.le.ac.uk/matfile/matfile.tar.gz

    Example usage:

    x,y = transpose(readcol('test.dat')) # data in two columns
    X = readcol('test.dat') # a matrix of data
    x = readcol('test.dat') # a single column of data
    x = readcol('test.dat','#') # the comment character is '#'

    initial function from pylab, improved by myself for my needs
    """
    from numarray import array,transpose
    fh = file(fname)
    X = []
    numCols = None
    nline = 0
    if columns is None:
        for line in fh:
            nline += 1
            if dep is not None and nline <= dep: continue
            line = line[:line.find(comments)].strip()
            if not len(line): continue
            if arraytype=='numeric':
                row = [float(val) for val in line.split(delimiter)]
            else:
                row = [val.strip() for val in line.split(delimiter)]
            thisLen = len(row)
            if numCols is not None and thisLen != numCols:
                raise ValueError('All rows must have the same number of columns')
            X.append(row)
    else:
        for line in fh:
            nline +=1
            if dep is not None and nline <= dep: continue
            line = line[:line.find(comments)].strip()
            if not len(line): continue
            row = line.split(delimiter)
            if arraytype=='numeric':
                row = [float(row[i-1]) for i in columns]
            elif arraytype=='numstring':
                row = [row[i-1].strip() for i in columns]
            else:
                row = [row[i-1].strip() for i in columns]
            thisLen = len(row)
            if numCols is not None and thisLen != numCols:
                raise ValueError('All rows must have the same number of columns')
            X.append(row)
    if arraytype=='numeric':
        X = array(X)
        r,c = X.shape
        if r==1 or c==1:
            X.shape = max([r,c]),
    elif arraytype == 'numstring':
        import numarray.strings # problem if numeric+pylab
        X = numarray.strings.array(X)
        r,c = X.shape
        if r==1 or c==1:
            X.shape = max([r,c]),
    return X
----------------------------------------------------------------
#cross_name.py
----------------------------------------------------------------
#!/usr/bin/env python
'''
Software to cross correlate two files. To use it you have to fill in a
params file that contains the information about the files you want to
correlate.
The information must have the format:
namefile = list of columns ; delimiter
example:
file1 = 1,2,3 ;
file2 = 20,19,21 ; ,
no delimiter = blank
'''
# there is a big efficiency problem. The software takes far too long
# with big files like SDSS.
# I have to find where the problem is
import sys
import numarray
import string
#read the params file
params = {}
for line in file(sys.argv[1],'rU'):
    line = line.strip() # delete the end of line (\n on unix)
    if not len(line): continue # if the line is empty do nothing and pass to the next line
    if line.startswith('#'): continue # test if the line is a comment (# is the character to signal it)
    tup = line.split('=',1) # split the line, the delimiter is the sign =
    # create a list that contains the columns we want to use
    columns = [int(i) for i in tup[1].strip().split(';')[0].strip().split(',')]
    delimiter = tup[1].strip().split(';')[1].strip() # check the delimiter of the data file (generally space or comma)
    if not len(delimiter): delimiter = None
    params[tup[0].strip()] = { 'columns' : columns, 'delimiter' : delimiter}
# Read the data files (only the columns asked for in the params file)
debut_data = 1
data = []
for namefile in params.iterkeys():
    import readcol2 #import the function to read the files
    #data.append(readcol2.readcol(namefile,columns=params[namefile]['columns'],comments='#',delimiter=params[namefile]['delimiter'],dep=1,arraytype='character'))
    params[namefile]['data'] = readcol2.readcol(namefile,columns=params[namefile]['columns'],comments='#',delimiter=params[namefile]['delimiter'],dep=debut_data,arraytype='character')
# Read the data files another time to have all the lines!
# Question: since it's a dictionary, are we sure that the files are in the
# same order? Check it!
if len(params.keys()) == 2:
    namefile,data,delimiter = [],[],[]
    for keys in params.iterkeys():
        namefile.append(keys)
        data.append(params[keys]['data'])
        delim = params[keys]['delimiter']
        if delim != None:
            delimiter.append(params[keys]['delimiter'])
        else:
            delimiter.append(' ')
    #res_a = []
    #res_b = []
    f1_ini = file(namefile[0]).readlines()[debut_data:]
    f2_ini = file(namefile[1]).readlines()[debut_data:]
    #f1_ini = [line for line in file(namefile[0])][debut_data:]
    #f2_ini = [line for line in file(namefile[1])][debut_data:]
    f1=open('cross'+namefile[0],'w')
    f2=open('cross'+namefile[1],'w')
    f3=open('pastecross'+namefile[0]+namefile[1],'w')
    b_i = 0
    for a_i in range(data[0].shape[0]):
        for b_i in range(b_i,data[1].shape[0]):
            if numarray.all(data[0][a_i,:] == data[1][b_i,:]):
                f1.write(f1_ini[a_i])
                f2.write(f2_ini[b_i])
                f3.write(f1_ini[a_i].strip()+delimiter[0]+' '+string.replace(f2_ini[b_i],delimiter[1],delimiter[0]))
                del f2_ini[b_i]
                break
                #res_a.append(a_i)
                #res_b.append(b_i)
    f1.close()
    f2.close()
    f3.close()
else:
    print "too many files: only two allowed for the moment"
#save the results in 3 files: 2 with the common objects from each file.
# one with a paste of the lines of the 2 initial files.
-----------------------------------------------------------------------
#cross_name2.py
---------------------------------------------------------------------
#!/usr/bin/env python
'''
Software to cross correlate two files. To use it you have to fill in a
params file that contains the information about the files you want to
correlate.
The information must have the format:
namefile = list of columns ; delimiter
example:
file1 = 1,2,3 ;
file2 = 20,19,21 ; ,
no delimiter = blank
'''
# there is a big efficiency problem. The software takes far too long
# with big files like SDSS.
# I have to find where the problem is
import sys
import numarray
import string
#read the params file
params = {}
for line in file(sys.argv[1],'rU'):
    line = line.strip() # delete the end of line (\n on unix)
    if not len(line): continue # if the line is empty do nothing and pass to the next line
    if line.startswith('#'): continue # test if the line is a comment (# is the character to signal it)
    tup = line.split('=',1) # split the line, the delimiter is the sign =
    # create a list that contains the columns we want to use
    columns = [int(i) for i in tup[1].strip().split(';')[0].strip().split(',')]
    delimiter = tup[1].strip().split(';')[1].strip() # check the delimiter of the data file (generally space or comma)
    if not len(delimiter): delimiter = None
    params[tup[0].strip()] = { 'columns' : columns, 'delimiter' : delimiter}
# Read the data files (only the columns asked for in the params file)
debut_data = 1
data = []
for namefile in params.iterkeys():
    import readcol2 #import the function to read the files
    #data.append(readcol2.readcol(namefile,columns=params[namefile]['columns'],comments='#',delimiter=params[namefile]['delimiter'],dep=1,arraytype='character'))
    params[namefile]['data'] = readcol2.readcol(namefile,columns=params[namefile]['columns'],comments='#',delimiter=params[namefile]['delimiter'],dep=debut_data,arraytype='list')
# Read the data files another time to have all the lines!
# Question: since it's a dictionary, are we sure that the files are in the
# same order? Check it!
if len(params.keys()) == 2:
    namefile,data,delimiter = [],[],[]
    for keys in params.iterkeys():
        namefile.append(keys)
        data.append(params[keys]['data'])
        delim = params[keys]['delimiter']
        if delim != None:
            delimiter.append(params[keys]['delimiter'])
        else:
            delimiter.append(' ')
    #res_a = []
    #res_b = []
    f1_ini = file(namefile[0]).readlines()[debut_data:]
    f2_ini = file(namefile[1]).readlines()[debut_data:]
    #f1_ini = [line for line in file(namefile[0])][debut_data:]
    #f2_ini = [line for line in file(namefile[1])][debut_data:]
    f1=open('cross'+namefile[0],'w')
    f2=open('cross'+namefile[1],'w')
    f3=open('pastecross'+namefile[0]+namefile[1],'w')
    # i=0
    # for a_i in range(len(data[0])):
    #     #print data[0][a_i,:]
    #     for b_i in range(len(data[1])):
    #         if data[0][a_i] == data[1][b_i]:
    #             print data[0][a_i],data[1][b_i]
    #             i+=1
    #             print i
    #             break
    b_i=0
    for a_i in range(len(data[0])):
        for b_i in range(b_i,len(data[1])):
            if data[0][a_i] == data[1][b_i]:
                f1.write(f1_ini[a_i])
                f2.write(f2_ini[b_i])
                f3.write(f1_ini[a_i].strip()+delimiter[0]+' '+string.replace(f2_ini[b_i],delimiter[1],delimiter[0]))
                del f2_ini[b_i]
                break
                #res_a.append(a_i)
                #res_b.append(b_i)
    f1.close()
    f2.close()
    f3.close()
else:
    print "too many files: only two allowed for the moment"
#save the results in 3 files: 2 with the common objects from each file.
# one with a paste of the lines of the 2 initial files.
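Both scripts compare rows pairwise in a nested loop, so the work grows with the product of the two file lengths whichever container is used. One possible alternative, sketched here in modern Python 3 and not taken from the thread, is to index one file by its key columns in a dictionary and look each row of the other file up directly. The function name cross_match is made up for illustration. Unlike the loops above, it does not assume that the two files list their objects in the same order.

```python
def cross_match(rows_a, rows_b):
    """Return (index_a, index_b) pairs of rows whose key columns are equal."""
    # Map each key tuple in rows_b to the row indices where it occurs.
    index_b = {}
    for j, row in enumerate(rows_b):
        index_b.setdefault(tuple(row), []).append(j)

    pairs = []
    for i, row in enumerate(rows_a):
        candidates = index_b.get(tuple(row))
        if candidates:
            # Use each row of rows_b at most once, earliest first.
            pairs.append((i, candidates.pop(0)))
    return pairs


rows_a = [["1", "2"], ["3", "4"], ["5", "6"]]
rows_b = [["5", "6"], ["1", "2"], ["1", "2"]]
print(cross_match(rows_a, rows_b))  # [(0, 1), (2, 0)]
```

The returned index pairs can then be used to pick the matching lines from the two original files, as the scripts do with f1_ini and f2_ini.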
|
|
From: Todd M. <jm...@st...> - 2005-08-26 19:14:19
|
On Fri, 2005-08-26 at 13:24, Stefan Kuzminski wrote:
> >>> from numarray import *
> >>> x = ones(22400,Float)
> >>> print add.reduce(x)
> 22400.0
> >>> print add.reduce(x!=0)
> -128
> >>> print add.reduce((x!=0).astype(Int))
> 22400
>
> it seems like the boolean result of the expression ( middle try )
> causes a problem?

This issue has been discussed before and the general consensus was that this (somewhat treacherous) behavior should not change.

For array totals (reducing on all axes at once), numarray has a sum() method which by default does do a type promotion to the "max type of kind", so integers -> Int64, floats -> Float64, and complexes -> Complex64 prior to the reduction.

Regards,
Todd
|
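The -128 is what one would get if the 22400 True values were added up in a signed 8-bit accumulator: 22400 = 87*256 + 128, and 128 wraps around to -128. That numarray reduces a Bool array in an 8-bit type is an inference from the output above, not something stated in the thread. The helper names below are made up, and the sketch is pure Python rather than a call into numarray:

```python
def wrap_int8(value):
    """Reduce an integer to the range of a signed 8-bit integer (-128..127)."""
    return (value + 128) % 256 - 128


def add_reduce_int8(flags):
    """Sum booleans while keeping the running total in 8 bits."""
    total = 0
    for flag in flags:
        total = wrap_int8(total + int(flag))
    return total


flags = [True] * 22400
print(add_reduce_int8(flags))  # -128, the wrapped total
print(sum(flags))              # 22400, the total in a wide enough type
```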
|
From: Stefan K. <pon...@ya...> - 2005-08-26 17:25:07
|
>>> from numarray import *
>>> x = ones(22400,Float)
>>> print add.reduce(x)
22400.0
>>> print add.reduce(x!=0)
-128
>>> print add.reduce((x!=0).astype(Int))
22400

it seems like the boolean result of the expression ( middle try ) causes a problem?

thanks,
Stefan Kuzminski
|