From: Andrea R. <ari...@pi...> - 2005-09-23 07:02:21
|
On Sep 23, 2005, at 5:23 AM, Greg Ewing wrote:
> I wish people would stop suggesting the 'a and b or c' trick,
> because it DOESN'T WORK except in special circumstances (i.e.
> when you can be sure that b is never false).
>
> What you want is:
>
> def f(x):
>     if x <= a:
>         return f1(x)
>     else:
>         return f2(x)

It doesn't work either. As I've already explained x is an array containing values both above and below a! What I really need is a way to prevent f1 and f2 from acting on those values of the 'x' array for which the functions are not defined.

Any other hints?

Andrea.
|
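The request above (evaluate f1 only where x <= a and f2 only where x > a, for an array x) can be sketched with boolean masks. This is only an illustration under stated assumptions: it uses modern NumPy rather than the numarray/Numeric packages discussed in this thread, and `piecewise_masked` and the two branch functions are hypothetical names, not code from the list.

```python
import numpy as np

def piecewise_masked(x, a, f1, f2):
    """Apply f1 where x <= a and f2 where x > a, never calling either
    function on values that belong to the other branch."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    low = x <= a                 # boolean mask selecting the f1 branch
    out[low] = f1(x[low])        # f1 only sees values <= a
    out[~low] = f2(x[~low])      # f2 only sees values > a
    return out

# Hypothetical branch functions, each valid only on its own side of a.
a = 1.0
f1 = lambda v: np.sqrt(a - v)    # needs v <= a
f2 = lambda v: np.sqrt(v - a)    # needs v >= a

result = piecewise_masked([0.0, 1.0, 5.0], a, f1, f2)
print(result)
```

NumPy's `numpy.piecewise` follows the same idea of evaluating each function only on the elements selected by its condition, so it may be a convenient alternative to writing the masks by hand.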
|
From: Greg E. <gre...@ca...> - 2005-09-23 03:23:34
|
Andrea Riciputi wrote:
> On Sep 22, 2005, at 9:33 PM, Alan G Isaac wrote:
>
>
>> def f(x):
>>     return x<=a and f1(x) or f2(x)
>
> I've already tried something like this, but it doesn't work

I wish people would stop suggesting the 'a and b or c' trick,
because it DOESN'T WORK except in special circumstances (i.e.
when you can be sure that b is never false).

What you want is:

def f(x):
    if x <= a:
        return f1(x)
    else:
        return f2(x)
--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury, | A citizen of NewZealandCorp, a |
Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. |
gre...@ca... +--------------------------------------+
|
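The failure mode of the 'a and b or c' trick can be shown with a short sketch. The helper names `and_or` and `conditional` are made up for this illustration; the conditional expression `b if cond else c` is an assumption about the reader's Python version, since it only exists in Python 2.5 and later, after this thread was written.

```python
def and_or(cond, b, c):
    # The 'a and b or c' idiom: falls through to c whenever b is falsy,
    # even though cond is true.
    return cond and b or c

def conditional(cond, b, c):
    # Conditional expression: picks b or c based on cond alone.
    return b if cond else c

# The condition holds, but the "true" branch evaluates to 0.0 (falsy).
print(and_or(True, 0.0, 99.0))       # the idiom returns c
print(conditional(True, 0.0, 99.0))  # the conditional returns b
```

With a branch function that can legitimately return zero, the idiom silently switches to the other branch, which is exactly the special circumstance the message warns about.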
|
From: Alan G I. <ai...@am...> - 2005-09-22 22:17:16
|
On Thu, 22 Sep 2005, Andrea Riciputi apparently wrote:
> I've already tried something like this, but it doesn't work since f1
> and f2 return not valid values outside the range over they are
> defined. Perhaps an example could clarify; suppose that f1(x) = 1./
> sqrt(1 - x**2) for x <= 1, and f2(x) = 1./sqrt(x**2 - 1) for x > 1.
> Your suggestion, as the other I've tried, fails with a
> "OverflowError: math range error".

If you do it as I suggested, they should not I believe be
evaluated outside of their range. So your function must be
generating an overflow error within this range.

>>> import math
>>> import random
>>> def f1(x): return math.sqrt(1-x**2)
...
>>> def f2(x): return 1./math.sqrt(x**2-1)
...
>>> def f(x): return x<=1 and f1(x) or f2(x)
...
>>> d = [random.uniform(0,2) for i in range(20)]
>>> fd = [f(x) for x in d]

Works fine.

Cheers,
Alan Isaac
|
|
From: Andrea R. <ari...@pi...> - 2005-09-22 21:24:58
|
I've already tried something like this, but it doesn't work, since f1 and f2 return invalid values outside the range over which they are defined. Perhaps an example could clarify: suppose that f1(x) = 1./sqrt(1 - x**2) for x <= 1, and f2(x) = 1./sqrt(x**2 - 1) for x > 1. Your suggestion, like the others I've tried, fails with an "OverflowError: math range error".

Any help?

Andrea.

On Sep 22, 2005, at 9:33 PM, Alan G Isaac wrote:
> On Thu, 22 Sep 2005, Andrea Riciputi apparently wrote:
>
>> this is probably an already discussed problem, but I've not been able
>> to find a solution even after googling a lot.
>>
>> I've a piecewise defined function:
>>
>>        /
>>        | f1(x) if x <= a
>> f(x) = |
>>        | f2(x) if x > a
>>        \
>>
>> where f1 and f2 are not defined outside the above range. How can I
>> define such a function in Python in order to apply (map) it to an
>> array ranging from values smaller to values bigger than a?
>
> I suspect I do not understand your question.
> But perhaps you want this:
>
> def f(x):
>     return x<=a and f1(x) or f2(x)
>
> fwiw,
> Alan
>
> -------------------------------------------------------
> SF.Net email is sponsored by:
> Tame your development challenges with Apache's Geronimo App Server.
> Download it for free - -and be entered to win a 42" plasma tv or
> your very own Sony(tm)PSP. Click here to play:
> http://sourceforge.net/geronimo.php
> _______________________________________________
> Numpy-discussion mailing list
> Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
|
|
From: Peter V. <ve...@em...> - 2005-09-22 20:11:52
|
I think you are correct: The result of a gaussian filter of order one
should be a derivative operator. Thus the response to an array
created with arange() should be close to one (barring edge effects).
Currently we have:
>>> from numarray import *
>>> from numarray.nd_image import gaussian_filter
>>> a = arange(10, type = Float64)
>>> gaussian_filter(a, 1.0, order = 1)
array([-0.36378461, -0.84938238, -0.98502645, -0.99939268, -0.999928 ,
-0.999928 , -0.99939268, -0.98502645, -0.84938238,
-0.36378461])
So the sign is wrong; that can be fixed by mirroring the gaussian
kernels. I have done so in CVS. The same holds true for the Sobel and
Prewitt filters: they were also defined 'incorrectly' according to
this criterion, so I changed those as well. That may be a bit more
controversial, since a quick look on the web seemed to indicate that
they are often defined the other way around. If anybody thinks my
changes are no good, please let me know.
Cheers, Peter
On 22 Sep, 2005, at 17:41, Alexandre Guimond wrote:
> Hi.
>
> I think I found a bug in gaussian_filter1d.
>
> roughly the function creates a 1d kernel and then calls
> correlate1d. The problem I see is that the kernel should be
> mirrored prior to calling correlate1d since we want to convolve,
> not correlate.
>
> as a result:
>
> >>> import numarray.nd_image
> >>> numarray.nd_image.gaussian_filter1d( ( 0.0, 1.0, 0.0 ), 1,
> order = 1, axis = 0, mode = 'constant' )
> array([-0.24197145, 0. , 0.24197145])
> >>>
>
> when it should be [ 0.24197145, 0. , -0.24197145]) (notice
> the change in the sign of coefficients)
>
> Or did I get that wrong?
>
> alex.
|
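The difference between correlating and convolving with a derivative-like kernel can be sketched as follows. This is an illustration only: it uses NumPy's `np.correlate` and `np.convolve` rather than the numarray.nd_image functions discussed above, and the three-point antisymmetric kernel is a made-up stand-in for a first-derivative Gaussian kernel.

```python
import numpy as np

signal = np.array([0.0, 1.0, 0.0])   # unit impulse, as in the bug report
kernel = np.array([-1.0, 0.0, 1.0])  # hypothetical antisymmetric kernel

# Correlation slides the kernel over the signal as-is.
corr = np.correlate(signal, kernel, mode="same")

# Convolution slides the mirrored kernel over the signal.
conv = np.convolve(signal, kernel, mode="same")

# Correlating with the mirrored kernel reproduces the convolution.
corr_mirrored = np.correlate(signal, kernel[::-1], mode="same")

print(corr, conv, corr_mirrored)
```

For an antisymmetric kernel, mirroring is the same as negating, so the two operations differ only in sign. For a symmetric kernel such as an order-0 Gaussian they coincide, which is presumably why the issue only shows up for derivative orders.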
|
From: Alan G I. <ai...@am...> - 2005-09-22 19:30:02
|
On Thu, 22 Sep 2005, Andrea Riciputi apparently wrote:
> this is probably an already discussed problem, but I've not been able
> to find a solution even after googling a lot.
> I've a piecewise defined function:
>        /
>        | f1(x) if x <= a
> f(x) = |
>        | f2(x) if x > a
>        \
> where f1 and f2 are not defined outside the above range. How can I
> define such a function in Python in order to apply (map) it to an
> array ranging from values smaller to values bigger than a?

I suspect I do not understand your question.
But perhaps you want this:

def f(x):
    return x<=a and f1(x) or f2(x)

fwiw,
Alan
|
|
From: Todd M. <jm...@st...> - 2005-09-22 16:41:39
|
Thanks Nadav. This is fixed in CVS.

Regards,
Todd

Nadav Horesh wrote:
> It seems that the tostring method fails on rank 0 arrays:
>
> a = N.array(-4)
>
> >>> a
> array(-4)
>
> >>> a.tostring()
>
> Traceback (most recent call last):
>   File "<pyshell#18>", line 1, in -toplevel-
>     a.tostring()
>   File "/usr/local/lib/python2.4/site-packages/numarray/generic.py", line 746, in tostring
>     self._strides, self._itemsize)
> MemoryError
>
> >>> N.__version__
> '1.4.0'
>
> Nadav.
|
|
From: Alexandre G. <gu...@gu...> - 2005-09-22 15:48:29
|
Hi.

I think I found a bug in gaussian_filter1d.

roughly the function creates a 1d kernel and then calls correlate1d. The problem I see is that the kernel should be mirrored prior to calling correlate1d since we want to convolve, not correlate.

as a result:

>>> import numarray.nd_image
>>> numarray.nd_image.gaussian_filter1d( ( 0.0, 1.0, 0.0 ), 1, order = 1, axis = 0, mode = 'constant' )
array([-0.24197145, 0. , 0.24197145])
>>>

when it should be [ 0.24197145, 0. , -0.24197145]) (notice the change in the sign of coefficients)

Or did I get that wrong?

alex.
|
|
From: Andrea R. <ari...@pi...> - 2005-09-22 15:45:29
|
Hi all,
this is probably an already discussed problem, but I've not been able
to find a solution even after googling a lot.
I've a piecewise defined function:
       /
       | f1(x) if x <= a
f(x) = |
       | f2(x) if x > a
       \
where f1 and f2 are not defined outside the above range. How can I
define such a function in Python in order to apply (map) it to an
array ranging from values smaller to values bigger than a?
Thanks,
Andrea.
|
|
From: Humufr <hu...@ya...> - 2005-09-20 18:39:50
|
Thank you very much. I had seen no answer before, which is why I reduced the sample so much :)

I'll try it now.

Todd Miller wrote:
> Hi H,
>
> I did some work on this problem based on your previous post but
> apparently my response never made it to numpy-discussion. In a
> nutshell, I made numarray 12x faster for a benchmark like your
> numarray_pb_sample.py by speeding up string comparisons and improving
> all(). The changes are in numarray CVS but there is no Source Forge
> release that contains them yet. numarray-1.4.0 is still several
> weeks away. If you want to try CVS from UNIX/Linux just do:
>
> % cvs -d:pserver:ano...@cv...:/cvsroot/numpy login
> % cvs -z3 -d:pserver:ano...@cv...:/cvsroot/numpy co -P numarray
>
> Regards,
> Todd
>
> Humufr wrote:
>
>> Hello,
>>
>> I have a problem with numarray and especially the function numarray.all.
>> [...]
|
|
From: Todd M. <jm...@st...> - 2005-09-20 17:16:32
|
Hi H,

I did some work on this problem based on your previous post but apparently my response never made it to numpy-discussion. In a nutshell, I made numarray 12x faster for a benchmark like your numarray_pb_sample.py by speeding up string comparisons and improving all(). The changes are in numarray CVS but there is no Source Forge release that contains them yet. numarray-1.4.0 is still several weeks away. If you want to try CVS from UNIX/Linux just do:

% cvs -d:pserver:ano...@cv...:/cvsroot/numpy login
% cvs -z3 -d:pserver:ano...@cv...:/cvsroot/numpy co -P numarray

Regards,
Todd

Humufr wrote:
> Hello,
>
> I have a problem with numarray and especially the function numarray.all.
> [...]
|
|
From: Humufr <hu...@ya...> - 2005-09-20 15:44:03
|
Hello,

I have a problem with numarray, and especially with the function numarray.all.

I want to compare two files. To do this I read the files with a function,
readcol2.readcol, which can return them as a list or in numarray format
(string or numerical).

I'm comparing the files line by line.
If I use the array format and the numarray.all function, the comparison takes
forever for 2 big files. If I use Python list objects, it's very fast. I think
there is a problem here, or at least room for improvement. If I understand
correctly, the goal of numarray is to speed up some parts of Python, but here
it slows things down a lot.

A very simple sample that shows the effect is at the bottom of this mail.

Thanks for numarray; I hope I'm not bothering you. My comments are meant to
help improve numarray more than anything else. I have been able to find the
problem, so now I can avoid it.

H.
def readcol(fname, comments='%', columns=None, delimiter=None, dep=0, arraytype='list'):
    """
    Load ASCII data from fname into an array and return the array.

    The data must be regular: same number of values in every row.
    fname can be a filename or a file handle.

    Input:
      - fname : the name of the file to read

    Optional input:
      - comments  : a string marking the start of a comment;
                    the default is the Matlab character '%'.
      - columns   : list or tuple containing the (1-based) columns to use.
      - delimiter : a string that separates the columns
      - dep       : an integer giving the number of leading lines to
                    skip (useful to avoid description lines)
      - arraytype : a string selecting the kind of result you want:
                    numeric array ('numeric'), character array
                    ('numstring') or list ('list'). The default is 'list'.

    matfile data is not currently supported, but see
    Nigel Wade's matfile ftp://ion.le.ac.uk/matfile/matfile.tar.gz

    Example usage:

      x,y = transpose(readcol('test.dat'))  # data in two columns
      X = readcol('test.dat')               # a matrix of data
      x = readcol('test.dat')               # a single column of data
      x = readcol('test.dat', '#')          # use '#' as the comment character

    Initial function from pylab (J. Hunter), changed for my specific needs.
    """
    from numarray import array, transpose

    fh = file(fname)
    X = []
    numCols = None
    nline = 0
    if columns is None:
        # No column selection: keep every field of every row.
        for line in fh:
            nline += 1
            if dep is not None and nline <= dep: continue
            line = line[:line.find(comments)].strip()
            if not len(line): continue
            if arraytype=='numeric':
                row = [float(val) for val in line.split(delimiter)]
            else:
                row = [val.strip() for val in line.split(delimiter)]
            thisLen = len(row)
            if numCols is not None and thisLen != numCols:
                raise ValueError('All rows must have the same number of columns')
            X.append(row)
    else:
        # Keep only the requested (1-based) columns.
        for line in fh:
            nline += 1
            if dep is not None and nline <= dep: continue
            line = line[:line.find(comments)].strip()
            if not len(line): continue
            row = line.split(delimiter)
            if arraytype=='numeric':
                row = [float(row[i-1]) for i in columns]
            elif arraytype=='numstring':
                row = [row[i-1].strip() for i in columns]
            else:
                row = [row[i-1].strip() for i in columns]
            thisLen = len(row)
            if numCols is not None and thisLen != numCols:
                raise ValueError('All rows must have the same number of columns')
            X.append(row)

    if arraytype=='numeric':
        X = array(X)
        r,c = X.shape
        if r==1 or c==1:
            X.shape = max([r,c]),
    elif arraytype == 'numstring':
        import numarray.strings  # pb if numeric+pylab
        X = numarray.strings.array(X)
        r,c = X.shape
        if r==1 or c==1:
            X.shape = max([r,c]),
    return X
-------------------------------------------
files_test_creation.py
-------------------------------------------
f1 = file('test1.dat','w')
for i in range(10000):
    f1.write(str(i)+' '+str(i+1)+' '+str(i+2)+'\n')
f1.close()

f2 = file('test2.dat','w')
for i in range(10000):
    f2.write(str(i)+' '+str(i+1)+' '+str(i+2)+'\n')
f2.close()
-------------------------------------------
numarray_pb_sample.py
-------------------------------------------
import numarray
import readcol2  # assumed: the module that holds readcol() above

data1 = readcol2.readcol('test1.dat',columns=[1,2,3],comments='#',delimiter=' ',dep=1,arraytype='numstring')
data2 = readcol2.readcol('test2.dat',columns=[1,2,3],comments='#',delimiter=' ',dep=1,arraytype='numstring')

#or in non string array form (same result)
## data1 = readcol2.readcol('test1.dat',columns=[1,2,3],comments='#',delimiter=' ',dep=1,arraytype='numeric')
## data2 = readcol2.readcol('test2.dat',columns=[1,2,3],comments='#',delimiter=' ',dep=1,arraytype='numeric')

for a_i in range(data1.shape[0]):
    for b_i in range(data2.shape[0]):
        if numarray.all(data1[a_i,:] == data2[b_i,:]):
            print a_i,b_i
-------------------------------------------
python_list_sample.py
-------------------------------------------
import readcol2  # assumed: the module that holds readcol() above

data1 = readcol2.readcol('test1.dat',columns=[1,2,3],comments='#',delimiter=' ',dep=1,arraytype='list')
data2 = readcol2.readcol('test2.dat',columns=[1,2,3],comments='#',delimiter=' ',dep=1,arraytype='list')

for a_i in range(len(data1)):
    for b_i in range(len(data2)):
        if data1[a_i] == data2[b_i]:
            print a_i,b_i
|
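Both samples above compare every row of one file with every row of the other, so roughly 10000 x 10000 comparisons are made whichever container is used. A possible way around the quadratic loop, sketched here in plain Python and not taken from the thread, is to index the rows of one file by value first. The function name `matching_rows` and the tiny data set are hypothetical.

```python
def matching_rows(rows1, rows2):
    """Return (i, j) pairs such that rows1[i] == rows2[j].

    Rows are converted to tuples so they can be used as dictionary keys.
    """
    # Map each distinct row of rows2 to the list of positions where it occurs.
    index = {}
    for j, row in enumerate(rows2):
        index.setdefault(tuple(row), []).append(j)

    # Look up each row of rows1 instead of scanning all of rows2.
    pairs = []
    for i, row in enumerate(rows1):
        for j in index.get(tuple(row), ()):
            pairs.append((i, j))
    return pairs

rows1 = [["0", "1", "2"], ["1", "2", "3"]]
rows2 = [["1", "2", "3"], ["9", "9", "9"], ["1", "2", "3"]]
pairs = matching_rows(rows1, rows2)
print(pairs)
```

This produces the same index pairs as the nested loops, in the same order, while touching each row only once plus one dictionary lookup per row.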
|
From: Steven H. R. <st...@sh...> - 2005-09-18 17:49:53
|
Combining Numeric and Zope is a neat idea and I've speculated about how to do it, but haven't got beyond that. I believe that you'd have to write a new Zope product that called the Numeric package. You might get a more informed response on the Zope, Plone, or SciPy mailing lists.

Regards,
Steve

shashank karnik wrote:
>
> Hello everyone
>
> Can anyone help me to install Numeric extension or package for my Zope
> server 2.4 version running on Windows Xp?
>
> You see i am a beginner at python and zope both and need to use a
> product in Zope(GNOWSYS) which requires Python -Numeric and
> Python-XMLBase packages..
>
> I downloaded the Numeric py package for Windows from this link
>
> http://prdownloads.sourceforge.net/numpy/Numeric-23.8.win32-py2.4.exe?download
>
> However...i cant figure out how to install it so that my Zope server
> recognises it.
>
> The installer currently just installs the package so that it is
> recognised by the Python interpreter...but the Zope server is stored in
> Program Files on my machine..how do i make it understand that the
> Numeric package has been installed..
>
> I tried putting the numeric folder in this path
> C:\Program Files\Zope\lib\python\Products
>
> But it doesnt work
>
> Please help me out
>
> If think that this question should be asked on a Zope forum...please
> let me know!
>
> Thank you

--
Steven H. Rogers, Ph.D., st...@sh...
Weblog: http://shrogers.com/weblog
"He who refuses to do arithmetic is doomed to talk nonsense."
-- John McCarthy
|
|
From: shashank k. <sha...@re...> - 2005-09-18 06:33:46
|
Hello everyone

Can anyone help me to install the Numeric extension or package for my Zope server, version 2.4, running on Windows XP?

You see, I am a beginner at both Python and Zope, and I need to use a product in Zope (GNOWSYS) which requires the Python-Numeric and Python-XMLBase packages.

I downloaded the Numeric py package for Windows from this link

http://prdownloads.sourceforge.net/numpy/Numeric-23.8.win32-py2.4.exe?download

However, I can't figure out how to install it so that my Zope server recognises it.

The installer currently just installs the package so that it is recognised by the Python interpreter, but the Zope server is stored in Program Files on my machine. How do I make it understand that the Numeric package has been installed?

I tried putting the numeric folder in this path
C:\Program Files\Zope\lib\python\Products

But it doesn't work.

Please help me out.

If you think that this question should be asked on a Zope forum, please let me know!

Thank you
|
|
From: Fernando P. <Fer...@co...> - 2005-09-17 05:11:33
|
LUK ShunTim wrote:
> Fernando Perez wrote:
>
>> LUK ShunTim wrote:

> Thanks very much. However no luck. :-( I now got this error
>
>> svn: PROPFIND of '/svn/scipy_core/branches/newcore': 405 Method Not Allowed (https://svn.scipy.org)

That's what I said in the message immediately afterward, because I hit send too soon: that the https:// approach would NOT work with scipy. You need to have your proxy fixed, I'm afraid.

Cheers,

f
|
|
From: LUK S. <shu...@po...> - 2005-09-17 04:59:30
|
Fernando Perez wrote:
> LUK ShunTim wrote:
>
>>> I got this time out error when I tried, several times. :-(
>>>
>>> svn: REPORT request failed on '/svn/scipy_core/!svn/vcc/default'
>>> svn: REPORT of '/svn/scipy_core/!svn/vcc/default': Could not read status
>>> line: Connection timed out (http://svn.scipy.org)
>>>
>>> Please see if this is an server configuration issue.
>
> No, it's an issue with your setup, not something on scipy's side.
>
> You are behind a proxy blocking REPORT requests. See this for details:
>
> http://www.sipfoundry.org/tools/svn-tips.html
>
> which says:
>
> What does 'REPORT request failed' mean?
>
> When I try to check out a subversion repository
>
> > svn co http://scm.sipfoundry.org/rep/project/main project
>
> I get an error like:
>
> svn: REPORT request failed on '/rep/project/!svn/vcc/default'
> svn: REPORT of '/rep/project/!svn/vcc/default': 400 Bad Request
> (http://scm.sipfoundry.org)
>
> You are behind a web proxy that is not passing the WebDAV methods that
> subversion uses. You can work around the problem by using SSL to hide what
> you're doing from the proxy:
>
> > svn co https://scm.sipfoundry.org/rep/project/main project
>
> Cheers,
>
> f

Thanks very much. However, no luck. :-( I now get this error

> svn: PROPFIND of '/svn/scipy_core/branches/newcore': 405 Method Not Allowed (https://svn.scipy.org)

with the suggested workaround. I guess I'll learn a bit about svn and maybe have to take it up with our system admin. In the meantime, is CVS still available?

BTW, perhaps it might help people like me who are not familiar with svn to put this tip somewhere on the download page.

Thanks again,
ST
--
|
|
From: Fernando P. <Fer...@co...> - 2005-09-16 15:49:02
|
Fernando Perez wrote:
> You are behind a web proxy that is not passing the WebDAV methods that
> subversion uses. You can work around the problem by using SSL to hide what
> you're doing from the proxy:
>
> > svn co https://scm.sipfoundry.org/rep/project/main project

I forgot to add that the https method will NOT work with scipy, which doesn't
provide svn/ssl support. You need to fix your proxy config.

Cheers,

f
|
|
From: Fernando P. <Fer...@co...> - 2005-09-16 15:45:08
|
LUK ShunTim wrote:
>> I got this time out error when I tried, several times. :-(
>>
>> svn: REPORT request failed on '/svn/scipy_core/!svn/vcc/default'
>> svn: REPORT of '/svn/scipy_core/!svn/vcc/default': Could not read status
>> line: Connection timed out (http://svn.scipy.org)
>>
>> Please see if this is a server configuration issue.

No, it's an issue with your setup, not something on scipy's side.

You are behind a proxy blocking REPORT requests. See this for details:

http://www.sipfoundry.org/tools/svn-tips.html

which says:

What does 'REPORT request failed' mean?

When I try to check out a subversion repository

> svn co http://scm.sipfoundry.org/rep/project/main project

I get an error like:

svn: REPORT request failed on '/rep/project/!svn/vcc/default'
svn: REPORT of '/rep/project/!svn/vcc/default': 400 Bad Request
(http://scm.sipfoundry.org)

You are behind a web proxy that is not passing the WebDAV methods that
subversion uses. You can work around the problem by using SSL to hide what
you're doing from the proxy:

> svn co https://scm.sipfoundry.org/rep/project/main project

Cheers,

f
|
|
From: LUK S. <shu...@po...> - 2005-09-16 05:50:00
|
Travis Oliphant wrote:
>
> This is to officially announce that the new replacement for Numeric
> (scipy_core) is available at SVN. Read permission is open to everyone
> so a simple checkout:
>
> svn co http://svn.scipy.org/svn/scipy_core/branches/newcore newcore
>
> should get you the distribution that should install with
>

I got this time out error when I tried, several times. :-(

svn: REPORT request failed on '/svn/scipy_core/!svn/vcc/default'
svn: REPORT of '/svn/scipy_core/!svn/vcc/default': Could not read status
line: Connection timed out (http://svn.scipy.org)

Please see if this is a server configuration issue.

Regards,
ST
--
|
|
From: Travis O. <oli...@ee...> - 2005-09-15 19:14:39
|
This is to officially announce that the new replacement for Numeric
(scipy_core) is available at SVN. Read permission is open to everyone
so a simple checkout:

svn co http://svn.scipy.org/svn/scipy_core/branches/newcore newcore

should get you the distribution that should install with

cd newcore
python setup.py install

I'm in the process of adding the linear algebra routines, fft, random, and
dotblas from Numeric. This should be done by the conference.

I will make a windows binary release for the SciPy conference, but not
before then.

There is a script in newcore/scipy/base/convertcode.py that will take code
written for Numeric (or numerix) and convert it to code for the new scipy
base object. This code is not foolproof, but it takes care of the minor
incompatibilities (a few search and replaces are done).

The compatibility issues are documented (mostly in the typecode characters
and a few method name changes). The one bigger incompatibility is that
a.flat does something a little different (a 1-d iterator object). The
convert code script changes uses of a.flat that are not indexing or set
attribute related to a.ravel()

C-code should build for the new system with a change of
#include Numeric/arrayobject.h to #include scipy/arrayobject.h --- though
you may want to enhance your code to take advantage of the new features (and
more extensive C-API).

I also still need to add the following ufuncs: isnan, isfinite, signbit,
isinf, frexp, and ldexp. This should not take too long.

-Travis O.
|
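For readers following the a.flat remark in the announcement above, here is a small sketch of the difference between a.flat and a.ravel(). It uses present-day NumPy as a stand-in for the 2005 newcore branch, which is an assumption; the branch itself may have differed in detail.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

it = a.flat      # a 1-d iterator over the elements, not an array
r = a.ravel()    # a genuine 1-d array holding the same elements

print(type(it).__name__)   # flatiter
print(r.shape)             # (6,)

# Indexing and assignment through .flat keep working, which matches the
# announcement's note that the converter leaves those uses alone.
print(a.flat[4])
a.flat[4] = 40
print(a[1, 1])
```

Code that passed a.flat to something expecting a real array is the case that needs a.ravel() instead.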
|
From: Pearu P. <pe...@sc...> - 2005-09-14 19:51:37
|
On Wed, 14 Sep 2005, Travis Oliphant wrote:

> Now that the new scipy.base (that can replace Numeric) is pretty much
> complete, I'm working on
> bringing the other parts of Numeric along so that scipy_core can replace
> Numeric (and numarray in functionality) for all users.
>
> I'm now using a branch of scipy_core to do this work. The old Numeric3 CVS
> directory on sourceforge will start to wither...
>
> The branch is at
>
> http://svn.scipy.org/svn/scipy_core/branches/newcore
>
> I'm thinking about how to structure the new scipy_core.
> Right now under the new scipy_core we have
>
> Hierarchy      Imports as
> ====================
> base/          --> scipy.base (namespace also available under
>                    scipy itself)
> distutils/     --> scipy.distutils
> test/          --> scipy.test
> weave/         --> weave
>
> We need to bring over basic linear algebra, statistics, and fft's from
> Numeric. So where do we put them and how do they import?

I have done some work in this direction but have not committed it to the
repository yet because it needs more testing. Basically, the (not yet
committed) scipy.distutils has support for building Fortran or f2c'd versions
of various libraries (currently I have tested it on blas), depending on
whether a Fortran compiler is available or not.

> Items to consider:
>
> * the basic functionality will be expanded / replaced by anybody who
> installs the entire scipy library.
>
> * are we going to get f2py to live in scipy_core (I say yes)...

That would simplify many things, so I'd also say yes. On the other hand, I
have not decided what to do with the f2py2e CVS repository. Suggestions are
welcome (though I understand that this might be my personal problem).

> * I think scipy_core should install a working basic scipy (i.e. import scipy
> as Numeric should work and be an effective replacement for import Numeric).
> Of course the functionality will be quite a bit less than if full scipy was
> installed, but basic functions should still work.
>
> With that in mind I propose the additions
>
> Hierarchy                Imports as
> ==========================
> corelib/lapack_lite/     --> scipy.lapack_lite
> corelib/fftpack_lite/    --> scipy.fftpack_lite
> corelib/random_lite/     --> scipy.random_lite
> linalg/                  --> scipy.linalg
> fftpack/                 --> scipy.fftpack
> stats/                   --> scipy.stats
>
> Users would typically use only the functions in scipy.linalg, scipy.fftpack,
> and scipy.stats.
>
> Notice that scipy also has modules named linalg, fftpack, and stats. These
> would add / replace functionality available in the basic core system.

Since lapack_lite and fftpack_lite can be copied from Numeric, there's no
rush for me to commit my scipy.distutils work, I guess. I'll do that when it
is more or less stable, and then we can gradually apply f2c to the various
scipy modules that currently have Fortran sources, which would allow
compiling the whole of scipy without having a Fortran compiler around.

Pearu
|
|
From: Travis O. <oli...@ee...> - 2005-09-14 19:33:46
|
Now that the new scipy.base (that can replace Numeric) is pretty much
complete, I'm working on bringing the other parts of Numeric along so that
scipy_core can replace Numeric (and numarray in functionality) for all users.

I'm now using a branch of scipy_core to do this work. The old Numeric3 CVS
directory on sourceforge will start to wither...

The branch is at

http://svn.scipy.org/svn/scipy_core/branches/newcore

I'm thinking about how to structure the new scipy_core.
Right now under the new scipy_core we have

Hierarchy      Imports as
====================
base/          --> scipy.base (namespace also available under scipy itself)
distutils/     --> scipy.distutils
test/          --> scipy.test
weave/         --> weave

We need to bring over basic linear algebra, statistics, and fft's from
Numeric. So where do we put them and how do they import?

Items to consider:

* the basic functionality will be expanded / replaced by anybody who
installs the entire scipy library.

* are we going to get f2py to live in scipy_core (I say yes)...

* I think scipy_core should install a working basic scipy (i.e. import scipy
as Numeric should work and be an effective replacement for import Numeric).
Of course the functionality will be quite a bit less than if full scipy was
installed, but basic functions should still work.

With that in mind I propose the additions

Hierarchy                Imports as
==========================
corelib/lapack_lite/     --> scipy.lapack_lite
corelib/fftpack_lite/    --> scipy.fftpack_lite
corelib/random_lite/     --> scipy.random_lite
linalg/                  --> scipy.linalg
fftpack/                 --> scipy.fftpack
stats/                   --> scipy.stats

Users would typically use only the functions in scipy.linalg, scipy.fftpack,
and scipy.stats.

Notice that scipy also has modules named linalg, fftpack, and stats. These
would add / replace functionality available in the basic core system.

Comments,

-Travis O.
|
|
From: Francesc A. <fa...@ca...> - 2005-09-14 12:37:44
|
==========================
 Announcing PyTables 1.1.1
==========================

This is a maintenance release of PyTables. In it, several optimizations
and bug fixes have been made. As some of the fixed bugs were quite
important, it's strongly recommended for users to upgrade.

Go to the PyTables web site for downloading the beast:

http://pytables.sourceforge.net/

or keep reading for more info about the improvements and bugs fixed.

Changes more in depth
=====================

Improvements:

- Optimized the opening of files with a large number of objects. Now,
  files with table objects open 50% faster, and files with arrays open
  more than twice as fast (up to 2000 objects/s on a Pentium 4@2GHz).
  Hence, a file with a combination of both kinds of objects opens
  between 50% and 100% faster than in 1.1.

- Optimized the creation of ``NestedRecArray`` objects using
  ``NumArray`` objects as columns, so that filling a table with the
  ``Table.append()`` method achieves a performance similar to PyTables
  pre-1.1 releases.

Bug fixes:

- ``Table.readCoordinates()`` now converts the coords parameter into
  ``Int64`` indices automatically.

- Fixed a bug that prevented appending to tables (through
  ``Table.append()``) using a list of ``NumArray`` objects.

- ``Int32`` attributes are handled correctly on 64-bit platforms now.

- Correction for accepting lists of numarrays as input for
  ``NestedRecArrays``.

- Fixed a problem when creating rank 1 multi-dimensional string columns
  in ``Table`` objects. Closes SF bug #1269023.

- Avoid errors when unpickling objects stored in attributes. See the
  section ``AttributeSet`` in the reference chapter of the User's Manual
  for more information. Closes SF bug #1254636.

- Assignment for ``*Array`` slices has been improved in order to solve
  some issues with shapes. Closes SF bug #1288792.

- The indexation properties were lost in case the table was closed
  before an index was created. Now, these properties are saved even in
  this case.

Known bugs:

- Classes inheriting from ``IsDescription`` subclasses do not inherit
  columns defined in the super-class. See SF bug #1207732 for more info.

- Time datatypes are non-portable between big-endian and little-endian
  architectures. This is ultimately a consequence of a HDF5 limitation.
  See SF bug #1234709 for more info.

Backward-incompatible changes:

- None (that we are aware of).

Important note for MacOSX users
===============================

The UCL compressor works badly on MacOSX platforms. Recent investigation
seems to point to a bug in the development tools in MacOSX. Until the
problem is isolated and eventually solved, UCL support will not be
compiled by default on MacOSX platforms, even if the installer finds it
in the system. However, if you still want to get UCL support on MacOSX,
you can use the ``--force-ucl`` flag in ``setup.py``.

Important note for Python 2.4 and Windows users
===============================================

If you are willing to use PyTables with Python 2.4 on Windows platforms,
you will need to get the HDF5 library compiled for MSVC 7.1, aka .NET
2003. It can be found at:

ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-164-win-net.ZIP

Users of Python 2.3 on Windows will have to download the version of HDF5
compiled with MSVC 6.0 available in:

ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-164-win.ZIP

What it is
==========

**PyTables** is a package for managing hierarchical datasets, designed
to cope efficiently with extremely large amounts of data (with support
for full 64-bit file addressing). It features an object-oriented
interface that, combined with C extensions for the performance-critical
parts of the code, makes it a very easy-to-use tool for high performance
data storage and retrieval.

PyTables runs on top of the HDF5 library and the numarray package
(Numeric is also supported) for achieving maximum throughput and
convenient use.

Besides, PyTables I/O for table objects is buffered, implemented in C
and carefully tuned so that you can reach much better performance with
PyTables than with your own home-grown wrappings to the HDF5 library.
PyTables sports indexing capabilities as well, allowing selections in
tables exceeding one billion rows in just seconds.

Platforms
=========

This version has been extensively checked on quite a few platforms, like
Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64
(Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC and MacOSX on
PowerPC. For other platforms, chances are that the code can be easily
compiled and run without further problems. Please contact us if you
experience problems.

Resources
=========

Go to the PyTables web site for more details:

http://pytables.sourceforge.net/

About the HDF5 library:

http://hdf.ncsa.uiuc.edu/HDF5/

About numarray:

http://www.stsci.edu/resources/software_hardware/numarray

To know more about the company behind the PyTables development, see:

http://www.carabos.com/

Acknowledgments
===============

Thanks to the various users who provided feature improvements, patches,
bug reports, support and suggestions. See the THANKS file in the
distribution package for an (incomplete) list of contributors. Many
thanks also to SourceForge, who have helped to make and distribute this
package! And last but not least, a big thanks to THG
(http://www.hdfgroup.org/) for sponsoring many of the new features
recently introduced in PyTables.

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

----

**Enjoy data!**

-- The PyTables Team
|
|
From: Nadav H. <Na...@Vi...> - 2005-09-14 11:13:35
|
It seems that the tostring method fails on rank 0 arrays:
>>> a = N.array(-4)
>>> a
array(-4)
>>> a.tostring()
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in -toplevel-
    a.tostring()
  File "/usr/local/lib/python2.4/site-packages/numarray/generic.py", line 746, in tostring
    self._strides, self._itemsize)
MemoryError
>>> N.__version__
'1.4.0'
Nadav.
|
|
From: Francesc A. <fa...@ca...> - 2005-09-12 13:01:49
|
On Fri, 09 Sep 2005 at 22:41 +0200, Joost van Evert wrote:
> On Fri, 2005-09-09 at 15:06 -0500, John Hunter wrote:
> Thanks, this helps me, but I think not enough, because the arrays I work
> on are sometimes >1 GB (correlation matrices). The tostring method would
> explode the size, and result in a lot of swapping. Ideally the
> compression also works with memory-mapped arrays.
[mode advertising on, be warned <wink>]
You may want to use pytables [1]. It supports on-line data compression
and gives access to data on disk in much the same way as memory-mapped
arrays do.
Example of use:
In [66]:f=tables.openFile("/tmp/test-zlib.h5","w")
In [67]:fzlib=tables.Filters(complevel=1, complib="zlib") # the filter
In [68]:chunk=tables.Float64Atom(shape=(50,50)) # the data 'chunk'
In [69]:carr=f.createCArray(f.root, "carr",(1000, 1000),chunk,'',fzlib)
In [70]:carr[:]=numarray.random_array.random((1000,1000))
In [71]:f.close()
In [72]:ls -l /tmp/test-zlib.h5
-rw-r--r-- 1 faltet users 3680721 2005-09-12 14:27 /tmp/test-zlib.h5
Now, you can access the data on disk as if it was in-memory:
In [73]:f=tables.openFile("/tmp/test-zlib.h5","r")
In [74]:f.root.carr[300,200]
Out[74]:0.76497000455856323
In [75]:f.root.carr[300:310:3,900:910:2]
Out[75]:
array([[ 0.5336495 , 0.55542123, 0.80049258, 0.84423071, 0.47674203],
[ 0.93104523, 0.71216697, 0.23955345, 0.89759707, 0.70620197],
[ 0.86999339, 0.05541291, 0.55156851, 0.96808773, 0.51768076],
[ 0.29315394, 0.03837755, 0.33675179, 0.93591529, 0.99721605]])
Also, access to disk is very fast, even if you compressed your data:
In [77]:tzlib=timeit.Timer("carr[300:310:3,900:910:2]","import tables;f=tables.openFile('/tmp/test-zlib.h5');carr=f.root.carr")
In [78]:tzlib.repeat(3,100)
Out[78]:[0.204339981079101, 0.176630973815917, 0.177133798599243]
Compare these times with non-compressed data:
In [80]:tnc=timeit.Timer("carr[300:310:3,900:910:2]","import tables;f=tables.openFile('/tmp/test-nocompr.h5');carr=f.root.carr")
In [81]:tnc.repeat(3,100)
Out[81]:[0.089105129241943, 0.084129095077514, 0.084383964538574219]
That means that pytables can access data in the middle of a dataset
without decompressing the whole dataset, only the chunks of interest
(and you can decide the size of these chunks). You can see that the
access times are in the range of milliseconds, regardless of whether
the data is compressed or not.
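A toy illustration of the chunking idea described above, using only the standard zlib module rather than PyTables or HDF5 themselves. The names compress_chunks, read_item and CHUNK are made up for this sketch, and real HDF5 chunks are multi-dimensional and much larger:

```python
import zlib

CHUNK = 4  # values per chunk (hypothetical; chosen small for readability)

def compress_chunks(values, chunk=CHUNK):
    """Compress each fixed-size run of byte values independently."""
    return [zlib.compress(bytes(values[i:i + chunk]))
            for i in range(0, len(values), chunk)]

def read_item(chunks, index, chunk=CHUNK):
    """Return one value, decompressing only the chunk that holds it."""
    block = zlib.decompress(chunks[index // chunk])
    return block[index % chunk]

data = list(range(20))          # 20 small integers, one byte each
store = compress_chunks(data)   # independently compressed chunks
print(read_item(store, 13))     # only the chunk holding index 13 is decompressed
```

Because every chunk is compressed on its own, the cost of a read depends on the chunk size and not on the size of the whole dataset.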
PyTables also supports compressors other than zlib, such as
bzip2 [2] and LZO [3], as well as compression pre-conditioners, such as
shuffle [4]. Look at the compression ratios for completely random data:
In [84]:ls -l /tmp/test*.h5
-rw-r--r-- 1 faltet users 3675874 /tmp/test-bzip2-shuffle.h5
-rw-r--r-- 1 faltet users 3680615 /tmp/test-zlib-shuffle.h5
-rw-r--r-- 1 faltet users 3777749 /tmp/test-lzo-shuffle.h5
-rw-r--r-- 1 faltet users 8025024 /tmp/test-nocompr.h5
LZO is especially interesting if you want fast access to your data (it
decompresses very quickly):
In [82]:tlzo=timeit.Timer("carr[300:310:3,900:910:2]","import tables;f=tables.openFile('/tmp/test-lzo-shuffle.h5');carr=f.root.carr")
In [83]:tlzo.repeat(3,100)
Out[83]:[0.12332820892333984, 0.11892890930175781, 0.12009191513061523]
So, retrieving compressed data using LZO is only some 40-45% slower than
retrieving uncompressed data. You can see more exhaustive benchmarks and
discussion in [5].
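The relative cost of each compressor can be checked from the timings quoted above. This is only a quick sketch: the helper names are mine, the numbers are copied from the timeit sessions in this message, and taking the best of three runs is my choice of summary.

```python
# Timings copied from the tzlib, tnc and tlzo sessions above
# (seconds per 100 reads of the same slice).
zlib_runs = [0.204339981079101, 0.176630973815917, 0.177133798599243]
nocompr_runs = [0.089105129241943, 0.084129095077514, 0.084383964538574219]
lzo_runs = [0.12332820892333984, 0.11892890930175781, 0.12009191513061523]

READS_PER_RUN = 100

def per_read_ms(runs):
    """Best-of-N time for a single read, in milliseconds."""
    return min(runs) / READS_PER_RUN * 1000.0

def slowdown_pct(runs, baseline):
    """How much slower the best run is than the best baseline run, in percent."""
    return (min(runs) / min(baseline) - 1.0) * 100.0

for name, runs in [("zlib", zlib_runs), ("lzo", lzo_runs), ("none", nocompr_runs)]:
    print("%-5s %.2f ms/read, %+.0f%% vs uncompressed"
          % (name, per_read_ms(runs), slowdown_pct(runs, nocompr_runs)))
```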
[1] http://www.pytables.org
[2] http://www.bzip2.org
[3] http://www.oberhumer.com/opensource/lzo
[4] http://hdf.ncsa.uiuc.edu/HDF5/doc_resource/H5Shuffle_Perf.pdf
[5] http://pytables.sourceforge.net/html-doc/usersguide6.html#section6.3
Uh, sorry for the blurb, but benchmarking is a lot of fun.
-- 
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
"-"
|