|
From: Alvaro T. C. <al...@mi...> - 2012-12-05 18:56:00
|
My system was benchmarked for reads and writes with Blosc[1]:

with pt.openFile(paths.braw(block), 'r') as handle:
    pt.setBloscMaxThreads(1)
    %timeit a = handle.root.raw.c042[:]
    pt.setBloscMaxThreads(6)
    %timeit a = handle.root.raw.c042[:]
    pt.setBloscMaxThreads(11)
    %timeit a = handle.root.raw.c042[:]
    print handle.root.raw._v_attrs.FILTERS
    print handle.root.raw.c042.__sizeof__()
    print handle.root.raw.c042
gives
1 loops, best of 3: 483 ms per loop
1 loops, best of 3: 782 ms per loop
1 loops, best of 3: 663 ms per loop
Filters(complevel=5, complib='blosc', shuffle=True, fletcher32=False)
104
/raw/c042 (CArray(303390000,), shuffle, blosc(5)) ''
For the life of me, I can't understand what is going on. These datasets
use int16 atoms and, at Blosc complevel=5, they compress by a factor of
about 2. Even at such a low compression ratio there should be a large
difference between single- and multi-threaded reads.
Do you have any clue?
-á.
[1] http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks (first two
plots)
|
|
From: Francesc A. <fa...@gm...> - 2012-12-06 11:49:26
|
On 12/5/12 7:55 PM, Alvaro Tejero Cantero wrote:
> My system was benched for reads and writes with Blosc[1]:
>
> with pt.openFile(paths.braw(block), 'r') as handle:
> pt.setBloscMaxThreads(1)
> %timeit a = handle.root.raw.c042[:]
> pt.setBloscMaxThreads(6)
> %timeit a = handle.root.raw.c042[:]
> pt.setBloscMaxThreads(11)
> %timeit a = handle.root.raw.c042[:]
> print handle.root.raw._v_attrs.FILTERS
> print handle.root.raw.c042.__sizeof__()
> print handle.root.raw.c042
>
> gives
>
> 1 loops, best of 3: 483 ms per loop
> 1 loops, best of 3: 782 ms per loop
> 1 loops, best of 3: 663 ms per loop
> Filters(complevel=5, complib='blosc', shuffle=True, fletcher32=False)
> 104
> /raw/c042 (CArray(303390000,), shuffle, blosc(5)) ''
>
> I can't understand what is going on, for the life of me. These
> datasets use int16 atoms and at Blosc complevel=5 used to compress by
> a factor of about 2. Even for such low compression ratios there should
> be huge differences between single- and multi-threaded reads.
>
> Do you have any clue?
Yeah, welcome to the wonderful art of fine-tuning. Fortunately we have
a machine that is pretty much identical to yours (hey, your computer was too
good in the Blosc benchmarks to ignore :), so I can reproduce your
issue:
In [3]: a = ((np.random.rand(3e8))*100).astype('i2')
In [4]: f = tb.openFile("test.h5", "w")
In [5]: act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape,
filters=tb.Filters(5, complib="blosc"))
In [6]: act[:] = a
In [7]: f.flush()
In [8]: ll test.h5
-rw-rw-r-- 1 faltet 301719914 Dec 6 04:55 test.h5
This random set of numbers is close to your array in size (~3e8
elements), and also has a similar compression factor (~2x). Now the
timings (using 6 cores by default):
In [9]: timeit act[:]
1 loops, best of 3: 441 ms per loop
In [11]: tb.setBloscMaxThreads(1)
Out[11]: 6
In [12]: timeit act[:]
1 loops, best of 3: 347 ms per loop
So yeah, that might seem a bit disappointing. It turns out that the
default chunksize in PyTables is tuned to balance between sequential and
random reads. If what you want is to optimize only for sequential reads
(apparently that is what you are after, right?), then it normally helps to
increase the chunksize. For example, after some quick trials, I determined
that a chunksize of 2 MB is pretty close to optimal for sequential access:
In [44]: f.removeNode(f.root.act)
In [45]: act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape,
filters=tb.Filters(5, complib="blosc"), chunkshape=(2**20,))
In [46]: act[:] = a
In [47]: tb.setBloscMaxThreads(1)
Out[47]: 6
In [48]: timeit act[:]
1 loops, best of 3: 334 ms per loop
In [49]: tb.setBloscMaxThreads(3)
Out[49]: 1
In [50]: timeit act[:]
1 loops, best of 3: 298 ms per loop
In [51]: tb.setBloscMaxThreads(6)
Out[51]: 3
In [52]: timeit act[:]
1 loops, best of 3: 303 ms per loop
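(A side note on the arithmetic, not from the original session: the
chunkshape is given in elements, so for int16 data the 2 MB chunk above
corresponds to 2**20 elements. A minimal sketch:)
import numpy as np
target_bytes = 2 * 2**20                  # ~2 MiB target chunk size
itemsize = np.dtype('int16').itemsize     # 2 bytes per Int16Atom element
chunkshape = (target_bytes // itemsize,)
print chunkshape                          # (1048576,), i.e. (2**20,) as used above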
Also, we see here that the sweet spot is using 3 threads, not more
(don't ask me why). However, that does not mean that Blosc cannot work
faster on this machine; in fact it can:
In [59]: import blosc
In [60]: sa = a.tostring()
In [61]: ac2 = blosc.compress(sa, 2, clevel=5)
In [62]: blosc.set_nthreads(6)
Out[62]: 6
In [64]: timeit a2 = blosc.decompress(ac2)
10 loops, best of 3: 80.7 ms per loop
In [65]: blosc.set_nthreads(1)
Out[65]: 6
In [66]: timeit a2 = blosc.decompress(ac2)
1 loops, best of 3: 249 ms per loop
So pure in-memory Blosc decompression can only go about 4x faster than
PyTables + Blosc, and in this case the latter is reaching an excellent mark
of ~2 GB/s, which is really good for a read-from-disk operation. Note that
a memcpy() on this machine takes just about as long:
In [36]: timeit a.copy()
1 loops, best of 3: 294 ms per loop
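(A back-of-the-envelope check of that ~2 GB/s figure, using the numbers
above; the arithmetic is mine, not part of the original session:)
nbytes = 300000000 * 2         # ~600 MB of uncompressed int16 data
seconds = 0.298                # best PyTables + Blosc read time above
print nbytes / seconds / 1e9   # -> ~2.0, i.e. ~2 GB/s effective read throughput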
While I'm at it, I'm curious about how other compressors would perform
in this scenario:
In [6]: act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape,
filters=tb.Filters(5, complib="lzo"), chunkshape=(2**20,))
In [7]: act[:] = a
In [8]: f.flush()
In [9]: ll test.h5 # compression ratio very close to Blosc
-rw-rw-r-- 1 faltet 302769510 Dec 6 05:23 test.h5
In [10]: timeit act[:]
1 loops, best of 3: 1.13 s per loop
So LZO is more than 3x slower than Blosc. It is a similar story with
zlib:
In [12]: f.close()
In [13]: f = tb.openFile("test.h5", "w")
In [14]: act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape,
filters=tb.Filters(1, complib="zlib"), chunkshape=(2**20,))
In [15]: act[:] = a
In [16]: f.flush()
In [17]: ll test.h5 # the compression rate is somewhat better
-rw-rw-r-- 1 faltet 254821296 Dec 6 05:26 test.h5
In [18]: timeit act[:]
1 loops, best of 3: 2.24 s per loop
which is 6x slower than Blosc (although the compression ratio is a bit
better).
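(For the record, the by-hand complib comparison above could be scripted
roughly like this. This is a sketch of my own, not what was actually run:
the file name, the single-pass timing and the reuse of the same complevels
as above, blosc/lzo at 5 and zlib at 1, are assumptions, and the read is
likely served from the page cache just as in the timings above.)
import os, time
import numpy as np
import tables as tb

a = ((np.random.rand(int(3e8))) * 100).astype('i2')
for complib, clevel in [("blosc", 5), ("lzo", 5), ("zlib", 1)]:
    f = tb.openFile("test.h5", "w")
    act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape,
                         filters=tb.Filters(clevel, complib=complib),
                         chunkshape=(2**20,))
    act[:] = a
    f.flush()
    t0 = time.time()
    act[:]                       # sequential read of the whole array
    print complib, os.path.getsize("test.h5"), round(time.time() - t0, 3)
    f.close()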
And just for completeness, let's see how fast carray (the package, not
the CArray object in PyTables) can go with a chunked array in memory:
In [19]: import carray as ca
In [20]: ac3 = ca.carray(a, chunklen=2**20, cparams=ca.cparams(5))
In [21]: ac3
Out[21]:
carray((300000000,), int16)
nbytes: 572.20 MB; cbytes: 289.56 MB; ratio: 1.98
cparams := cparams(clevel=5, shuffle=True)
[59 34 36 ..., 21 58 50]
In [22]: timeit ac3[:]
1 loops, best of 3: 254 ms per loop
In [23]: ca.set_nthreads(1)
Out[23]: 6
In [24]: timeit ac3[:]
1 loops, best of 3: 282 ms per loop
So, with 254 ms, it is only marginally faster than PyTables (~298 ms).
Now with a carray object on-disk:
In [27]: acd = ca.carray(a, chunklen=2**20, cparams=ca.cparams(5),
rootdir="test")
In [28]: acd
Out[28]:
carray((300000000,), int16)
nbytes: 572.20 MB; cbytes: 289.56 MB; ratio: 1.98
cparams := cparams(clevel=5, shuffle=True)
rootdir := 'test'
[59 34 36 ..., 21 58 50]
In [30]: ca.set_nthreads(6)
Out[30]: 1
In [31]: timeit acd[:]
1 loops, best of 3: 317 ms per loop
In [32]: ca.set_nthreads(1)
Out[32]: 6
In [33]: timeit acd[:]
1 loops, best of 3: 361 ms per loop
The times in this case are a bit larger than with PyTables (317 ms vs
298 ms), which says a lot about how efficiently I/O is implemented in the
HDF5/PyTables stack.
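(And for completeness, the on-disk carray written with rootdir="test" above
can be reopened later with ca.open(), which is the same top-level opener
used later in this thread; a small sketch, the slicing below is arbitrary:)
import carray as ca

acd2 = ca.open(rootdir="test", mode='r')   # reopen the persistent carray
print acd2.cparams, len(acd2)
b = acd2[:1000]                            # decompresses only what is sliced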
--
Francesc Alted
|
|
From: Alvaro T. C. <al...@mi...> - 2012-12-06 12:42:57
|
Thank you for the comprehensive round-up. I have some ideas and reports
below.
What about ctables? The documentation says that it is specifically
optimized for column access, which is what I need in this scenario
(sometimes sequential, sometimes random).
Unfortunately I could not get the rootdir parameter of ctable's __init__ to
work in carray 0.4, and pip-installing 0.5 or 0.5.1 leads to compilation
errors.
This is the ctables-to-disk error:
ct2 = ca.ctable((np.arange(30000000),), names=('range2',),
rootdir='/tmp/ctable2.ctable')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/tejero/Dropbox/O/nb/nonridge/<ipython-input-29-255842877a0b> in <module>()
----> 1 ct2 = ca.ctable((np.arange(30000000),), names=('range2',), rootdir='/tmp/ctable2.ctable')

/home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, cols, names, **kwargs)
    158             if column.dtype == np.void:
    159                 raise ValueError, "`cols` elements cannot be of type void"
--> 160             column = ca.carray(column, **kwargs)
    161         elif ratype:
    162             column = ca.carray(cols[name], **kwargs)

/home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:3917)()

TypeError: __cinit__() got an unexpected keyword argument 'rootdir'
And this is an excerpt from the pip output when trying to upgrade carray:
gcc: carray/carrayExtension.c
gcc: error: carray/carrayExtension.c: No such file or directory
Two more notes:
* a way was added to check on-disk (compressed) vs in-memory (uncompressed)
node sizes. I was unable to work out how to use it from either the 2.4.0
release notes or the GitHub issue
https://github.com/PyTables/PyTables/issues/141#issuecomment-5018763
* is it, or will it be, possible to load PyTables CArrays as in-memory
carrays without decompression?
Best,
Álvaro
|
|
From: Alvaro T. C. <al...@mi...> - 2012-12-06 18:30:02
|
I'll answer myself on the size-checking: the right attributes are
Leaf.size_in_memory and Leaf.size_on_disk (per
http://pytables.github.com/usersguide/libref/hierarchy_classes.html).

-á.
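(For reference, a minimal sketch of reading those attributes; it assumes a
PyTables version that provides both, and the file path and node name are
placeholders following the layout from the first message in this thread:)
import tables as pt

with pt.openFile('data.h5', 'r') as handle:     # placeholder file name
    node = handle.root.raw.c042                 # placeholder node
    print node.size_on_disk, node.size_in_memory
    print float(node.size_in_memory) / node.size_on_disk   # compression ratio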
|
From: Francesc A. <fa...@gm...> - 2012-12-07 12:47:12
|
On 12/6/12 1:42 PM, Alvaro Tejero Cantero wrote:
> Thank you for the comprehensive round-up. I have some ideas and
> reports below.
>
> What about ctables? The documentation says that it is specifically
> optimized for column access, which is what I need in this scenario
> (sometimes sequential, sometimes random).
Yes, ctables is optimized for column access.
>
> Unfortunately I could not get the rootdir parameter for ctables
> __init__ to work in carray 0.4 and pip-installing 0.5 or 0.5.1 leads
> to compilation errors.
Yep, persistence for carray/ctables objects was added in 0.5.
>
> This is the ctables-to-disk error:
>
> ct2 = ca.ctable((np.arange(30000000),), names=('range2',),
> rootdir='/tmp/ctable2.ctable')
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> /home/tejero/Dropbox/O/nb/nonridge/<ipython-input-29-255842877a0b> in <module>()
> ----> 1 ct2 = ca.ctable((np.arange(30000000),), names=('range2',), rootdir='/tmp/ctable2.ctable')
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, cols, names, **kwargs)
>     158             if column.dtype == np.void:
>     159                 raise ValueError, "`cols` elements cannot be of type void"
> --> 160             column = ca.carray(column, **kwargs)
>     161         elif ratype:
>     162             column = ca.carray(cols[name], **kwargs)
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:3917)()
>
> TypeError: __cinit__() got an unexpected keyword argument 'rootdir'
>
>
> And this is cut from the pip output when trying to upgrade carray.
>
> gcc: carray/carrayExtension.c
>
> gcc: error: carray/carrayExtension.c: No such file or directory
Hmm, that's strange, because the carrayExtension should have been
cythonized automatically. Here is part of my install process with pip:
Running setup.py install for carray
* Found Cython 0.17.1 package installed.
* Found numpy 1.7.0b2 package installed.
* Found numexpr 2.0.1 package installed.
cythoning carray/carrayExtension.pyx to carray/carrayExtension.c
building 'carray.carrayExtension' extension
C compiler: gcc -fno-strict-aliasing
-I/Users/faltet/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3
-Wall -Wstrict-prototypes
Hmm, perhaps you need a newer version of Cython?
>
>
> Two more notes:
>
> * a way was added to check in-disk (compressed) vs in-memory
> (uncompressed) node sizes. I was unable to find the way to use it
> either from the 2.4.0 release notes or from the git issue
> https://github.com/PyTables/PyTables/issues/141#issuecomment-5018763
You already found the answer.
>
> * is/will it be possible to load PyTables carrays as in-memory carrays
> without decompression?
Actually, that has been my idea from the very beginning. The concept of
'flavor' for the returned objects when reading is already there, so it
should be relatively easy to add a new 'carray' flavor. Maybe you can
contribute this?
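(For context, this is roughly how the existing flavor machinery looks from
user code; the file and node names are placeholders reusing the example
above, and the 'carray' flavor itself is hypothetical and does not exist:)
import tables as tb

f = tb.openFile("test.h5", "a")
node = f.root.act
print node.flavor          # 'numpy' by default: reads return NumPy arrays
node.flavor = 'python'     # reads now return plain Python lists/scalars
# a hypothetical node.flavor = 'carray' could return compressed carrays
f.close()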
--
Francesc Alted
|
|
From: Alvaro T. C. <al...@mi...> - 2012-12-07 16:30:31
|
I now have similar dependencies to yours, except for NumPy 1.7 beta 2.
I wish I could help with the carray flavor.
--
Running setup.py install for carray
* Found Cython 0.17.2 package installed.
* Found numpy 1.6.2 package installed.
* Found numexpr 2.0.1 package installed.
building 'carray.carrayExtension' extension
C compiler: gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
--param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv
-DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
-D_GNU_SOURCE -fPIC -fwrapv -fPIC
compile options: '-Iblosc
-I/home/tejero/Local/Envs/test/lib/python2.7/site-packages/numpy/core/include
-I/usr/include/python2.7 -c'
extra options: '-msse2'
gcc: blosc/blosclz.c
gcc: carray/carrayExtension.c
gcc: error: carray/carrayExtension.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.
gcc: error: carray/carrayExtension.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.
error: Command "gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
--param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv
-DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
-D_GNU_SOURCE -fPIC -fwrapv -fPIC -Iblosc
-I/home/tejero/Local/Envs/test/lib/python2.7/site-packages/numpy/core/include
-I/usr/include/python2.7 -c carray/carrayExtension.c -o
build/temp.linux-x86_64-2.7/carray/carrayExtension.o -msse2" failed with
exit status 4
-á.
|
|
From: Francesc A. <fa...@gm...> - 2012-12-07 17:04:25
|
Hmm, perhaps cythonizing by hand is your best bet:
$ cython carray/carrayExtension.pyx
If you continue having problems, please write to the carray mailing list.
Francesc
--
Francesc Alted
|
|
From: Alvaro T. C. <al...@mi...> - 2012-12-07 19:22:56
|
Thanks Francesc, that solved it. Being able to load the on-disk data
structures into memory still compressed can make a huge difference when you
have daily 50 GB+ datasets to process!
The carray Google group (which I had not noticed before) seems unreachable
at the moment, so I am reporting a problem here for now. With the following
code
ct0 = ca.ctable((h5f.root.c_000[:],), names=('c_000',),
                rootdir=u'/lfpd1/tmp/ctable-1', mode='w',
                cparams=ca.cparams(5), dtype='u2',
                expectedlen=len(h5f.root.c_000))

for k in h5f.root._v_children.keys()[:3]:  # just some of the HDF5 datasets
    try:
        col = getattr(h5f.root, k)
        ct0.addcol(col[:], name=k, expectedlen=len(col), dtype='u2')
    except ValueError:
        pass  # already exists
ct0.flush()
>>> ct0
ctable((303390000,), [('c_000', '<u2'), ('c_007', '<u2'), ('c_006',
'<u2'), ('c_005', '<u2')])
nbytes: 2.26 GB; cbytes: 1.30 GB; ratio: 1.73
cparams := cparams(clevel=5, shuffle=True)
rootdir := '/lfpd1/tmp/ctable-1'
[(312, 37, 65432, 91) (313, 32, 65439, 65) (320, 24, 65433, 66) ...,
(283, 597, 677, 647) (276, 600, 649, 635) (298, 607, 635, 620)]
The newly-added datasets/columns exist in memory
>>> ct0['c_007']
carray((303390000,), uint16)
nbytes: 578.67 MB; cbytes: 333.50 MB; ratio: 1.74
cparams := cparams(clevel=5, shuffle=True)
[ 37 32 24 ..., 597 600 607]
but they do not appear in the rootdir, not even after .flush()
/lfpd1/tmp/ctable-1]$ ls
__attrs__ c_000 __rootdirs__
and something seems amiss with __rootdirs__:
/lfpd1/tmp/ctable-1]$ cat __rootdirs__
{"dirs": {"c_007": null, "c_006": null, "c_005": null, "c_000":
"/lfpd1/tmp/ctable-1/c_000"}, "names": ["c_000", "c_007", "c_006", "c_005"]}
>>> ct0.cbytes//1024**2
1334
vs
/lfpd1/tmp]$ du -h ctable-1
12K ctable-1/c_000/meta
340M ctable-1/c_000/data
340M ctable-1/c_000
340M ctable-1
and, finally, trying to open it fails:
ct0_disk = ca.open(rootdir='/lfpd1/tmp/ctable-1', mode='r')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/tejero/Dropbox/O/nb/nonridge/<ipython-input-26-41e1cb01ffe6> in <module>()
----> 1 ct0_disk = ca.open(rootdir='/lfpd1/tmp/ctable-1', mode='r')

/home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/toplevel.pyc in open(rootdir, mode)
    104     # Not a carray.  Now with a ctable
    105     try:
--> 106         obj = ca.ctable(rootdir=rootdir, mode=mode)
    107     except IOError:
    108         # Not a ctable

/home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, columns, names, **kwargs)
    193             _new = True
    194         else:
--> 195             self.open_ctable()
    196             _new = False
    197

/home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in open_ctable(self)
    282
    283         # Open the ctable by reading the metadata
--> 284         self.cols.read_meta_and_open()
    285
    286         # Get the length out of the first column

/home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in read_meta_and_open(self)
     40         # Initialize the cols by instatiating the carrays
     41         for name, dir_ in data['dirs'].items():
---> 42             self._cols[str(name)] = ca.carray(rootdir=dir_, mode=self.mode)
     43
     44     def update_meta(self):

/home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:8637)()

ValueError: You need at least to pass an array or/and a rootdir
-á.
|
|
From: Francesc A. <fa...@gm...> - 2012-12-07 19:37:07
|
Please stop reporting carray problems here. Let's communicate
privately if you want.
Thanks,
Francesc
--
Francesc Alted
|