From: Alvaro T. C. <al...@mi...> - 2012-12-05 18:56:00
|
My system was benched for reads and writes with Blosc[1]:

    with pt.openFile(paths.braw(block), 'r') as handle:
        pt.setBloscMaxThreads(1)
        %timeit a = handle.root.raw.c042[:]
        pt.setBloscMaxThreads(6)
        %timeit a = handle.root.raw.c042[:]
        pt.setBloscMaxThreads(11)
        %timeit a = handle.root.raw.c042[:]
        print handle.root.raw._v_attrs.FILTERS
        print handle.root.raw.c042.__sizeof__()
        print handle.root.raw.c042

gives

    1 loops, best of 3: 483 ms per loop
    1 loops, best of 3: 782 ms per loop
    1 loops, best of 3: 663 ms per loop
    Filters(complevel=5, complib='blosc', shuffle=True, fletcher32=False)
    104
    /raw/c042 (CArray(303390000,), shuffle, blosc(5)) ''

I can't understand what is going on, for the life of me. These datasets use int16 atoms and at Blosc complevel=5 used to compress by a factor of about 2. Even for such low compression ratios there should be huge differences between single- and multi-threaded reads.

Do you have any clue?

-á.

[1] http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks (first two plots)
From: Francesc A. <fa...@gm...> - 2012-12-06 11:49:26
|
On 12/5/12 7:55 PM, Alvaro Tejero Cantero wrote:
> I can't understand what is going on, for the life of me. These
> datasets use int16 atoms and at Blosc complevel=5 used to compress by
> a factor of about 2. Even for such low compression ratios there should
> be huge differences between single- and multi-threaded reads.
>
> Do you have any clue?

Yeah, welcome to the wonderful art of fine tuning. Fortunately we have a machine which is pretty identical to yours (hey, your computer was too good in Blosc benchmarks so as to ignore it :), so I can reproduce your issue:

    In [3]: a = ((np.random.rand(3e8))*100).astype('i2')

    In [4]: f = tb.openFile("test.h5", "w")

    In [5]: act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape, filters=tb.Filters(5, complib="blosc"))

    In [6]: act[:] = a

    In [7]: f.flush()

    In [8]: ll test.h5
    -rw-rw-r-- 1 faltet 301719914 Dec 6 04:55 test.h5

This random set of numbers is close to your array in size (~3e8 elements), and also has a similar compression factor (~2x). Now the timings (using 6 cores by default):

    In [9]: timeit act[:]
    1 loops, best of 3: 441 ms per loop

    In [11]: tb.setBloscMaxThreads(1)
    Out[11]: 6

    In [12]: timeit act[:]
    1 loops, best of 3: 347 ms per loop

So yeah, that might seem a bit disappointing.
It turns out that the default chunksize for PyTables is tuned so as to balance between sequential and random reads. If what you want is to optimize only for sequential reads (apparently this is what you are after, right?), then it normally helps to increase the chunksize. For example, by doing some quick trials, I determined that a chunksize of 2 MB is pretty optimal for sequential access:

    In [44]: f.removeNode(f.root.act)

    In [45]: act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape, filters=tb.Filters(5, complib="blosc"), chunkshape=(2**20,))

    In [46]: act[:] = a

    In [47]: tb.setBloscMaxThreads(1)
    Out[47]: 6

    In [48]: timeit act[:]
    1 loops, best of 3: 334 ms per loop

    In [49]: tb.setBloscMaxThreads(3)
    Out[49]: 1

    In [50]: timeit act[:]
    1 loops, best of 3: 298 ms per loop

    In [51]: tb.setBloscMaxThreads(6)
    Out[51]: 3

    In [52]: timeit act[:]
    1 loops, best of 3: 303 ms per loop

Also, we see here that the sweet spot is using 3 threads, not more (don't ask why). However, that does not mean that Blosc is not able to work faster on this machine, and in fact it does:

    In [59]: import blosc

    In [60]: sa = a.tostring()

    In [61]: ac2 = blosc.compress(sa, 2, clevel=5)

    In [62]: blosc.set_nthreads(6)
    Out[62]: 6

    In [64]: timeit a2 = blosc.decompress(ac2)
    10 loops, best of 3: 80.7 ms per loop

    In [65]: blosc.set_nthreads(1)
    Out[65]: 6

    In [66]: timeit a2 = blosc.decompress(ac2)
    1 loops, best of 3: 249 ms per loop

So that means that a pure in-memory Blosc decompression can only go about 4x faster than PyTables + Blosc, and in this case the latter is reaching an excellent mark of 2 GB/s, which is really good for a read-from-disk operation.
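The 2 MB chunk size, the 2 GB/s mark and the roughly 4x figure can be rechecked from the numbers quoted in this message. The following is only a back-of-the-envelope sketch: the helper name `throughput_gb_s` is made up for illustration, and the array is assumed to hold exactly 3e8 int16 elements, as in the test array.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions: 3e8 int16 elements (2 bytes each), timings copied from
# the IPython session; throughput_gb_s is a hypothetical helper name.

ITEMSIZE = 2                # bytes per int16 element
N_ELEMENTS = 300000000      # size of the random test array


def throughput_gb_s(n_bytes, seconds):
    """Uncompressed bytes delivered per second, in GB/s (1 GB = 1e9 bytes)."""
    return n_bytes / seconds / 1e9


n_bytes = N_ELEMENTS * ITEMSIZE      # uncompressed size of the array
chunk_bytes = 2**20 * ITEMSIZE       # chunkshape=(2**20,) of int16

pytables_gb_s = throughput_gb_s(n_bytes, 0.298)   # PyTables + Blosc, 3 threads
blosc_gb_s = throughput_gb_s(n_bytes, 0.0807)     # pure Blosc, 6 threads
speedup = 0.298 / 0.0807                          # pure Blosc vs PyTables + Blosc

print("chunk size: %d bytes" % chunk_bytes)
print("PyTables + Blosc: %.1f GB/s" % pytables_gb_s)
print("pure Blosc:       %.1f GB/s" % blosc_gb_s)
print("speedup:          %.1fx" % speedup)
```

Under these assumptions the chunk works out to 2 MiB and the PyTables read to about 2 GB/s, in line with the figures in the message.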
Note how a memcpy() operation in this machine is just about as good as this:

    In [36]: timeit a.copy()
    1 loops, best of 3: 294 ms per loop

Now that I'm on this, I'm curious about how other compressors would perform in this scenario:

    In [6]: act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape, filters=tb.Filters(5, complib="lzo"), chunkshape=(2**20,))

    In [7]: act[:] = a

    In [8]: f.flush()

    In [9]: ll test.h5   # compression ratio very close to Blosc
    -rw-rw-r-- 1 faltet 302769510 Dec 6 05:23 test.h5

    In [10]: timeit act[:]
    1 loops, best of 3: 1.13 s per loop

so LZO is more than 3x slower than Blosc. And a similar thing with zlib:

    In [12]: f.close()

    In [13]: f = tb.openFile("test.h5", "w")

    In [14]: act = f.createCArray(f.root, 'act', tb.Int16Atom(), a.shape, filters=tb.Filters(1, complib="zlib"), chunkshape=(2**20,))

    In [15]: act[:] = a

    In [16]: f.flush()

    In [17]: ll test.h5   # the compression rate is somewhat better
    -rw-rw-r-- 1 faltet 254821296 Dec 6 05:26 test.h5

    In [18]: timeit act[:]
    1 loops, best of 3: 2.24 s per loop

which is more than 6x slower than Blosc (although the compression ratio is a bit better).

And just for the sake of completeness, let's see how fast carray (the package, not the CArray object in PyTables) can perform for a chunked array in-memory:

    In [19]: import carray as ca

    In [20]: ac3 = ca.carray(a, chunklen=2**20, cparams=ca.cparams(5))

    In [21]: ac3
    Out[21]:
    carray((300000000,), int16)
      nbytes: 572.20 MB; cbytes: 289.56 MB; ratio: 1.98
      cparams := cparams(clevel=5, shuffle=True)
    [59 34 36 ..., 21 58 50]

    In [22]: timeit ac3[:]
    1 loops, best of 3: 254 ms per loop

    In [23]: ca.set_nthreads(1)
    Out[23]: 6

    In [24]: timeit ac3[:]
    1 loops, best of 3: 282 ms per loop

So, with 254 ms, it is only marginally faster than PyTables (~298 ms).
Now with a carray object on-disk:

    In [27]: acd = ca.carray(a, chunklen=2**20, cparams=ca.cparams(5), rootdir="test")

    In [28]: acd
    Out[28]:
    carray((300000000,), int16)
      nbytes: 572.20 MB; cbytes: 289.56 MB; ratio: 1.98
      cparams := cparams(clevel=5, shuffle=True)
      rootdir := 'test'
    [59 34 36 ..., 21 58 50]

    In [30]: ca.set_nthreads(6)
    Out[30]: 1

    In [31]: timeit acd[:]
    1 loops, best of 3: 317 ms per loop

    In [32]: ca.set_nthreads(1)
    Out[32]: 6

    In [33]: timeit acd[:]
    1 loops, best of 3: 361 ms per loop

The times in this case are a bit larger than with PyTables (317 ms vs 298 ms), which says a lot about how efficiently I/O is implemented in the HDF5/PyTables stack.

-- 
Francesc Alted
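For a side-by-side view, the full-array read timings reported in this message can be expressed relative to the best PyTables + Blosc result (298 ms, 2 MB chunks, 3 threads). This is just a sketch over the numbers copied from the session; the labels and the choice of baseline are an editorial assumption, not something the benchmark itself defines.

```python
# Relative read times versus PyTables + Blosc (298 ms), using only the
# timings reported in the session above. Labels are illustrative.

BASELINE_MS = 298.0  # PyTables + Blosc, chunkshape=(2**20,), 3 threads

timings_ms = {
    "PyTables + LZO": 1130.0,
    "PyTables + zlib (level 1)": 2240.0,
    "carray in memory (6 threads)": 254.0,
    "carray on disk (6 threads)": 317.0,
}

# A value above 1.0 means slower than the baseline, below 1.0 faster.
relative = {name: ms / BASELINE_MS for name, ms in timings_ms.items()}

for name in sorted(relative, key=relative.get):
    print("%-30s %.2fx" % (name, relative[name]))
```

Read this way, LZO lands a bit under 4x and zlib around 7.5x the baseline time, while both carray variants stay within about 15% of it.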
From: Alvaro T. C. <al...@mi...> - 2012-12-06 12:42:57
|
Thank you for the comprehensive round-up. I have some ideas and reports below.

What about ctables? The documentation says that it is specifically column-access optimized, which is what I need in this scenario (sometimes sequential, sometimes random).

Unfortunately I could not get the rootdir parameter for ctables __init__ to work in carray 0.4, and pip-installing 0.5 or 0.5.1 leads to compilation errors.

This is the ctables-to-disk error:

    ct2 = ca.ctable((np.arange(30000000),), names=('range2',), rootdir='/tmp/ctable2.ctable')

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    /home/tejero/Dropbox/O/nb/nonridge/<ipython-input-29-255842877a0b> in <module>()
    ----> 1 ct2 = ca.ctable((np.arange(30000000),), names=('range2',), rootdir='/tmp/ctable2.ctable')

    /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, cols, names, **kwargs)
        158                 if column.dtype == np.void:
        159                     raise ValueError, "`cols` elements cannot be of type void"
    --> 160                 column = ca.carray(column, **kwargs)
        161             elif ratype:
        162                 column = ca.carray(cols[name], **kwargs)

    /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:3917)()

    TypeError: __cinit__() got an unexpected keyword argument 'rootdir'

And this is cut from the pip output when trying to upgrade carray:

    gcc: carray/carrayExtension.c
    gcc: error: carray/carrayExtension.c: No such file or directory

Two more notes:

* a way was added to check on-disk (compressed) vs in-memory (uncompressed) node sizes. I was unable to find out how to use it either from the 2.4.0 release notes or from the git issue https://github.com/PyTables/PyTables/issues/141#issuecomment-5018763

* is/will it be possible to load PyTables carrays as in-memory carrays without decompression?
Best,

Álvaro
From: Alvaro T. C. <al...@mi...> - 2012-12-06 18:30:02
|
I'll answer myself on the size-checking: the right attributes are Leaf.size_in_memory and Leaf.size_on_disk (per http://pytables.github.com/usersguide/libref/hierarchy_classes.html)

-á.
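Given those two attributes, a compression ratio is just their quotient. The helper below is a minimal sketch and not part of PyTables: `compression_ratio` is a hypothetical name, and it is exercised here with the uncompressed size of the 3e8-element int16 test array and the size of the `test.h5` file reported earlier in the thread (which also includes a little HDF5 overhead, so the result is only approximate).

```python
# Minimal sketch: compression ratio from in-memory and on-disk sizes.
# compression_ratio is a hypothetical helper, not a PyTables function.


def compression_ratio(size_in_memory, size_on_disk):
    """Return uncompressed bytes divided by compressed bytes."""
    if size_on_disk <= 0:
        raise ValueError("size_on_disk must be positive")
    return float(size_in_memory) / size_on_disk


# With an open PyTables leaf one would presumably call:
#     compression_ratio(leaf.size_in_memory, leaf.size_on_disk)
# Here the numbers come from the thread: 3e8 int16 values and the
# 301719914-byte test.h5 file written with Blosc at complevel=5.
ratio = compression_ratio(300000000 * 2, 301719914)
print("ratio: %.2f" % ratio)
```

The value comes out just under 2, consistent with the "~2x" compression factor mentioned for this data.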
From: Francesc A. <fa...@gm...> - 2012-12-07 12:47:12
|
On 12/6/12 1:42 PM, Alvaro Tejero Cantero wrote:
> Thank you for the comprehensive round-up. I have some ideas and
> reports below.
>
> What about ctables? The documentation says that it is specificly
> column-access optimized, which is what I need in this scenario
> (sometimes sequential, sometimes random).

Yes, ctables is optimized for column access.

> Unfortunately I could not get the rootdir parameter for ctables
> __init__ to work in carray 0.4 and pip-installing 0.5 or 0.5.1 leads
> to compilation errors.

Yep, persistence for carray/ctables objects was added in 0.5.

> This is the ctables-to-disk error:
>
> ct2 = ca.ctable((np.arange(30000000),), names=('range2',),
> rootdir='/tmp/ctable2.ctable')
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> /home/tejero/Dropbox/O/nb/nonridge/<ipython-input-29-255842877a0b> in <module>()
> ----> 1 ct2 = ca.ctable((np.arange(30000000),), names=('range2',), rootdir='/tmp/ctable2.ctable')
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, cols, names, **kwargs)
>     158                 if column.dtype == np.void:
>     159                     raise ValueError, "`cols` elements cannot be of type void"
> --> 160                 column = ca.carray(column, **kwargs)
>     161             elif ratype:
>     162                 column = ca.carray(cols[name], **kwargs)
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:3917)()
>
> TypeError: __cinit__() got an unexpected keyword argument 'rootdir'
>
> And this is cut from the pip output when trying to upgrade carray.
>
> gcc: carray/carrayExtension.c
>
> gcc: error: carray/carrayExtension.c: No such file or directory

Hmm, that's strange, because the carrayExtension should have been cythonized automatically. Here is part of my install process with pip:

    Running setup.py install for carray
        * Found Cython 0.17.1 package installed.
        * Found numpy 1.7.0b2 package installed.
        * Found numexpr 2.0.1 package installed.
        cythoning carray/carrayExtension.pyx to carray/carrayExtension.c
        building 'carray.carrayExtension' extension
        C compiler: gcc -fno-strict-aliasing -I/Users/faltet/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes

Hmm, perhaps you need a newer version of Cython?

> Two more notes:
>
> * a way was added to check in-disk (compressed) vs in-memory
> (uncompressed) node sizes. I was unable to find the way to use it
> either from the 2.4.0 release notes or from the git issue
> https://github.com/PyTables/PyTables/issues/141#issuecomment-5018763

You already found the answer.

> * is/will it be possible to load PyTables carrays as in-memory carrays
> without decompression?

Actually, that has been my idea from the very beginning. The concept of 'flavor' for the returned objects when reading is already there, so it should be relatively easy to add a new 'carray' flavor. Maybe you can contribute this?

-- 
Francesc Alted
From: Alvaro T. C. <al...@mi...> - 2012-12-07 16:30:31
|
I now have similar dependencies to yours, except for Numpy 1.7 beta 2.

I wish I could help with the carray flavor.

--

    Running setup.py install for carray
        * Found Cython 0.17.2 package installed.
        * Found numpy 1.6.2 package installed.
        * Found numexpr 2.0.1 package installed.
        building 'carray.carrayExtension' extension
        C compiler: gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC
        compile options: '-Iblosc -I/home/tejero/Local/Envs/test/lib/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c'
        extra options: '-msse2'
        gcc: blosc/blosclz.c
        gcc: carray/carrayExtension.c
        gcc: error: carray/carrayExtension.c: No such file or directory
        gcc: fatal error: no input files
        compilation terminated.
        gcc: error: carray/carrayExtension.c: No such file or directory
        gcc: fatal error: no input files
        compilation terminated.
        error: Command "gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -Iblosc -I/home/tejero/Local/Envs/test/lib/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c carray/carrayExtension.c -o build/temp.linux-x86_64-2.7/carray/carrayExtension.o -msse2" failed with exit status 4

-á.
From: Francesc A. <fa...@gm...> - 2012-12-07 17:04:25
|
Hmm, perhaps cythonizing by hand is your best bet:

    $ cython carray/carrayExtension.pyx

If you continue having problems, please write to the carray mailing list.

Francesc
From: Alvaro T. C. <al...@mi...> - 2012-12-07 19:22:56
|
Thanks Francesc, that solved it. Having the disk datastructures load compressed in memory can be a deal-breaker when you got daily 50Gb+ datasets to process! The carray google group (I had not noticed it) seems unreachable at the moment. That's why I am going to report a problem here for the moment. With the following code ct0 = ca.ctable((h5f.root.c_000[:],), names=('c_000',), rootdir= u'/lfpd1/tmp/ctable-1', mode='w', cparams=ca.cparams(5), dtype='u2', expectedlen=len(h5f.root.c_000)) for k in h5f.root._v_children.keys()[:3]: #just some of the HDF5 datasets try: col = getattr(h5f.root, k) ct0.addcol(col[:], name=k, expectedlen=len(col), dtype='u2') except ValueError: pass #exists ct0.flush() >>> ct0 ctable((303390000,), [('c_000', '<u2'), ('c_007', '<u2'), ('c_006', '<u2'), ('c_005', '<u2')]) nbytes: 2.26 GB; cbytes: 1.30 GB; ratio: 1.73 cparams := cparams(clevel=5, shuffle=True) rootdir := '/lfpd1/tmp/ctable-1' [(312, 37, 65432, 91) (313, 32, 65439, 65) (320, 24, 65433, 66) ..., (283, 597, 677, 647) (276, 600, 649, 635) (298, 607, 635, 620)] The newly-added datasets/columns exist in memory >>> ct0['c_007'] carray((303390000,), uint16) nbytes: 578.67 MB; cbytes: 333.50 MB; ratio: 1.74 cparams := cparams(clevel=5, shuffle=True) [ 37 32 24 ..., 597 600 607] but they do not appear in the rootdir, not even after .flush() /lfpd1/tmp/ctable-1]$ ls __attrs__ c_000 __rootdirs__ and something seems amiss with __rootdirs__: /lfpd1/tmp/ctable-1]$ cat __rootdirs__ {"dirs": {"c_007": null, "c_006": null, "c_005": null, "c_000": "/lfpd1/tmp/ctable-1/c_000"}, "names": ["c_000", "c_007", "c_006", "c_005"]} >>> ct0.cbytes//1024**2 1334 vs /lfpd1/tmp]$ du -h ctable-1 12K ctable-1/c_000/meta 340M ctable-1/c_000/data 340M ctable-1/c_000 340M ctable-1 and, finally, no 'open' ct0_disk = ca.open(rootdir='/lfpd1/tmp/ctable-1', mode='r') ---------------------------------------------------------------------------ValueError Traceback (most recent call 
last)/home/tejero/Dropbox/O/nb/nonridge/<ipython-input-26-41e1cb01ffe6> in <module>()----> 1 ct0_disk = ca.open(rootdir='/lfpd1/tmp/ctable-1', mode='r') /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/toplevel.pyc in open(rootdir, mode) 104 # Not a carray. Now with a ctable 105 try:--> 106 obj = ca.ctable(rootdir=rootdir, mode=mode) 107 except IOError: 108 # Not a ctable /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, columns, names, **kwargs) 193 _new = True 194 else:--> 195 self.open_ctable() 196 _new = False 197 /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in open_ctable(self) 282 283 # Open the ctable by reading the metadata--> 284 self.cols.read_meta_and_open() 285 286 # Get the length out of the first column /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in read_meta_and_open(self) 40 # Initialize the cols by instatiating the carrays 41 for name, dir_ in data['dirs'].items():---> 42 self._cols[str(name)] = ca.carray(rootdir=dir_, mode=self.mode) 43 44 def update_meta(self): /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:8637)() ValueError: You need at least to pass an array or/and a rootdir -á. On 7 December 2012 17:04, Francesc Alted <fa...@gm...> wrote: > Hmm, perhaps cythonizing by hand is your best bet: > > $ cython carray/carrayExtension.pyx > > If you continue having problems, please write to the carray mailing list. > > Francesc > > On 12/7/12 5:29 PM, Alvaro Tejero Cantero wrote: > > I have now similar dependencies as you, except for Numpy 1.7 beta 2. > > > > I wish I could help with the carray flavor. > > > > -- > > Running setup.py install for carray > > * Found Cython 0.17.2 package installed. > > * Found numpy 1.6.2 package installed. > > * Found numexpr 2.0.1 package installed. 
> > building 'carray.carrayExtension' extension
> > C compiler: gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall
> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
> > --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC
> > -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
> > -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
> > -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC
> > compile options: '-Iblosc
> > -I/home/tejero/Local/Envs/test/lib/python2.7/site-packages/numpy/core/include
> > -I/usr/include/python2.7 -c'
> > extra options: '-msse2'
> > gcc: blosc/blosclz.c
> > gcc: carray/carrayExtension.c
> > gcc: error: carray/carrayExtension.c: No such file or directory
> > gcc: fatal error: no input files
> > compilation terminated.
> > gcc: error: carray/carrayExtension.c: No such file or directory
> > gcc: fatal error: no input files
> > compilation terminated.
> > error: Command "gcc -pthread -fno-strict-aliasing -O2 -g -pipe
> > -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
> > --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC
> > -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
> > -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
> > -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -Iblosc
> > -I/home/tejero/Local/Envs/test/lib/python2.7/site-packages/numpy/core/include
> > -I/usr/include/python2.7 -c carray/carrayExtension.c -o
> > build/temp.linux-x86_64-2.7/carray/carrayExtension.o -msse2" failed
> > with exit status 4
> >
> > -á.
> >
> > On 7 December 2012 12:47, Francesc Alted <fa...@gm...
> > <mailto:fa...@gm...>> wrote:
> >
> >     On 12/6/12 1:42 PM, Alvaro Tejero Cantero wrote:
> >     > Thank you for the comprehensive round-up. I have some ideas and
> >     > reports below.
> >     >
> >     > What about ctables? The documentation says that it is specificly
> >     > column-access optimized, which is what I need in this scenario
> >     > (sometimes sequential, sometimes random).
> >
> >     Yes, ctables is optimized for column access.
> >
> >     > Unfortunately I could not get the rootdir parameter for ctables
> >     > __init__ to work in carray 0.4 and pip-installing 0.5 or 0.5.1 leads
> >     > to compilation errors.
> >
> >     Yep, persistence for carray/ctables objects was added in 0.5.
> >
> >     > This is the ctables-to-disk error:
> >     >
> >     > ct2 = ca.ctable((np.arange(30000000),), names=('range2',),
> >     > rootdir='/tmp/ctable2.ctable')
> >     >
> >     > ---------------------------------------------------------------------------
> >     > TypeError Traceback (most recent call last)
> >     > /home/tejero/Dropbox/O/nb/nonridge/<ipython-input-29-255842877a0b> in <module>()
> >     > ----> 1 ct2 = ca.ctable((np.arange(30000000),), names=('range2',), rootdir='/tmp/ctable2.ctable')
> >     >
> >     > /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, cols, names, **kwargs)
> >     >     158             if column.dtype == np.void:
> >     >     159                 raise ValueError, "`cols` elements cannot be of type void"
> >     > --> 160             column = ca.carray(column, **kwargs)
> >     >     161         elif ratype:
> >     >     162             column = ca.carray(cols[name], **kwargs)
> >     >
> >     > /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:3917)()
> >     >
> >     > TypeError: __cinit__() got an unexpected keyword argument 'rootdir'
> >     >
> >     > And this is cut from the pip output when trying to upgrade carray.
> >     >
> >     > gcc: carray/carrayExtension.c
> >     >
> >     > gcc: error: carray/carrayExtension.c: No such file or directory
> >
> >     Hmm, that's strange, because the carrayExtension should have been
> >     cythonized automatically. Here it is part of my install process
> >     with pip:
> >
> >     Running setup.py install for carray
> >     * Found Cython 0.17.1 package installed.
> >     * Found numpy 1.7.0b2 package installed.
> >     * Found numexpr 2.0.1 package installed.
> >     cythoning carray/carrayExtension.pyx to carray/carrayExtension.c
> >     building 'carray.carrayExtension' extension
> >     C compiler: gcc -fno-strict-aliasing
> >     -I/Users/faltet/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3
> >     -Wall -Wstrict-prototypes
> >
> >     Hmm, perhaps you need a newer version of Cython?
> >
> >     > Two more notes:
> >     >
> >     > * a way was added to check in-disk (compressed) vs in-memory
> >     > (uncompressed) node sizes. I was unable to find the way to use it
> >     > either from the 2.4.0 release notes or from the git issue
> >     > https://github.com/PyTables/PyTables/issues/141#issuecomment-5018763
> >
> >     You already found the answer.
> >
> >     > * is/will it be possible to load PyTables carrays as in-memory carrays
> >     > without decompression?
> >
> >     Actually, that has been my idea from the very beginning. The concept of
> >     'flavor' for the returned objects when reading is already there, so it
> >     should be relatively easy to add a new 'carray' flavor. Maybe you can
> >     contribute this?
> >
> >     --
> >     Francesc Alted
|
From: Francesc A. <fa...@gm...> - 2012-12-07 19:37:07
|
Please, stop reporting carray problems here. Let's communicate privately if you want.

Thanks,
Francesc

On 12/7/12 8:22 PM, Alvaro Tejero Cantero wrote:
> Thanks Francesc, that solved it. Having the disk datastructures load
> compressed in memory can be a deal-breaker when you got daily 50Gb+
> datasets to process!
>
> The carray google group (I had not noticed it) seems unreachable at
> the moment. That's why I am going to report a problem here for the
> moment. With the following code
>
> ct0 = ca.ctable((h5f.root.c_000[:],), names=('c_000',),
> rootdir=u'/lfpd1/tmp/ctable-1', mode='w', cparams=ca.cparams(5),
> dtype='u2', expectedlen=len(h5f.root.c_000))
>
> for k in h5f.root._v_children.keys()[:3]: #just some of the HDF5 datasets
>     try:
>         col = getattr(h5f.root, k)
>         ct0.addcol(col[:], name=k, expectedlen=len(col), dtype='u2')
>     except ValueError:
>         pass #exists
> ct0.flush()
>
> >>> ct0
> ctable((303390000,), [('c_000', '<u2'), ('c_007', '<u2'), ('c_006', '<u2'), ('c_005', '<u2')])
> nbytes: 2.26 GB; cbytes: 1.30 GB; ratio: 1.73
> cparams := cparams(clevel=5, shuffle=True)
> rootdir := '/lfpd1/tmp/ctable-1'
> [(312, 37, 65432, 91) (313, 32, 65439, 65) (320, 24, 65433, 66) ...,
> (283, 597, 677, 647) (276, 600, 649, 635) (298, 607, 635, 620)]
>
> The newly-added datasets/columns exist in memory
>
> >>> ct0['c_007']
> carray((303390000,), uint16)
> nbytes: 578.67 MB; cbytes: 333.50 MB; ratio: 1.74
> cparams := cparams(clevel=5, shuffle=True)
> [ 37 32 24 ..., 597 600 607]
>
> but they do not appear in the rootdir, not even after .flush()
>
> /lfpd1/tmp/ctable-1]$ ls
> __attrs__ c_000 __rootdirs__
>
> and something seems amiss with __rootdirs__:
> /lfpd1/tmp/ctable-1]$ cat __rootdirs__
> {"dirs": {"c_007": null, "c_006": null, "c_005": null, "c_000":
> "/lfpd1/tmp/ctable-1/c_000"}, "names": ["c_000", "c_007", "c_006",
> "c_005"]}
>
> >>> ct0.cbytes//1024**2
> 1334
>
> vs
> /lfpd1/tmp]$ du -h ctable-1
> 12K ctable-1/c_000/meta
> 340M ctable-1/c_000/data
> 340M ctable-1/c_000
> 340M ctable-1
>
> and, finally, no 'open'
>
> ct0_disk = ca.open(rootdir='/lfpd1/tmp/ctable-1', mode='r')
>
> ---------------------------------------------------------------------------
> ValueError Traceback (most recent call last)
> /home/tejero/Dropbox/O/nb/nonridge/<ipython-input-26-41e1cb01ffe6> in <module>()
> ----> 1 ct0_disk = ca.open(rootdir='/lfpd1/tmp/ctable-1', mode='r')
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/toplevel.pyc in open(rootdir, mode)
>     104     # Not a carray. Now with a ctable
>     105     try:
> --> 106         obj = ca.ctable(rootdir=rootdir, mode=mode)
>     107     except IOError:
>     108         # Not a ctable
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, columns, names, **kwargs)
>     193             _new = True
>     194         else:
> --> 195             self.open_ctable()
>     196             _new = False
>     197
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in open_ctable(self)
>     282
>     283         # Open the ctable by reading the metadata
> --> 284         self.cols.read_meta_and_open()
>     285
>     286         # Get the length out of the first column
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in read_meta_and_open(self)
>      40         # Initialize the cols by instatiating the carrays
>      41         for name, dir_ in data['dirs'].items():
> ---> 42             self._cols[str(name)] = ca.carray(rootdir=dir_, mode=self.mode)
>      43
>      44     def update_meta(self):
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:8637)()
>
> ValueError: You need at least to pass an array or/and a rootdir
>
> -á.
>
> On 7 December 2012 17:04, Francesc Alted <fa...@gm...
> <mailto:fa...@gm...>> wrote:
>
>     Hmm, perhaps cythonizing by hand is your best bet:
>
>     $ cython carray/carrayExtension.pyx
>
>     If you continue having problems, please write to the carray
>     mailing list.
>
>     Francesc

--
Francesc Alted
|
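The `__rootdirs__` listing and the final traceback quoted above seem to fit together: `read_meta_and_open` passes each entry of `data['dirs']` to `ca.carray(rootdir=dir_, ...)`, so a `null` entry would arrive as `rootdir=None`, which matches the "You need at least to pass an array or/and a rootdir" error. Below is a minimal sketch for spotting such entries before calling `ca.open`. It assumes that `__rootdirs__` is plain JSON, as the `cat` output suggests; `missing_column_dirs` is a hypothetical helper written for illustration and is not part of carray.

```python
import json


def missing_column_dirs(meta_text):
    """Return the column names whose directory is null in ctable metadata.

    `meta_text` is the raw text of a `__rootdirs__` file (assumed to be JSON).
    """
    meta = json.loads(meta_text)
    dirs = meta.get("dirs", {})
    # Follow the column order recorded under "names" when it is present.
    names = meta.get("names", list(dirs))
    return [name for name in names if dirs.get(name) is None]


# Metadata copied from the `cat __rootdirs__` output quoted in the thread.
ROOTDIRS = (
    '{"dirs": {"c_007": null, "c_006": null, "c_005": null, '
    '"c_000": "/lfpd1/tmp/ctable-1/c_000"}, '
    '"names": ["c_000", "c_007", "c_006", "c_005"]}'
)

if __name__ == "__main__":
    # Columns listed here were added in memory but never got a directory.
    print(missing_column_dirs(ROOTDIRS))
```

On a real table one would read the text with `open(os.path.join(rootdir, '__rootdirs__')).read()` instead of the literal. The same picture shows up in the sizes: `ct0.cbytes` reports about 1334 MB for four columns, while `du` sees only the 340M of `c_000`.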