From: Francesc A. <fa...@ca...> - 2007-08-24 12:36:06
Hi Elias,

On Thursday 23 August 2007, you wrote:
> Francesc,
>
> Here's my setup:
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> PyTables version:  1.3
> HDF5 version:      1.6.5
> numarray version:  1.5.1
> Zlib version:      1.2.1
> BZIP2 version:     1.0.2 (30-Dec-2001)
> Python version:    2.4.3 (#1, Apr 21 2006, 14:31:08)
>                    [GCC 3.3.3 (SuSE Linux)]
> Platform:          linux2-x86_64
> Byte-ordering:     little
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>
> I recently switched from 'h5import' to PyTables to convert the output
> from large finite element models into HDF5 format. I like the PyTables
> approach because it gives me more control than the shell scripts I
> cobbled together to use 'h5import'.
>
> However, the most recent file takes much longer to search. Here are
> the results of a simple test I ran with the old and new databases:
>
> 'New':
> $ python test_finder.py
> Found 3 results for your search
> CQUAD4 1121910
> fh.find('1121910') took 2.37 sec
> Found 3 results for your search
> fh.find('1121910', gpf=True) took 9.44 sec
>
> 'Old':
> $ python test_finder.py
> Found 3 results for your search
> CQUAD4 1121910
> fh.find('1121910') took 0.664 sec
> Found 3 results for your search
> fh.find('1121910', gpf=True) took 0.638 sec
>
> The only difference I could detect between the two files is that the
> PyTables-created version uses the 'shuffle' parameter. Here is some
> ptdump output for some nodes:
>
> 'New':
> $ ptdump -v xxx_lev_1_1.h5:/results/oef1/quad4
> /results/oef1/quad4 (EArray(1022L, 17759L, 3L), shuffle, zlib(6)) ''
>   atom = Atom(dtype='Float32', shape=(0, 17759L, 3L), flavor='numarray')
>   nrows = 1022
>   extdim = 0
>   flavor = 'numarray'
>   byteorder = 'little'
                ^^^^^^^^  <- Notice this
>
> 'Old':
> $ ptdump -v xxx_lev_0.h5:/results/oef1/quad4
> /cluster/stress/methods/local/lib/python2.4/site-packages/tables/File.py:227:
> UserWarning: file ``xxx_lev_0.h5`` exists and it is an HDF5 file, but
> it does not have a PyTables format; I will try to do my best to guess
> what's there using HDF5 metadata
>   METADATA_CACHE_SIZE, nodeCacheSize)
> /results/oef1/quad4 (EArray(1018L, 17402L, 3L), zlib(6)) ''
>   atom = Atom(dtype='Float32', shape=(0, 17402L, 3L), flavor='numarray')
>   nrows = 1018
>   extdim = 0
>   flavor = 'numarray'
>   byteorder = 'big'
                ^^^^^  <- Notice this
>
> My client code is completely unchanged in this testing: only the
> databases were created by two different methods. I have yet to do
> more testing with smaller files (these are ~2.2 GB). I read the
> section on shuffling in the manual, where it suggests that shuffle
> will actually improve throughput, but this is the only difference I
> could detect. It is not a trivial matter to produce these large
> files, so I need to get it right. I know it's not much to go on, but
> any suggestions are appreciated.

As I remarked above, another difference is that the 'new' files are
converted to little-endian byteorder, and that could affect performance
if you process those files on a big-endian machine.

However, my guess is that the real problem in this case effectively
lies in the shuffle filter. The thing is that in the PyTables 1.x
series, the algorithm for computing the chunksize
(i.e. the size on which compression is applied) was not very
fine-tuned, and the computed size can be as high as 600 KB, putting too
much stress on the shuffle filter. This has been improved in the 2.x
series, so the chunksize for your files (~2.2 GB) would be something
like 32 KB or 64 KB, which is a more reasonable figure for shuffling
(besides allowing far better performance in sparse reads).

So, you may want to try PyTables 2.0 or, if you want to stick with 1.3,
try disabling the shuffle filter (at the expense of reducing the
compression effectiveness) when creating the 'new' arrays. My
recommendation, though, is that you switch to 2.0, as there are more
optimizations there (like using NumPy natively, among others) that can
help improve your times still more.

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
 V V    Cárabos Coop. V.   Enjoy Data
  "-"
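
P.S. In case it helps, here is a minimal, untested sketch of the 1.3
workaround (disabling shuffle at array creation time); the file and
node names are just taken from your ptdump output above:

    import tables

    # Hypothetical output file; group/array names follow the ptdump
    # output above.
    fileh = tables.openFile("xxx_lev_1_1.h5", mode="w")
    results = fileh.createGroup("/", "results")
    oef1 = fileh.createGroup(results, "oef1")

    # Keep zlib level 6 as before, but switch the shuffle filter off.
    filters = tables.Filters(complevel=6, complib="zlib", shuffle=False)

    # In the 1.x API the atom carries the array shape, with 0 marking
    # the extendable dimension.
    atom = tables.Float32Atom(shape=(0, 17759, 3), flavor="numarray")

    quad4 = fileh.createEArray(oef1, "quad4", atom, title="",
                               filters=filters, expectedrows=1022)

    # ... quad4.append(...) your numarray chunks as usual, then:
    fileh.close()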
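
P.P.S. And roughly the equivalent under 2.0 (again untested): the atom
no longer carries the shape, shuffle can stay on thanks to the smaller
computed chunksizes, and you can also keep your native big-endian
byteorder explicitly:

    import tables

    fileh = tables.openFile("xxx_lev_1_1.h5", mode="w")
    results = fileh.createGroup("/", "results")
    oef1 = fileh.createGroup(results, "oef1")

    # Shuffle is cheap with the smaller 2.0 chunks, so leave it on.
    filters = tables.Filters(complevel=6, complib="zlib", shuffle=True)

    quad4 = fileh.createEArray(oef1, "quad4",
                               tables.Float32Atom(),  # per-element atom
                               (0, 17759, 3),         # 0 = extendable dim
                               filters=filters,
                               expectedrows=1022,
                               chunkshape=None,   # None = auto-computed
                               byteorder="big")   # keep the native order

    fileh.close()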