From: Francesc A. <fa...@py...> - 2004-06-22 11:28:07
On Tuesday, 22 June 2004 at 12:47, Francesc Alted wrote:

> As I'm doing tests with a very slow hard disk, I used very repetitive
> data (all zeros) to bypass the bottleneck, but the results are nearly
> the same. So the bottleneck does indeed seem to be in the I/O calls.

Oops, I forgot to say that this is using compression.

> In order to determine whether the problem was PyTables or the HDF5
> layer, I used a small C program that opens the EArray only once,
> writes all the data, and then closes the array (PyTables, for its
> part, opens and closes the array on every append() operation). With
> that, I was able to achieve 7.7 MB/s, very close to the write limit
> of my disk. When using compression (zlib, complevel=1) and shuffling,
> however, I was able to achieve 22 MB/s. So it may well be feasible to
> reach 30 MB/s or more without compression by using this kind of
> optimized writing on a system that supports faster write speeds,
> like yours.

A small update: I re-ran this C benchmark using only the zlib
compressor (i.e. without shuffling) and with all data set to zeros,
and obtained 33 MB/s. Without compression, that figure could well grow
to 40 MB/s (provided the hard disk supports such throughput, of
course).

--
Francesc Alted
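For reference, here is a rough sketch of the open-once / append-many
pattern described above, written against the HDF5 1.6 C API that was
current at the time. This is not the original benchmark; the file name,
dataset name, chunk size, and number of appends are invented for
illustration.

/* Sketch only: assumes the HDF5 1.6 C API; names and sizes are invented. */
#include <string.h>
#include "hdf5.h"

#define CHUNK_ROWS 1000   /* rows written per append */
#define NAPPENDS   1000   /* number of append operations */

int main(void)
{
    hsize_t dims[1]    = {0};
    hsize_t maxdims[1] = {H5S_UNLIMITED};
    hsize_t chunk[1]   = {CHUNK_ROWS};
    hsize_t count[1]   = {CHUNK_ROWS};
    hsize_t start[1], newsize[1];
    double  buf[CHUNK_ROWS];
    hid_t   file, space, dcpl, dset, fspace, mspace;
    int     i;

    memset(buf, 0, sizeof(buf));            /* all zeros, as in the benchmark */

    file  = H5Fcreate("bench.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    space = H5Screate_simple(1, dims, maxdims);
    dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk);           /* extendible => chunked storage */
    /* H5Pset_deflate(dcpl, 1); */          /* uncomment for the zlib-1 runs */

    dset = H5Dcreate(file, "earray", H5T_NATIVE_DOUBLE, space, dcpl);
    H5Sclose(space);

    /* Open once, append many times, close once. */
    for (i = 0; i < NAPPENDS; i++) {
        newsize[0] = (hsize_t)(i + 1) * CHUNK_ROWS;
        H5Dextend(dset, newsize);           /* grow the unlimited dimension */

        fspace = H5Dget_space(dset);
        start[0] = (hsize_t)i * CHUNK_ROWS;
        H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);

        mspace = H5Screate_simple(1, count, NULL);
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, mspace, fspace, H5P_DEFAULT, buf);
        H5Sclose(mspace);
        H5Sclose(fspace);
    }

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Fclose(file);
    return 0;
}

The point of the pattern is that the dataset is created and opened once,
extended and written NAPPENDS times, and closed once, whereas PyTables
at the time opened and closed the array around every append() call.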