From: Francesc A. <fa...@ca...> - 2005-04-12 09:32:06
Hi List,

Ivan Vilata is working hard to implement Bzip2 (http://sources.redhat.com/bzip2/) and Szip (http://hdf.ncsa.uiuc.edu/doc_resource/SZIP/) support for PyTables.

Bzip2 would be useful for achieving the best compression ratios. Its compression/decompression speed is pretty slow, but it can nevertheless be rather useful for data archival. Szip can also achieve pretty good compression ratios, but it is not entirely free (you may need a license to use the Szip encoder in commercial applications). We want to support Szip mainly for compatibility with HDF5 datafiles.

Of course, we would like to run some benchmarks on these new compressors. We are also planning to study the effect of using the Shuffle filter (http://pytables.sourceforge.net/html-doc/x4273.html) as a pre-conditioner. We think it would be better to do this study with real-life data rather than synthetic data. So, if you are willing to offer some of your datafiles, please do not hesitate to send them to us. We are looking for datafiles in the range of 1 MB to 100 MB. We would prefer that you place them in a publicly accessible area rather than send them to us directly by e-mail.

Finally, the results of the study will be made public so that any of you can use them to decide on the best compressor/pre-conditioner combination for your data.

Thanks!

-- 
>qo<   Francesc Altet     http://www.carabos.com/
V  V   Cárabos Coop. V.   Enjoy Data
 ""