From: Raimar S. <rai...@ui...> - 2014-10-22 14:26:24
Dear András,

How is everything going in Budapest? I hope you are well.

Maybe you remember when we discussed binary boost serialization; back then I was under the impression that compressing the state files did not gain much disk space. With my real simulations, however, I have found that it makes a huge difference, so I need to compress the state files with bzip2. I thought it would be nice if I could read the compressed state files directly in Python. Therefore I have added transparent bzip2 support for _reading_ state files in the branch called "compression". This is done by changing the interface from std::ifstream to std::istream and working with boost's filtering_istream, to which one can push a decompressor when necessary (see the first sketch below my signature). This is actually quite nice. On the downside, we need yet another binary boost library, boost_iostreams. This library is detected by cmake, and compression support is enabled or disabled accordingly (second sketch).

In principle, the _writing_ of compressed state files also works. However, there is a problem with continuation: we end up with several complete bzip2 files concatenated. This is actually legal (it is called a multi-stream bzip2 file), and I can decompress such files with bunzip2 without problems. Unfortunately, the boost decompressor has a bug and cannot handle these files. As a result, I have disabled the _writing_ of compressed state files for the moment (the third sketch shows where the multi-stream files come from). Instead, I just run the external bzip2 after the trajectory is done; if I want to continue a trajectory, I of course have to decompress first.

One could solve this problem easily by compressing the individual archives inside the state file rather than the file as a whole (fourth sketch). But then one could no longer use the external bzip2/bunzip2 utilities on the state files.

What do you think of compression as such, and of the implementation?

Best regards
Raimar
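Roughly, the reading path now looks like this. It is only a simplified sketch: the file name handling and the payload are made up for illustration, and one has to link against boost_iostreams, boost_serialization and bz2.

  #include <fstream>
  #include <string>
  #include <boost/archive/binary_iarchive.hpp>
  #include <boost/iostreams/filtering_stream.hpp>
  #include <boost/iostreams/filter/bzip2.hpp>

  namespace io = boost::iostreams;

  int main(int argc, char** argv)
  {
    if (argc<2) return 1;
    const std::string filename(argv[1]);

    std::ifstream file(filename.c_str(), std::ios_base::in | std::ios_base::binary);

    io::filtering_istream in;
    // push the decompressor only for compressed files; for a plain file
    // the filtering_istream is just a thin wrapper around the ifstream
    if (filename.size()>4 && filename.compare(filename.size()-4,4,".bz2")==0)
      in.push(io::bzip2_decompressor());
    in.push(file);

    // the archive sees only a std::istream, so everything downstream is
    // untouched by the compression business
    boost::archive::binary_iarchive ar(in);

    double t=0.; ar & t; // made-up payload, just to show the usage
  }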
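The cmake side is essentially just the following (the macro name here is only indicative and may differ in the branch):

  find_package(Boost COMPONENTS iostreams)
  if(Boost_IOSTREAMS_FOUND)
    # compile the state-file reading code with decompression support
    add_definitions(-DHAVE_BOOST_IOSTREAMS)
  else()
    message(STATUS "boost_iostreams not found, compression support disabled")
  endif()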
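For completeness, this is roughly what the (now disabled) writing path does, and where the multi-stream files come from; again a sketch with made-up names:

  #include <fstream>
  #include <boost/archive/binary_oarchive.hpp>
  #include <boost/iostreams/filtering_stream.hpp>
  #include <boost/iostreams/filter/bzip2.hpp>

  namespace io = boost::iostreams;

  int main()
  {
    // on continuation the state file is opened in append mode and a fresh
    // compressor is pushed on top, so every run writes one complete,
    // self-contained bzip2 stream; after a few continuations the file is a
    // concatenation of such streams, i.e. a multi-stream bzip2 file
    std::ofstream file("state.bz2", std::ios_base::app | std::ios_base::binary);

    io::filtering_ostream out;
    out.push(io::bzip2_compressor());
    out.push(file);

    boost::archive::binary_oarchive ar(out);
    double t=1.; ar & t; // made-up payload
  } // the compressor flushes and closes its stream on destruction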
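And the alternative, compressing each archive individually inside the state file, could look something like this (made-up names again). Each blob is length-prefixed so that reading can jump from one archive to the next, which is exactly why bunzip2 could no longer read such a file:

  #include <cstdint>
  #include <fstream>
  #include <string>
  #include <boost/archive/binary_oarchive.hpp>
  #include <boost/iostreams/device/back_inserter.hpp>
  #include <boost/iostreams/filtering_stream.hpp>
  #include <boost/iostreams/filter/bzip2.hpp>

  namespace io = boost::iostreams;

  int main()
  {
    std::ofstream file("state", std::ios_base::app | std::ios_base::binary);

    // compress one archive into its own self-contained blob
    std::string blob;
    {
      io::filtering_ostream out;
      out.push(io::bzip2_compressor());
      out.push(io::back_inserter(blob));
      boost::archive::binary_oarchive ar(out);
      double t=1.; ar & t; // made-up payload
    } // destruction finalizes the bzip2 stream

    // write it length-prefixed, so continuation just appends another blob
    const std::uint64_t n=blob.size();
    file.write(reinterpret_cast<const char*>(&n), sizeof n);
    file.write(blob.data(), static_cast<std::streamsize>(n));
  }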