RE: [lc-devel] WKdm Problem
From: Scott K. <sfk...@am...> - 2006-04-17 18:27:48
Nitin,

> One particular thing I'm trying to improve is to minimize
> extra memory it requires during compression. Presently,
> it stores tags and other bits one per byte (in case of
> partial or exact match) and later packs them together.
> I think this packed output can be directly generated
> without this intermediate step -- this should greatly
> reduce its memory footprint.

I think you're optimizing the wrong thing here. We chose to defer the packing of bits in this manner intentionally: it improves the algorithm's speed substantially (by more than 3x). In exchange, we did need temporary space to store the bits, but the arrays required still leave WK* using less book-keeping space than any other compressor we encountered. For example, LZO needs a 64 KB dictionary, while WKdm uses something like 20 KB.

Moreover, real systems have tens to hundreds of thousands of page frames (e.g., 256 MB, small by current standards, is 65,536 page frames). Using 5 (WKdm) to 16 (LZO) pages for temporary data structures accounts for only 0.008% to 0.02% of total main memory, respectively. Shrinking the footprint will yield no performance improvement -- the noise in measuring running time will be much larger than the performance gained by increasing main memory capacity by 0.02% (in the best case!). Meanwhile, you will have slowed compression substantially, increasing the cost of keeping pages compressed and thus limiting the ideal compressed cache size. You will have lost much more in overall performance than you gained. (A rough sketch of the two-pass structure in question is in the P.S. at the end of this message.)

If you want to improve the WK* algorithms, try:

(a) A 32- or 64-word dictionary
(b) Varying the associativity (not just direct-mapped and 4-way set associative)
(c) Hand-coding it in assembly for speed

> Does WK4x4 give better compression (than WKdm) for textual
> data also (since I'm currently working on compressing only
> page-cache data)?

Somewhat, but not especially. Both assume that the fundamental input symbol for compression is a word, not a byte. They look for regularities in integer and pointer values, which is not what most file data looks like.

> I think better compression with a slightly slower algorithm
> will be better than lower compression with a somewhat faster algorithm.

Please see my dissertation:

  http://www.cs.amherst.edu/~sfkaplan/papers/sfkaplan-dissertation.ps.gz

I've already done this analysis. Basically, you can choose either improved speed or tightness, and either one will comparably improve overall performance, although through different compressed cache sizes. Saying that one is more desirable than the other is nonsensical -- both are equally valuable here. What you *do* want is the ability either (a) to improve compressibility for a minimal decrease in speed, or (b) to improve speed for a minimal decrease in compressibility. If you give up too much of one for the other, you've lost.

> Can you please suggest any other alternatives to LZO for
> compressing file-system pages?

The closest match is LZRW1, but it's plainly not as fast or tight as LZO. There may be newer ones out there, but I'm not aware of anything as fast as LZO in the Lempel-Ziv family, and speed is critical here.

> In particular, I'm looking at SquashFS (a compressed
> file-system) which uses 'zlib' for compression and is very
> fast and stable (it's widely used for LiveCDs).

Zlib uses the modified LZ77 that is the basis of gzip. It's faster than most file-system-level compressors, but it's substantially slower than LZO or WK*.
It's possible that processors are fast enough now that zlib will provide *some* benefit, but you will definitely be losing out on the potential gains that LZO (or something like it) could provide on byte-oriented data.

Scott
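
P.S. In case a concrete picture helps, the two-pass structure described above looks roughly like the sketch below. This is illustrative only: the 16-word direct-mapped dictionary, the toy hash, and the field widths are placeholder assumptions, not the actual WKdm source, and it leaves out the side arrays that record dictionary indices and the low bits of partial matches.

    #include <stdint.h>
    #include <stddef.h>

    #define DICT_WORDS  16     /* assumed: small direct-mapped dictionary  */
    #define PAGE_WORDS  1024   /* a 4 KB page treated as 1024 32-bit words */

    enum { TAG_MISS = 0, TAG_EXACT = 1, TAG_PARTIAL = 2, TAG_ZERO = 3 };

    /* Toy hash: a few middle bits of the word select the dictionary slot. */
    unsigned dict_index(uint32_t w) { return (w >> 10) & (DICT_WORDS - 1); }

    /* Pass 1: scan the page, writing one tag per *byte* of the tags array
     * (values 0-3) and saving missed words whole.  The byte-per-tag array
     * is the "extra memory" under discussion; keeping it unpacked avoids
     * shifting and masking on every word during the scan.  The misses
     * buffer must have room for PAGE_WORDS entries. */
    void model_page(const uint32_t *page, uint8_t *tags,
                    uint32_t *misses, size_t *n_misses)
    {
        uint32_t dict[DICT_WORDS] = { 0 };
        size_t nm = 0;

        for (size_t i = 0; i < PAGE_WORDS; i++) {
            uint32_t w = page[i];
            unsigned d = dict_index(w);

            if (w == 0) {
                tags[i] = TAG_ZERO;
            } else if (dict[d] == w) {
                tags[i] = TAG_EXACT;
            } else if ((dict[d] >> 10) == (w >> 10)) { /* high bits match */
                tags[i] = TAG_PARTIAL;
                dict[d] = w;
            } else {
                tags[i] = TAG_MISS;
                misses[nm++] = w;                      /* keep whole word */
                dict[d] = w;
            }
        }
        *n_misses = nm;
    }

    /* Pass 2: pack the byte-per-tag array down to 2 bits per tag, four
     * tags per output byte (n is assumed to be a multiple of 4).
     * Deferring this work to a separate pass is the speed/space
     * trade-off in question.  Returns the packed size in bytes. */
    size_t pack_tags(const uint8_t *tags, size_t n, uint8_t *out)
    {
        size_t o = 0;
        for (size_t i = 0; i < n; i += 4)
            out[o++] = (uint8_t)(tags[i]            |
                                 (tags[i + 1] << 2) |
                                 (tags[i + 2] << 4) |
                                 (tags[i + 3] << 6));
        return o;
    }

Folding the packing into the first pass would eliminate the byte-per-tag array, but it moves the shift-and-mask work into the inner scan loop, which is exactly the slowdown described above.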