I just tested the compression speed of XZ Utils (on Ubuntu) and 7-Zip (on Windows), both compressing to .xz (I checked that the files are interoperable). I used the "default" settings for xz, with the -6 and -9 compression levels. In 7-Zip, I disabled multithreading (and checked that it only used one CPU when compressing) and set the dictionary size to match xz (8 MiB and 64 MiB).
The resulting files (for each dictionary size) were very close in size (less than 0.2% difference). However, XZ Utils took 19% more time with the 8M dictionary and 16% more time with the 64M dictionary. Is this a known issue? Since compression speed is not a strong point of LZMA(2), I think a speed difference of more than 15% is significant and should be looked into.
I can give more details, but I tested with different files on different CPUs, and 7-Zip was always faster. Decompression is too fast for me to measure accurately (I would need more data than I can keep cached in RAM to get accurate results).
There can be multiple reasons:
- The compression settings mapped to the preset levels are not identical in XZ Utils and 7-Zip. Note that faster settings don't necessarily mean significantly worse compression; it depends on the file.
- There are settings other than the dictionary size that affect speed. Test e.g. "xz -6" and "xz -5" with a few different files. They both use 8 MiB dictionary.
- I haven't synced the latest improvements in LZMA SDK to XZ Utils. Also, the SDK code isn't used as is due to different API requirements and such things, so it is possible that XZ Utils has some internal overhead compared to LZMA SDK.
- On 32-bit x86, if xz is linked against shared liblzma (liblzma.so.5), it has a small speed penalty that doesn't exist with a static library (or DLLs on Windows). Passing --disable-shared to configure will give you an xz tool that is linked against static liblzma.
- I don't know how significant this is, but 7-Zip has been compiled with MSVC while XZ Utils has been compiled with GCC.
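The "xz -6" versus "xz -5" comparison suggested in the list above can also be tried without the command-line tool. The following is only a rough sketch using Python's standard lzma module (which wraps liblzma); the helper names time_preset and compare_presets and the sample data are made up for illustration, and a single timed run on a tiny in-memory buffer is far noisier than a proper benchmark on real files.

```python
import lzma
import time


def time_preset(data: bytes, preset: int) -> tuple[float, int]:
    """Compress data once with an xz preset; return (seconds, compressed size)."""
    start = time.perf_counter()
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    elapsed = time.perf_counter() - start
    return elapsed, len(compressed)


def compare_presets(data: bytes, presets=(5, 6)) -> dict:
    """Map each preset number to its (seconds, compressed size) result."""
    return {preset: time_preset(data, preset) for preset in presets}


# Hypothetical sample input; replace with the contents of a real file.
sample = b"some mildly repetitive sample text for the test " * 5000

for preset, (seconds, size) in compare_presets(sample).items():
    print(f"preset {preset}: {seconds:.3f} s, {size} bytes")
```

For meaningful numbers, the input should be large enough that the run takes at least a few seconds, and each preset should be timed several times.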
You're right, default 7z uses a nice_len of 32 and xz -6 uses 64. xz -5, however, matches perfectly: a 2GB file which compresses to 400MB is within 4KB after compression (that's 0.001%). The remaining difference is probably due to different block sizes or something similar (and 7z using CRC32 instead of CRC64), so I'd say compression is exactly the same in practice.
In this case, and comparing 7z to the downloadable Windows version of xz (to factor out the OS at least), both 64-bit, 7z still manages to be 8% faster. This is within the realm of different compilers (although over 5% is somewhat questionable), but it's not too bad I guess.
I haven't tested many files, but nice_len=64 seems like a bad trade-off (6% more time for a 0.1% space gain).
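To check this trade-off on other files, nice_len can be overridden on top of a preset. This is a minimal sketch using the custom filter chain support in Python's standard lzma module; the function name compress_with_nice_len and the sample data are assumptions for illustration only, and the result says nothing about how 7-Zip itself behaves.

```python
import lzma
import time


def compress_with_nice_len(data: bytes, nice_len: int, preset: int = 6) -> bytes:
    """Compress to .xz using a preset's LZMA2 settings with nice_len overridden."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": preset, "nice_len": nice_len}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


# Hypothetical sample input; replace with the contents of a real file.
sample = b"an example line of text that repeats with small changes\n" * 5000

for nice_len in (32, 64):
    start = time.perf_counter()
    size = len(compress_with_nice_len(sample, nice_len))
    elapsed = time.perf_counter() - start
    print(f"nice_len={nice_len}: {elapsed:.3f} s, {size} bytes")
```

Because only nice_len changes between the two runs, any difference in time or size comes from that one setting rather than from other preset parameters.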
The effects of different settings vary between files. I considered using nice=32 as the default before the 5.0.0 release, but some people didn't like it due to the size difference with some files. I thought that people typically use xz to get good compression even if it is slow, so it makes sense to have fairly high settings as the default. In most cases it is easy to set a different preset if one wants.
It's similar with the --extreme option. Sometimes it gives a 5% smaller result and sometimes it even makes the result slightly worse. So it is currently somewhat hit and miss. Maybe it will have a bigger and better effect some day in the future.
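Whether the extreme variant helps for a given file can be checked directly. The sketch below uses the PRESET_EXTREME flag from Python's standard lzma module, which corresponds to the extreme modifier in liblzma; the helper name extreme_gain and the sample data are hypothetical, and the outcome depends entirely on the input.

```python
import lzma


def extreme_gain(data: bytes, preset: int = 6) -> float:
    """Return the fractional size reduction from the extreme variant of a preset.

    A negative value means the extreme variant produced a larger file.
    """
    normal = len(lzma.compress(data, preset=preset))
    extreme = len(lzma.compress(data, preset=preset | lzma.PRESET_EXTREME))
    return (normal - extreme) / normal


# Hypothetical sample input; replace with the contents of a real file.
sample = b"record: id=%d value=abcdefgh\n" * 5000

print(f"size change with extreme: {extreme_gain(sample):+.2%}")
```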