
LZMS compression ratio

nuhi
2015-01-28
2015-03-19
  • nuhi

    nuhi - 2015-01-28

    Hi,

    I have a standard Win8.1 install.wim, with two images in it, size 3.6GB.

    Comparing two types of compression with wimlib:
    wimlib-imagex.exe export install.wim 1 install.wim --compress=lzx --rebuild
size 3.46GB

    vs
    wimlib-imagex.exe export install.wim 1 install.esd --compress=lzms --rebuild
    size 3.3GB

Which makes me wonder: are you sure that ESD compression works as expected in wimlib?
I know you told me it compresses a few percent worse than Microsoft's library, but this is more like 20%; maybe there is a bug, or the default compression settings are too light.
I was expecting something like 2.8GB at most. I believe it was 2.6GB with WIMGAPI, but I would need to retest later to be certain.

Is there an option, perhaps through the chunk size setting or otherwise, to increase the LZMS compression ratio at the expense of using more memory?

    Thanks.

     
  • synchronicity

    synchronicity - 2015-01-28

    With just --compress=lzms, you are creating a "non-solid" WIM. The compression format is LZMS, but each file is compressed independently and the compression chunk size is still relatively small. What you want is to create a "solid" WIM, which you can do with this command line:

    wimlib-imagex.exe export install.wim 1 install.esd --solid
    

    Then multiple files will be compressed as one, and the compression chunk size will be very large. This is the mode usually used in "ESD" files.
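For a quick sanity check, the resulting file can be inspected with wimlib-imagex's info command (a sketch; the exact output fields vary between wimlib versions):

```shell
# Create a solid archive from image 1 (same command as above)
wimlib-imagex export install.wim 1 install.esd --solid

# Print header information; a solid file should report LZMS
# compression and a large chunk size
wimlib-imagex info install.esd
```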

     
  • nuhi

    nuhi - 2015-01-28

    Thanks Eric, that helped.

It's now 2.9GB. I also checked WIMGAPI: 2.63GB, but it crashed Explorer while consuming all available memory. If only it had a thread count option, or better detection of available memory.

I'll see if adjusting wimlib's chunk size reduces the gap; let me know if you have any other ideas off the top of your head. I'll be using it from the library, so I guess it's a little more flexible.

    Thanks.

     
  • synchronicity

    synchronicity - 2015-01-28

It's possible to increase the compression level (e.g. --solid-compress=lzms:100) or the compression chunk size (e.g. --solid-chunk-size=67108864); however, I wouldn't currently recommend either, since the improvement in compression ratio is small relative to the drawbacks.

As far as memory usage is concerned, currently the default for --solid is LZMS with 32 MiB chunks, which uses about 480 MiB of memory per thread. I don't remember the statistics for WIMGAPI, but it probably uses more memory (at least per thread) since it uses 64 MiB chunks by default. wimlib uses as many threads as it thinks will fit in available memory, but the thread count can be set manually if desired.
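Putting those knobs together, a command line might look like this (a sketch, not a recommendation; the values are purely illustrative, and --threads is assumed here as the manual thread-count option mentioned above):

```shell
# Higher LZMS level, 64 MiB solid chunks, and two compressor
# threads to bound memory use (roughly double the ~480 MiB per
# thread quoted above for 32 MiB chunks)
wimlib-imagex export install.wim 1 install.esd --solid \
    --solid-compress=lzms:100 \
    --solid-chunk-size=67108864 \
    --threads=2
```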

     

    Last edit: synchronicity 2015-01-28
  • nuhi

    nuhi - 2015-02-09

Reporting back: I finally tried it and it worked. The size is now comparable to WIMGAPI (a 30MB difference at 2.6GB) using a 128 MiB chunk size (I read that value from the header of the WIMGAPI-created image).
~10GB of memory taken during compression, now that's geeky.
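For reference, --solid-chunk-size takes a byte count, so a 128 MiB chunk size corresponds to 134217728 bytes (a sketch; some versions may also accept a suffixed form like 128M):

```shell
# 128 MiB expressed in bytes, for --solid-chunk-size
chunk_bytes=$((128 * 1024 * 1024))
echo "$chunk_bytes"   # prints 134217728

# usage sketch:
#   wimlib-imagex export install.wim 1 install.esd --solid \
#       --solid-chunk-size="$chunk_bytes"
```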

    Thanks Eric, excellent library.

     
  • synchronicity

    synchronicity - 2015-02-10

    Hi,

Yes, my current implementation should be able to beat WIMGAPI's compression ratio if you use a large enough chunk size. But as you noticed, that's usually not realistic due to the high memory usage and slow running time. Also, WIMGAPI seems to be incompatible with chunk sizes above 64 MiB.

    I have, however, recently been working on improvements to LZMS compression, and I've been able to beat WIMGAPI's compression ratio by nearly 1% without using a larger chunk size than they're using. A few benchmarks:

            Implementation          Arch    Time    Size    ChunkSize
    
            wimlib (experimental)   x86_64  94.4s   51861K  64MiB
            wimlib (experimental)   x86_64  50.8s   52227K  32MiB
            WIMGAPI (8.1u1)         x86_64  68.2s   52288K  64MiB
            wimlib (master)         x86_64  73.2s   53420K  64MiB   
            wimlib (master)         x86_64  36.8s   53760K  32MiB
    

    I'll post back here if I get to a point where the changes are ready for others to test. Ideally, I'll be able to reduce memory usage enough to make 64 MiB chunks the default.

     
  • nuhi

    nuhi - 2015-02-11

    That sounds great, looking forward to it. Thanks.

     
  • synchronicity

    synchronicity - 2015-02-16

    I've posted files for wimlib-v1.8.0-BETA4 in the "testing" directory. It includes improvements for LZMS compression as well as solid compression in general. Feel free to try it out.

     
  • nuhi

    nuhi - 2015-02-16

    Tried it. Size is now as you said, smaller than WIMGAPI, and compression is more efficient.

wimgapi - 2.63GB
wimlib 1.8.b4 - 2.60GB
wimlib 1.8.b4 (128 MiB chunks) - 2.56GB

The 128 MiB image did not accept the product key to start installation (the installer probably can't read the image), but that is expected, since you said chunk sizes above 64 MiB are not compatible.

The image compressed with the normal 1.8.b4 settings installed fine.

    Great update, well done.

Do you think it's safe to use this beta?
That is, if it compressed and installed, is it fine, or do you expect potential issues that are not obvious at first?

     
  • synchronicity

    synchronicity - 2015-02-16

    You should consider v1.8.0-BETA4 to be experimental, since the code is so new.

    That being said, I've already done some amount of testing and there are no known issues.

    One extra safety check you could do is to run wimlib-imagex verify on the WIM file after creating it.
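That check is a one-liner; it re-reads the archive and validates its contents (a sketch; exact behavior depends on the wimlib-imagex version):

```shell
# Checksums every stream in the archive; a nonzero exit status
# means corruption (or an unreadable file) was detected
wimlib-imagex verify install.esd
```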

     
  • synchronicity

    synchronicity - 2015-02-26

    Hi,

    I've released wimlib v1.8.0, which includes the improvements to LZMS and solid compression, among other things. I've been doing quite a bit of testing, including testing for compatibility with WIMGAPI, and at this point I believe the new LZMS and solid compression support is safe to use. But as always, let me know if you run into any issues.

     
  • nuhi

    nuhi - 2015-03-18

    Hi Eric,

    thanks for the update, been great so far.

    One note though.

In Hyper-V UEFI mode, I noticed an ESD requires more than 2GB of RAM to install; otherwise it (sometimes) errors out around 20% through extraction.

    Let me know if this is unexpected. It's not a request, just a report.

    Thanks.

     

    Last edit: nuhi 2015-03-18
  • synchronicity

    synchronicity - 2015-03-18

    If you're just extracting a WIM (or "ESD") image, then that would be unexpected because the amount of memory required for extraction won't be that large unless there are millions of files. But I don't know exactly what you're doing; probably it's another issue entirely.

     
  • nuhi

    nuhi - 2015-03-19

    OK, then I must have done something else, that's good to know.

    I tried now to replicate the issue, and it did not fail.

Sorry for the potentially false alarm; I'll keep you updated in the future.

     
