Multi-threaded compression, when?

funtoos
2010-01-12
2014-02-21
  • funtoos
    2010-01-12

    Any idea when multi-threaded compression will be available?

     
  • Lasse Collin
    2010-01-13

    It's one of the most important features to add after XZ Utils 5.0.0 is out. I cannot say anything more exact, since it depends on how quickly or slowly I get things done.

    Note that 7-Zip and p7zip 9 betas support the .xz format. They can use multi-threading (not limited to two threads) when compressing into the .xz format. It may be worth trying before XZ Utils catches up.

     
  • Now that 5.0.1 is out, I'm looking for a changelog, or a roadmap or something to indicate when xz will support multithreading…

     
  • Lasse Collin
    2011-03-10

    I'm not able to give any schedule. This is a hobby for me. Recently I have been able to work on xz only a little.

    The positive thing is that you can use p7zip, lxz, or pxz to do threaded compression. p7zip is included in most distros already.

     
  • Thank you for the response. Unfortunately, p7zip is only multithreaded in some distributions, and I was not able to get multithreading from any of those packages (p7zip, lxz, pxz) on the platforms that I care about… So I created another one.

    threadzip ( http://code.google.com/p/threadzip/ ) is implemented in Python and is therefore highly cross-platform. Unfortunately, the only compression library that ships with Python by default is zlib (like gzip), so you have to add pylzma as a separate module. This was very easy on some machines, but I haven't gotten it working on Solaris (yet).

     
  • funtoos
    2011-03-13

    The problem with most parallel implementations is that they are not drop-in replacements for the non-parallel ones. The program should take the same options and work well with stdin/stdout.
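
A minimal sketch of the stdin/stdout contract described above, using Python's standard `lzma` module. It is single-threaded and only illustrates the streaming filter shape (bytes in, .xz bytes out, bounded memory); the function name `compress_stream` and the chunk size are assumptions made for this example, not part of any tool mentioned in this thread.

```python
import io
import lzma


def compress_stream(src, dst, chunk_size=1 << 16):
    """Copy src to dst as a single .xz stream, reading fixed-size chunks."""
    compressor = lzma.LZMACompressor(format=lzma.FORMAT_XZ)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # compress() may return b"" while the encoder buffers input.
        dst.write(compressor.compress(chunk))
    # flush() emits whatever is still buffered plus the stream footer.
    dst.write(compressor.flush())


# In-memory demonstration; any binary file objects work the same way.
source = io.BytesIO(b"hello world\n" * 1000)
sink = io.BytesIO()
compress_stream(source, sink)
```

To use it as a pipe filter, one would call `compress_stream(sys.stdin.buffer, sys.stdout.buffer)` after importing `sys`.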

     
  • My pixz utility does multi-threaded compression and decompression, with a file-format fully compatible with existing xz. I'd love any feedback.

     
  • chud
    2012-01-17

    Hi vasi,
    Could you please support stdin and stdout?
    I am after a multithreaded decompressor (like pigz) for lzma/xz that works well with a piped input stream and piped output. xz works, but even the latest 5.1.1 alpha doesn't do multithreaded decompression.
    Thanks!

     
  • Threadzip uses multithreaded compress & decompress on stdin/stdout. 

    If you want to use lzma with threadzip, you need to install pylzma. 

     
  • Anyone working on an MT implementation should just put their effort toward the xz-utils code base. No other temporary implementation will bring that level of accomplishment.

     
  • I'd also like to know what the major implementation problem is. Is MT support not in the LZMA SDK? More specifically, the code is not 20 years old like gzip, so why wasn't MT support considered from conception?

     
  • Lasse Collin
    2012-01-27

    The problem is that I haven't worked on the code much in the past months. I don't use LZMA SDK code as is. liblzma in XZ Utils has a different API (buffer-to-buffer, no callbacks), which affects the internal implementation too.

    Current git snapshot is kind of usable already though. RAM usage would be lower in a more sophisticated implementation without affecting anything else, and the progress indicator doesn't work very well. But it does compress in parallel and has decent performance and shouldn't corrupt data. :-) I still don't recommend it for production use just to be safe.

     
    • Broken.zhou
      2013-09-02

      Is this feature available now?

       
  • Lasse Collin
    2013-09-02

    There is 5.1.2alpha which has threading. A few minor fixes have been made after that release and they are available in the git repository.

    I still don't know when a stable release will be made. I don't plan many new things before 5.2.0. I just need to get it done.

     
  • Broken.zhou
    2014-01-17

    I have tested the new version of xz and found it really great! Thanks for the work!

    In order to use the multithread feature, I need to add

    XZ_DEFAULTS="--threads 0"

    to my environment variables. Will this eventually become the default setting?
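
For scripts that call xz, the same setting can be applied per invocation instead of globally. The sketch below is an illustration only: the helper names `xz_environment` and `xz_compress` are hypothetical, and it assumes an `xz` binary on `PATH` that accepts `--threads`. Only the `XZ_DEFAULTS="--threads 0"` value itself comes from the post above.

```python
import os
import shutil
import subprocess


def xz_environment(threads=0):
    """Return a copy of the current environment with XZ_DEFAULTS set."""
    env = dict(os.environ)
    # Same value as in the post above when threads == 0.
    env["XZ_DEFAULTS"] = f"--threads {threads}"
    return env


def xz_compress(data, threads=0):
    """Pipe data through `xz --stdout`; return None when xz is not installed."""
    if shutil.which("xz") is None:
        return None
    result = subprocess.run(
        ["xz", "--stdout"],
        input=data,
        stdout=subprocess.PIPE,
        env=xz_environment(threads),
        check=True,  # raise CalledProcessError if xz exits with an error
    )
    return result.stdout
```

Setting the variable only in the child's environment leaves other xz invocations on the machine at their usual defaults.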

     
  • Lasse Collin
    2014-01-24

    Not in 5.2.0 and maybe never. The current implementation of threading makes the compression slightly worse. I currently guess that I will get fewer complaints by keeping the old behavior as the default, at least for the foreseeable future.
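
The size penalty can be illustrated with a rough sketch, assuming the usual chunk-parallel approach in which the input is split into pieces that are compressed independently. This is not liblzma's implementation: it uses Python's standard `lzma` module and concatenates separate .xz streams, and the function name `compress_in_chunks`, the chunk size, and the test data are all choices made for this example. Because each piece starts with an empty dictionary, redundancy that spans a chunk boundary cannot be exploited.

```python
import lzma
import random
from concurrent.futures import ThreadPoolExecutor


def compress_in_chunks(data, chunk_size, workers=4):
    """Compress each chunk as its own .xz stream and concatenate the results."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the streams line up with the chunks.
        return b"".join(pool.map(lzma.compress, chunks))


# Test data: one 64 KiB random block repeated 16 times (1 MiB in total),
# so most of the redundancy spans chunk boundaries.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
data = block * 16

single = lzma.compress(data)
chunked = compress_in_chunks(data, chunk_size=256 * 1024)

# lzma.decompress() accepts concatenated streams, so the round trip still works.
assert lzma.decompress(chunked) == data
print(len(single), len(chunked))
```

The data here is deliberately a worst case; on typical files and with larger chunks the difference is much smaller, which matches the "slightly worse" described above.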

     
  • Victor
    2014-02-20

    Can the 5.2.0 multithreading feature be used on Linux?

     
  • Lasse Collin
    2014-02-21

    Threading in 5.1.3alpha (and 5.2.0, whenever it gets released) works on GNU/Linux and other OSes that support pthreads. For Windows there's support for native threading APIs.