Does 7z.dll support multithreaded calls to extract the same archive?

blackwim
2013-11-22
2014-01-22
  • blackwim
    2013-11-22

    Does 7z.dll support multithreaded calls to extract an archive?
    In this topic https://sourceforge.net/p/sevenzip/discussion/45797/thread/a0e2f3aa, Igor Pavlov said "Yes, you can extract from different archives with multithreading."
    Is it safe to extract the same archive to different directories from multiple threads? I know this scenario is unusual, but sometimes my software needs to extract the same archive to different directories.

     
  • Igor Pavlov
    2013-11-22

    7z.dll supports that only for some formats and not for others.
    So it's not safe to call it that way.

     
  • blackwim
    2013-11-22

    I don't understand.
    When extracting the same archive to different directories, the archive is only read, the target directories are different, and the objects created by 7z.dll's CreateObject are also different. Why is this not thread-safe?

     
  • Igor Pavlov
    2013-11-22

    You can use multithreading to extract from different archive objects, and each archive object must be opened with its own IInStream instance.

    You can't extract with multithreading from the same archive object (for some archive types).
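
The rule above (one archive object and one input stream per thread) can be sketched as follows. This is only an illustration of the pattern, assuming Python's standard `zipfile` module as a stand-in; it does not call 7z.dll, and the helper names `extract_copy`, `extract_to_many` and `demo` are hypothetical.

```python
import os
import tempfile
import zipfile
from concurrent.futures import ThreadPoolExecutor


def extract_copy(archive_path, target_dir):
    """Extract the archive using an archive object owned by this call only."""
    os.makedirs(target_dir, exist_ok=True)
    # Each call opens its own ZipFile, which opens its own file handle,
    # so no archive object or input stream is shared between threads.
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)


def extract_to_many(archive_path, target_dirs):
    """Extract the same archive into several directories in parallel."""
    with ThreadPoolExecutor(max_workers=len(target_dirs)) as pool:
        # list() waits for all workers and re-raises any worker exception.
        list(pool.map(lambda d: extract_copy(archive_path, d), target_dirs))


def demo():
    """Build a tiny sample archive and extract it into four directories."""
    work = tempfile.mkdtemp()
    archive_path = os.path.join(work, "sample.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("hello.txt", "hello from the archive\n")
    targets = [os.path.join(work, f"out{i}") for i in range(4)]
    extract_to_many(archive_path, targets)
    return targets
```

With 7z.dll the analogous approach would be to create a separate archive object per thread and open each one with its own IInStream, as described in the post above.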

     
  • blackwim
    2013-11-22

    Thank you very much.

     
  • Vladimir
    2014-01-15

    Does 7-Zip use an algorithm similar to the one used by the lesser-known uberzip? If not, will it be used in the future?

    http://www.matthicks.com/2008/01/multithreaded-unzip.html

    And a question from a layman to a programmer: what, in theory, prevents making an archiver or archive format that could use all CPU cores during extraction when the archive contains a single file? This question gets rather little coverage in the world press :)
    Thanks!

     
  • Shell
    2014-01-16

    I'll try to answer the second question. There are algorithms that must run in a single thread. For example, consider the extraction part of Delta compression. We have a row of differences stored in an archive: x[1], x[2]-x[1], x[3]-x[2], etc. We cannot calculate x[3] without knowing x[2] first, and so on, so we have to process the x[i] sequentially. Now, if we want to use multithreading during extraction, we have to choose a compression method that is free of this problem, but in that case we usually sacrifice compression ratio. For example, BWT supports multithreading during extraction, but it is not as effective as LZMA.
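
The sequential dependency in the Delta example can be seen in a minimal sketch. The function names are hypothetical and this is not 7-Zip's Delta filter, just the same idea on a list of integers: each decoded value needs the one decoded before it.

```python
def delta_encode(values):
    """Store the first value, then the difference to the previous value."""
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Rebuild the values; every step depends on the previous result."""
    values = []
    current = 0
    for delta in deltas:
        current += delta  # needs the value restored in the previous iteration
        values.append(current)
    return values
```

For the input `[3, 5, 9, 10]` the stored differences are `[3, 2, 4, 1]`, and the decoder has to walk them in order to restore the original row.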

    There is another possibility for multithreading: extracting files from different solid blocks (or from algorithm-specific subblocks) in parallel. This is possible for any compression method, provided there really are several solid blocks. However, we usually cannot write the extracted files to disk in parallel (we can, but it will be even slower), and since the disk subsystem is the slowest part, the gain from such multithreading will be negligible.
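
The block-level approach can be sketched like this, assuming data that was split into independently compressed blocks. Python's `zlib` (Deflate) stands in for the codec, and the block size and function names are hypothetical; a real 7z solid block is a different structure.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1 << 20  # hypothetical block size: 1 MiB


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each block on its own so blocks do not depend on each other."""
    return [
        zlib.compress(data[start:start + block_size])
        for start in range(0, len(data), block_size)
    ]


def decompress_blocks_parallel(blocks, workers=4):
    """Decompress independent blocks in worker threads and keep their order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so a plain join is enough.
        return b"".join(pool.map(zlib.decompress, blocks))
```

Because every block starts from a fresh compressor state, no block can refer to data in another block, which is the compression-ratio cost mentioned above. Writing the results to disk is left out of the sketch on purpose.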

     
  • Vladimir
    2014-01-17

    Today we have SSDs, and RAID 0/10 arrays have been around for many years, so disk speed is not a problem for high-end servers or for most SSD-equipped desktops and laptops.

     
    • Shell
      2014-01-17

      It is, because even modern SSDs are slower than RAM. Of course, if you have a RAM disk, then file-level multithreading will do the trick, but otherwise I think it has a negligible impact on performance.

       
      Last edit: Shell 2014-01-17
  • Vladimir
    2014-01-18

    Yes, this is true in most cases. But I found that not all options and files give the same result.
    If you compress an already-compressed AVI (DivX, Xvid or whatever else), even with the "fastest" compression level and LZMA2, the unpacking speed is ~80 MB/s, which is far below the maximum speed of a RAM disk and even of an HDD.
    Another example is a VDI file (VirtualBox drive image) packed with ULTRA compression; the archive size is 63% of the original file (1.1 GB).

    The unpacking speed is still ~80 MB/s, and even changing the compression level to fastest does not change the unpacking speed. Only "store", without compression, unpacks at over 1200 MB/s.
    What am I doing wrong? I think this time the bottleneck is the single thread?

     
    Last edit: Vladimir 2014-01-18
    • Shell
      2014-01-18

      I don't know the reason, but the current LZMA2 implementation is not the fastest. If you want to increase decompression speed, try the Deflate or BWT (i.e. bzip2) algorithms. For example, testing a Deflate64-packed archive is 3 times faster on my laptop than testing an LZMA2-packed archive (the tests were conducted on a RAM disk).
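
A comparison of this kind can be roughly reproduced in memory with Python's standard codecs. This is a sketch under assumptions, not a 7-Zip benchmark: `zlib` is plain Deflate rather than Deflate64, `lzma` writes the .xz container (which uses LZMA2), the function name is hypothetical, and the numbers depend heavily on the data and the machine.

```python
import bz2
import lzma
import time
import zlib


def measure_decompression(data, repeats=3):
    """Return single-threaded decompression throughput in MB/s per codec."""
    codecs = {
        "deflate (zlib)": (zlib.compress, zlib.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
        "lzma2 (xz)": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            restored = decompress(packed)
            best = min(best, time.perf_counter() - start)
        if restored != data:
            raise ValueError(f"{name}: round trip failed")
        # Guard against a zero reading from a coarse timer.
        results[name] = len(data) / max(best, 1e-9) / 1e6
    return results
```

To get figures comparable to the ones in this thread, the input should be a large real file (for example the same AVI or VDI data) rather than a small synthetic buffer.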

      As for implementing multithreaded extraction, you may file a feature request in the appropriate forum. But I think it would not be implemented very quickly because of the programming difficulties.

       
      • Vladimir
        2014-01-22

        Yes, I've tried bzip2 too, and it also does not use the full bandwidth on one big file in an archive. But thank you for the suggestion; I'll create the request later.