Compression speed with gzip

2004-11-16
2012-12-08
  • Nobody/Anonymous

    Hi,

    Is it just me or is the compression speed of 7-zip very slow when using the gzip format, as opposed to using the command line utility from www.gzip.com?

    I'm using WinXP SP2 on a 1.7 GHz PC with 512 MB of RAM and 7-Zip 4.10, and I'm compressing a 652 MB text file (an Oracle database dump file) using normal gzip compression and a word size of 32.  It takes in excess of 15 minutes to compress the file in 7-Zip, but only 1 min 1 s to compress the same file with all standard options using the command line utility of gzip 1.3.5.  Apparently, 7-zip uses the algorithm from gzip.com, so that cannot be the difference, and the fact that 7-Zip is a GUI should not affect the performance in this way.  Can anyone with any knowledge of this explain the reason for the slowdown?

    I have noticed that the other algorithms also seem quite slow when compared to WinZip and WinRAR, but this is just a general feeling and not something I have investigated further.  Has anyone else noticed this difference in speed or tried to benchmark 7-Zip against other compression utilities?

    Regards,
    Andrew

     
    • my space

      my space - 2004-11-16

      > Is it just me or is the compression speed of 7-zip very slow when using the gzip format,
      > as opposed to using the command line utility from http://www.gzip.com.

      Yes, you are right: 7za is often slower than gzip.

      My test:
      87224320 glibc-2.3.1.tar
      17882515 glibc-2.3.1.tar.gz  - 27 sec with gzip 1.3.5 from cygwin  (gzip -9 glibc-2.3.1.tar)
      16010310 glibc-2.3.1.tar.gz  - 40 sec with 7za 4.10b (7za a -tgzip glibc-2.3.1.tar.gz glibc-2.3.1.tar)

      So 7-Zip is about 1.5 times slower here (40 sec vs. 27 sec), but it has a better compression ratio in the .gz format: its output is roughly 10% smaller!

      > Apparently, 7-zip uses the algorithm from gzip.com, so that cannot be the difference
      No: 7-zip uses another algorithm to produce a smaller file in the Deflate compressor!

      By the way, gzip has some parts of code that are written in pure assembler,
      whereas 7-zip is pure C/C++ code.

      Conclusion: 7-Zip is often slower, but it often compresses better.
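
      The arithmetic behind the test figures above can be checked with a short Python sketch. The sizes and times are copied from the test; the function names are only illustrative, and "MB" here is assumed to mean 10**6 bytes.

```python
# Figures copied from the test above: sizes in bytes, times in seconds.
ORIGINAL_SIZE = 87224320  # glibc-2.3.1.tar

RUNS = {
    "gzip 1.3.5 -9": (17882515, 27),
    "7za 4.10b -tgzip": (16010310, 40),
}


def percent_of_original(compressed_size, original_size=ORIGINAL_SIZE):
    """Compressed size as a percentage of the uncompressed size."""
    return 100.0 * compressed_size / original_size


def throughput_mb_per_s(seconds, original_size=ORIGINAL_SIZE):
    """Input consumed per second, in units of 10**6 bytes (an assumed unit)."""
    return original_size / seconds / 1e6


if __name__ == "__main__":
    for name, (size, seconds) in RUNS.items():
        print(f"{name}: {percent_of_original(size):.2f}% of original, "
              f"{throughput_mb_per_s(seconds):.2f} MB/s")

    gzip_size, gzip_seconds = RUNS["gzip 1.3.5 -9"]
    sevenzip_size, sevenzip_seconds = RUNS["7za 4.10b -tgzip"]
    # How much longer 7za took, and how much smaller its output was.
    print(f"time factor: {sevenzip_seconds / gzip_seconds:.2f}x")
    print(f"size saving: {100.0 * (1 - sevenzip_size / gzip_size):.2f}%")
```

      On these figures, gzip's output is about 20.5% of the original and 7za's about 18.4%; 7za took roughly 1.48 times as long and produced a file about 10.5% smaller.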

       
    • Nobody/Anonymous

      Hi and thanks for the reply.

      > Apparently, 7-zip uses the algorithm from gzip.com, so that cannot be the difference
      > No: 7-zip uses another algorithm to produce a smaller file in the Deflate compressor!

      I had only assumed that the algorithm was the same because Jean-Loup Gailly's web site (www.gzip.org; sorry for the bad link in my main post) mentions that 7-Zip is a program that includes "the gzip compression code", which leads the reader to believe that he is referring to his own code.  From his web site:

      "Is there a Windows interface for gzip?
      PowerArchiver 6.1, 7-zip and Winzip include the gzip compression code and can decompress .gz and tar.gz files. Win-GZ can compress and decompress files in gzip format. Please note that gzip, 7-zip, PowerArchiver 6.1 and Win-GZ are freeware but you must register Winzip and PowerArchiver > 6.1 if you use them regularly."

      I think it is a shame that 7-Zip's implementation is so slow.  WinRAR's RAR format and 7-Zip's 7z format are both good competitors for best compression (which takes considerably longer than ZIP), but I would also like an option to compress to a reasonable size very quickly.  The ZIP format fulfills this quite well, but the gzip format has the potential to compress to roughly the same size as ZIP (in most cases a few KB smaller) in a very short time.  If the difference is due to the gzip CLI being written in assembly, then there is nothing to be done about it: it is unreasonable to expect Igor to put in that much work and rewrite portions of the code in assembly to gain speed when the functionality is OK.  But if it is something else, like unnecessary code in a loop that is slowing things down, then maybe he could look at optimising things.

      Regards,
      Andrew

       
    • Nobody/Anonymous

      Hi,

      I just want to clear up what I think is a misunderstanding:  I realise that 7-Zip contains its own compression algorithm, called 7z, amongst other algorithms, and in my main post I am referring to the code for the gzip algorithm when I say that '7-Zip uses the algorithm from www.gzip.org'.  I do not mean that the 7z format uses the algorithm from that web site.  As I am led to believe by www.gzip.org that the code for gzip is the same, I wonder why the slowdown is so great.

      Regards,
      Andrew

       
    • Nobody/Anonymous

      The difference mainly comes from the heavy speed optimization in gzip. Furthermore, gzip sacrifices compression ratio in favor of speed.
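
      The trade-off between speed and ratio can be observed within a single Deflate implementation. The sketch below uses Python's standard zlib module, not gzip 1.3.5 or 7-Zip's own Deflate encoder, so it only illustrates the general effect. The sample data is synthetic and the helper names are made up for this example.

```python
import time
import zlib


def make_sample(rows=20000):
    """Build synthetic, text-like data (a stand-in for a database dump)."""
    lines = [
        f"INSERT INTO orders VALUES ({i}, 'customer_{i % 97}', {i * 7 % 1000});\n"
        for i in range(rows)
    ]
    return "".join(lines).encode("ascii")


def measure(data, level):
    """Compress data at the given zlib level; return (size, seconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


if __name__ == "__main__":
    data = make_sample()
    print(f"input: {len(data)} bytes")
    # Level 1 favours speed, level 9 favours ratio, level 6 is zlib's default.
    for level in (1, 6, 9):
        size, elapsed = measure(data, level)
        print(f"level {level}: {size} bytes in {elapsed:.4f} s")
```

      Higher levels typically spend more time searching for matches and produce smaller output; the exact numbers depend on the data and the machine.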

       
