Automatically detect best compression method

2006-12-19
2012-12-08
1 2 > >> (Page 1 of 2)
  • Nobody/Anonymous

    When compressing files, I often find myself manually compressing the same files 3 times, once with each supported compression method (LZMA, PPMd, BZip2), and keep the one with the best compression ratio.

    Is it possible to somehow do it automatically?

     
    • Nobody/Anonymous

      Some sort of preview perhaps, where 7-Zip quickly analyzes the data.

       
      • snn47a

        snn47a - 2008-06-10

        Imho it's not just which method LZMA or PPMd is best for a single file type, but which is best for a mix of different types.

        If you take the time to try different settings (dictionary/word size), you can achieve miraculous improvements, but I don't think this can be achieved with just a reference table, because compressibility will change. E.g.:
        I tried  to archive a folder with 7zip and WinRAR consisting of

        2025 files, 139 folders    968 705 491    973 701 120
        .jpg     718 files    718/719 MB
        .png     361 files    105/106 MB
        .htm*     31 files    1.25/1.32 MB
        .html    131 files    337/3,66 MB
        .gif     339 files    796 kB/1,74 MB
        .doc       7 files    13.1 MB

        As you can see, most formats provide about the same compression ratio until I decreased the dictionary and word size; that's when I got the large improvement.

        packer  format  time    size (kB)   method / settings
        7z      zip     10:40   927516
        rar     rar     6:03    926089
        7z      7z      20~~    920647      Bzip2
        rar     zip     1:49    920422
        7z      7z              905435      PPMd  256 dic 32 word 4GB
        7z      7z      9:40    901917      LZMA  64 MB dic, 256 word, 4GB    923/880 1632 kB/s
        7z      7z      26:37   894956      PPMd 1024 dic, 32 word, 4GB    916 434 753
        7z      7z      26:56   890906      PPMd 1024 MB 8 MB word 1GB    923/869
                    ========PPMd 1024 MB 8 MB word 64GB

         
        • deity

          deity - 2008-06-10

          Please try this test with FreeArc to compare.

           
          • snn47a

            snn47a - 2008-06-10

            I ran a few tests; the best I achieved was 904232 kB.

             
            • deity

              deity - 2008-06-10

              Hmm... a larger size than 7-Zip at its maximum settings?

               
    • Nobody/Anonymous

      I wouldn't mind even full compression, as long as I don't have to do it manually! :)

       
    • Nobody/Anonymous

      Hello everyone,

      possible workaround:
      - write a batch file with the three wanted command lines, ending up with three different archives
      - make the batch file compare them
      - make the batch file delete the second- and the third-best
      Integrating such a function into 7-Zip could possibly slow it down enormously and thus make it unusable in the eyes of the common user.
      Interesting idea though!

      Best regards!
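
The workaround above (compress three times, compare, keep the smallest) could be sketched roughly as follows. This is only an illustration working on in-memory data with the codecs in the Python standard library; PPMd is not available there, so Deflate (zlib) stands in for it, and the name `smallest_archive` is made up for this example. A real batch file would call the 7-Zip command line instead.

```python
import bz2
import lzma
import zlib

# Stand-in codecs from the Python standard library. PPMd is not part of it,
# so Deflate (zlib) takes its place purely for illustration.
CODECS = {
    "lzma": lambda data: lzma.compress(data, preset=9),
    "bzip2": lambda data: bz2.compress(data, compresslevel=9),
    "deflate": lambda data: zlib.compress(data, 9),
}


def smallest_archive(data):
    """Compress data with every codec; return (codec name, smallest output)."""
    results = {name: codec(data) for name, codec in CODECS.items()}
    best = min(results, key=lambda name: len(results[name]))
    return best, results[best]


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog\n" * 2000
    name, packed = smallest_archive(sample)
    print(f"{name}: {len(sample)} -> {len(packed)} bytes")
```

The cost is the one pointed out above: every file is compressed once per method, so the total time is roughly the sum of all three runs.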

       
    • Nobody/Anonymous

      for the record: a 7zip profiler that does something similar exists. Check

      http://sourceforge.net/forum/message.php?msg_id=3988231

      Ddot

       
    • Paul Bryson

      Paul Bryson - 2006-12-21

      Yes, please test my profiler.  It will test all file extensions in a directory with different compression methods to see which is best (where "best" means smallest size).  Once you have 'profiled' all of the file types you want, you can compress certain extensions with a certain method, and then compress others with a different method and add those to the original archive.  It should make efficient batch archiving easy.
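
The idea of profiling by extension can be sketched as below. This is not the profiler from the post, whose internals are not shown in this thread; it is a minimal illustration that assumes the files are already loaded as a name-to-bytes mapping and uses the Python standard library codecs (Deflate standing in for PPMd). The function name `profile_by_extension` is hypothetical.

```python
import bz2
import lzma
import zlib
from collections import defaultdict
from pathlib import PurePath

# Standard-library stand-ins; PPMd is not available there.
CODECS = {
    "lzma": lzma.compress,
    "bzip2": bz2.compress,
    "deflate": zlib.compress,
}


def profile_by_extension(files):
    """Map each extension to the codec with the smallest total output.

    files: dict of file name -> file contents (bytes).
    """
    totals = defaultdict(lambda: dict.fromkeys(CODECS, 0))
    for file_name, data in files.items():
        ext = PurePath(file_name).suffix.lower()
        for codec_name, codec in CODECS.items():
            totals[ext][codec_name] += len(codec(data))
    # "Best" means smallest summed size over all files of that extension.
    return {ext: min(sizes, key=sizes.get) for ext, sizes in totals.items()}
```

The resulting table could then drive one compression pass per method, each restricted to the extensions that method won.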

       
      • Nobody/Anonymous

        I don't know how to use it...

         
    • Nobody/Anonymous

      Imagine this.....

      You compress a DLL file. You expect LZMA to work best on this one. Wrong, PPMD worked better on that particular DLL file. When you try with the next one, LZMA works best.

      It's so unpredictable to compress data, even on the same filetypes. You need some analysis from 7zip itself to be sure of what works best. Or an autoswitching algorithm that quickly analyzes the data and selects the proper algorithm.

      Me think......
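
One way such a quick analysis might work is to trial-compress only a small sample of each file instead of the whole thing. The sketch below is an assumption about how that could be done, not something 7-Zip is known to do; the sample sizes and the names `sample_chunks` and `guess_best_codec` are invented for the example, and Deflate again stands in for PPMd.

```python
import bz2
import lzma
import zlib

# Standard-library stand-ins; PPMd is not available there.
CODECS = {
    "lzma": lzma.compress,
    "bzip2": bz2.compress,
    "deflate": zlib.compress,
}


def sample_chunks(data, chunk=16 * 1024, count=4):
    """Return `count` evenly spaced chunks of data, or all of it if small."""
    if len(data) <= chunk * count:
        return data
    step = (len(data) - chunk) // (count - 1)
    return b"".join(data[i * step:i * step + chunk] for i in range(count))


def guess_best_codec(data):
    """Pick the codec that compresses the sample best (a guess, not a proof)."""
    sample = sample_chunks(data)
    sizes = {name: len(codec(sample)) for name, codec in CODECS.items()}
    return min(sizes, key=sizes.get)
```

A sample can mislead: a codec that wins on 64 KB may lose on the full file, especially when a large dictionary would have found matches far apart.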

       
    • Nobody/Anonymous

      And perhaps a small "learning" function. 7zip stores the results of the analyzed data in a text or html file and uses that data to compare and recognize new similar data. Like an opening book for chess programs.
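
A "learning" table of this kind might look like the sketch below: the winning method per extension is stored in a small JSON file and reused the next time a file of that type turns up. The class name `MethodCache`, the file layout and the use of standard-library codecs (Deflate in place of PPMd) are all assumptions made for illustration.

```python
import bz2
import json
import lzma
import zlib
from pathlib import Path, PurePath

# Standard-library stand-ins; PPMd is not available there.
CODECS = {
    "lzma": lzma.compress,
    "bzip2": bz2.compress,
    "deflate": zlib.compress,
}


class MethodCache:
    """Remembers the winning codec per file extension in a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.table = json.loads(self.path.read_text()) if self.path.exists() else {}

    def choose(self, file_name, data):
        ext = PurePath(file_name).suffix.lower()
        if ext in self.table:
            # Extension seen before: reuse the stored answer, skip the trial.
            return self.table[ext]
        sizes = {name: len(codec(data)) for name, codec in CODECS.items()}
        best = min(sizes, key=sizes.get)
        self.table[ext] = best
        self.path.write_text(json.dumps(self.table, indent=2, sort_keys=True))
        return best
```

As the DLL example above shows, the stored answer is only a heuristic: the first file of a type decides for all later ones, even when a later file would have favoured another method.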

       
    • Nobody/Anonymous

      To speed things up. :)

      - The Swede

       
    • Nobody/Anonymous

      Hello everyone,

      correct me if I'm wrong, but wasn't there an effort by Igor to sort files by a list to improve compression (at least in solid archives)?

      Best regards!

       
    • Nobody/Anonymous

      Please tell me how to use this profiler.

       
    • Nobody/Anonymous

      "Imagine this.....

      You compress a DLL file. You expect LZMA to work best on this one. Wrong, PPMD worked better on that particular DLL file. When you try with the next one, LZMA works best.

      It's so unpredictable to compress data, even on the same filetypes. You need some analysis from 7zip itself to be sure of what works best. Or an autoswitching algorithm that quickly analyzes the data and selects the proper algorithm."

      Like the powerful new WinZip 11.0!

       
    • Nobody/Anonymous

      And WinRAR!

       
    • Nobody/Anonymous

      >> You compress a DLL file. You expect LZMA to work best on this one. Wrong, PPMD worked better on that particular DLL file. When you try with the next one, LZMA works best. 

      That is why you profile using several DLLs: to find the setting that is best on average.

      >> Or an autoswitching algorithm that quickly analyzes the data and selects the proper algorithm.

      This can be achieved via pre-filters I think.

       
    • deity

      deity - 2007-05-20

      http://www.winzip.com/whatsnew111.htm
      Our new "best compression" option allows you to let WinZip decide the best compression method for each file based on the file type. This will ensure that you maximize the compression of every file that you add to your Zip file....

      Are you planning to do it for 7-zip?

       
      • Igor Pavlov

        Igor Pavlov - 2007-05-21

        - Are you planning to do it for 7-zip?

        Yes, but that task is more difficult for solid archives.

         
        • deity

          deity - 2008-06-10

          One year since the promise to create autodetection...
          I think it is no longer important.
          Why?
          Look at http://freearc.org
          The new powerful archiver FreeArc, written by Bulat Ziganshin, can do it now!
          -Overall, 11 compression algorithms and filters are included (compared to 3 in 7-zip and 7 in RAR) and this number still grows
          -Includes LZMA, PPMD, TrueAudio and generic Multimedia compression algorithms with automatic switching by file type
          7-zip is a good archiver, but it's time to die.

           
          • Alex

            Alex - 2008-06-10

            totally disagree with the time to die comment. freearc is still missing essential features for me, e.g. volumes, and it is written using other people's compression routines. for me 7-zip is by far the best. yes, it could do with some new features like the auto file type stuff, but the stuff under the hood is excellent.

             
            • deity

              deity - 2008-06-10

              Yes, maybe so, but with the next versions of FreeArc it will be time to die for 7-zip, EXACTLY!
              Planned for FreeArc
              Version 0.60
              Dictionary of up to 1 GB in LZMA
              backup support (save file times / attributes / ACLs)
              Reed-Solomon codes for data recovery
              release of freearc.dll to support .arc in other archivers
              integration with Explorer
              full GUI, at the level of WinRAR
              Version 0.70
              - catch up with and overtake RAR!
              multi-volume and recovery volumes
              work with archives containing millions of files
              support for zip and other archive formats
              Version 0.80
              - the maximum compression!
              optimization for multicore CPUs
              support for compression algorithms with several outputs
              bcj2
              segmentation of files with further compression by lzma / ppmd / multimedia algorithms
              bmp/tif file compression algorithm
              changes to the archive format, in particular spreading the recovery record over the entire archive

               
            • Bulat Ziganshin

              Bulat Ziganshin - 2008-06-10

              >totally disagree with the time to die comment

              me too ;)  just look at the download rates - millions for 7-zip and thousands for fa

              but you say that fa is written using existing compression algos and that it is a bad thing. i don't think so. fa uses the best free compression methods available and i think that's very good. just look at the picture:

              - it includes lzma and ppmd algos. like 7-zip
              - it adds audio compression with True Audio algo - one of the best lossless audio compressors
              - it mixes the things with automatic detection of filetype - i.e. program autodetects which algorithm to use. if you have experience optimizing 7zip archives you know what this means
              - next it adds all the same filters as RAR and little more - REP (which is like lrzip, quite popular large-dictionary filter), DICT (that may be compared with XML-WRT), LZP (which is recommended by PPMD author to improve compression ratio)
              - even more, it adds two more algorithms specifically for fast compression modes
              - using filetype detection feature, it skips compression on already compressed files
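
How FreeArc detects file types is not described in this post. As a rough illustration of the last point only, skipping compression for data that is already compressed could be done by checking a file's leading "magic" bytes. The function name `pick_method` and the choice of "lzma" as the default are assumptions made for this sketch.

```python
# Leading bytes of some common, already-compressed formats.
COMPRESSED_MAGIC = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"PK\x03\x04": "zip",
    b"\x1f\x8b": "gzip",
    b"7z\xbc\xaf\x27\x1c": "7z",
    b"Rar!\x1a\x07": "rar",
}


def detect_compressed(head):
    """Return a format name if head starts with a known magic, else None."""
    for magic, name in COMPRESSED_MAGIC.items():
        if head.startswith(magic):
            return name
    return None


def pick_method(head):
    """Store already-compressed data as-is; compress everything else."""
    return "store" if detect_compressed(head) else "lzma"
```

For example, `pick_method(b"\xff\xd8\xff\xe0")` returns `"store"`, while plain text falls through to `"lzma"`.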

              just look at the list. these are exactly the features that you have been asking for for years. of the 11 algos in fa, 5 are my own. these are small algos, mainly filters. when it was possible, i've used existing algos and implemented myself only methods that lack a free, open-source implementation

              my goal was always the most optimal compression for users, not records, and i've reached this goal - look at http://www.maximumcompression.com/data/summary_mf2.php#data

              nevertheless, i don't think that freearc has *already* killed 7-zip. it should become much more stable and provide a reasonable GUI, at the very least.

              also i don't think that we compete with Igor. i think that main Igor's work is lzma algorithm itself. it's excellent. it is the thing he is paid for. 7-zip by itself looks like a tool to show lzma potential for potential clients. it's solid, highly reliable, has a huge user base but nothing more

              fa, on the other side, is concentrated not on showing lzma strength but on providing the best compression technologies available as well as best archiving features - updatable solid archives, portability, rich command line. so it also advertises lzma by making it a necessary part of the best archiver on the planet. and you, users, win - because you get excellent lzma compression together with a rich set of other outstanding algos and a lot of archive processing features that you've also been asking for for years. we just work in cooperation. and i think that it's much better for users than trying to develop everything from scratch

               