Recursive archive compression

  • Nobody/Anonymous

    I have often thought about a technique that could, under some circumstances, give an even higher compression ratio than plain solid compression:

    Imagine you have a folder with lots of files.
    Inside this folder, there are .ZIP, .RAR and other archives.

    While most other files get smaller when compressed, the archives usually just add some overhead (file information), and they also "trash" the compression dictionary with (seemingly) almost random data.

    The compression ratio would be MUCH higher if you compressed the files INSIDE those archives instead of putting the archives themselves among the other non-archive files. This would have the most impact if you made a solid archive.
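
    A rough way to check this claim is the sketch below. Everything in it is an assumption made for illustration: the sample data is generated, Python's zipfile stands in for the archive found in the folder, and a single lzma stream stands in for a solid archive. It compares keeping a ZIP as an opaque blob against unpacking it before the solid pass.

```python
import io
import lzma
import random
import zipfile

# Hypothetical sample data: pseudo-random "text" built from a small vocabulary.
rng = random.Random(42)
words = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon", b"zeta", b"eta", b"theta"]
plain_file = b" ".join(rng.choice(words) for _ in range(20000))

# A near-copy of the same file, like a backup kept next to the original.
backup_file = b"backup copy\n" + plain_file

# Pack the near-copy into an in-memory ZIP, standing in for an archive in the folder.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("backup.txt", backup_file)
zipped_backup = buf.getvalue()

# Case 1: one solid stream over the plain file plus the ZIP as an opaque blob.
size_with_archive = len(lzma.compress(plain_file + zipped_backup))

# Case 2: one solid stream over the plain file plus the ZIP's unpacked content.
size_unpacked = len(lzma.compress(plain_file + backup_file))

print("archive kept as-is:", size_with_archive, "bytes")
print("archive unpacked:  ", size_unpacked, "bytes")
```

    The gap is large here because the archived file is nearly identical to a file outside the archive. With unrelated content the gain would be smaller.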

    The main problem is obvious:
    What happens when you decode the archive? There are two options:
    1) The encapsulated archives are re-created in their original format. That would be a bit difficult, because you would have to make sure the result stays compatible with the old format.
    2) A new folder is created for each archive (with the format extension, like ".RAR" or ".7z", prepended to the folder name).

    I have many folders containing "mini-backups" of sub-structures.
    Recursive archive compression would GREATLY improve the compression of those folders.

    • Nobody/Anonymous

      This has already been discussed in another thread. I couldn't find it again, but I can recall some other problems with such a feature:

      3) It would take a long time to program
      4) Decompression would be much slower
      5) It doesn't guarantee better compression; it could actually give worse compression
      6) 7-Zip would have to be able to compress to far more formats

      Once 7-Zip is capable of decompressing all archives inside a directory (and the directories under that directory, etc.), you could decompress whatever you want to and then compress it.
      For now, if you want better compression, just decompress everything manually and then compress everything.
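
      The manual workaround could be scripted roughly as follows. This is only a sketch under stated assumptions: it handles ZIP files only (Python's standard library cannot read RAR or 7z), the ".unpacked" folder suffix and both function names are made up, and a .tar.xz file stands in for a solid 7-Zip archive. It deletes each archive after unpacking it, so it should be run on a copy of the folder.

```python
import tarfile
import zipfile
from pathlib import Path

def unpack_nested_zips(root: Path) -> int:
    """Replace every ZIP under root with a folder of its contents; return the count."""
    count = 0
    pending = list(root.rglob("*.zip"))
    while pending:
        archive = pending.pop()
        # Skip directories and files that merely carry a .zip name.
        if not archive.is_file() or not zipfile.is_zipfile(archive):
            continue
        # Hypothetical naming scheme: "docs.zip" becomes the folder "docs.zip.unpacked".
        target = archive.with_name(archive.name + ".unpacked")
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        archive.unlink()
        count += 1
        # Archives nested inside the archive just unpacked are handled too.
        pending.extend(target.rglob("*.zip"))
    return count

def solid_compress(root: Path, out_file: Path) -> None:
    """Pack the whole tree into one .tar.xz so all files share one compression stream."""
    with tarfile.open(out_file, "w:xz") as tar:
        tar.add(root, arcname=root.name)
```

      Calling unpack_nested_zips(Path("my_copy")) and then solid_compress(Path("my_copy"), Path("my_copy.tar.xz")) would do both steps, where "my_copy" is a placeholder folder name. The original archives are not re-created on extraction, which is option 2 from the first post.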

    • Nobody/Anonymous

      The biggest problem is that 7-Zip is lossless. If you use this technique, it won't be lossless anymore, because it's almost impossible to duplicate an archive exactly.
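
      To illustrate why an exact duplicate is hard to get, here is a small sketch using Python's zipfile. The file name, the fixed timestamp and the two Deflate levels are assumptions chosen for the demonstration. The same content is packed twice with different encoder settings, the way a re-creating tool might differ from whatever tool made the original.

```python
import io
import random
import zipfile

def make_zip(payload: bytes, level: int) -> bytes:
    """Build a one-file ZIP in memory with a fixed timestamp and the given Deflate level."""
    info = zipfile.ZipInfo("notes.txt", date_time=(2020, 1, 1, 0, 0, 0))
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, payload, compress_type=zipfile.ZIP_DEFLATED, compresslevel=level)
    return buf.getvalue()

def read_payload(archive: bytes) -> bytes:
    """Extract the single file from an in-memory ZIP."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read("notes.txt")

# Hypothetical sample content.
rng = random.Random(0)
words = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon", b"zeta", b"eta", b"theta"]
payload = b" ".join(rng.choice(words) for _ in range(20000))

original = make_zip(payload, level=1)                # settings of the unknown original tool
rebuilt = make_zip(read_payload(original), level=9)  # re-creation with other settings

print("file content identical: ", read_payload(rebuilt) == payload)
print("archive bytes identical:", rebuilt == original)
```

      The extracted file matches in both cases, but the archive itself does not match byte for byte unless the encoder and its settings are reproduced exactly.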

      This technique would be handy as an extra option in another program like ZipGenius or something. If you make a self-extractor, you won't have to worry about compatibility.

