
#1486 Inconsistent PPMd compression ratio

Milestone: None
Status: open-rejected
Owner: nobody
Labels: None
Priority: 5
Updated: 2015-02-20
Created: 2015-02-20
Creator: quanta
Private: No

In 7-Zip 9.22 beta, when compressing HWiNFO32.INI from HWiNFO32 v4.20-1960[1], the compressed size (not size of archive) depends on archive format:

HWiNFO32.INI original size: 15463 bytes

Settings: dictionary size=1MB*, ultra (-mx=9) compression level, solid block size=solid (7z)

Compressed size in bytes, by word size:

Format    7     8     9     10    11    12    13
7z        3928  3954  3957  4074  4232  4205  3954
ZIP       3929  3929  3932  3937  3941  3941  3944

* 7-Zip always sets the dictionary size to 256kB when creating the 7z archive.

The change in dictionary size does not explain the discrepancies in compressed size, since 256kB is enough memory to hold the entire dictionary for the given word sizes. Discrepancies aside, the compressed size actually increases, and only in the 7z archive, when the word size goes down from 13 to 11. Whatever the cause, the compressed sizes should be the same across archive formats for identical combinations of word size and dictionary size. More importantly, reducing the word size must not make the compressed size grow merely because a different archive format is used.
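The pattern in the table can be checked with a short script. This is only a sketch of the arithmetic: the numbers are copied from the table above, and the helper name `growth_when_reducing` is made up for illustration.

```python
# Compressed sizes in bytes, copied from the table in this report.
WORD_SIZES = [7, 8, 9, 10, 11, 12, 13]
SIZES = {
    "7z":  [3928, 3954, 3957, 4074, 4232, 4205, 3954],
    "ZIP": [3929, 3929, 3932, 3937, 3941, 3941, 3944],
}

def growth_when_reducing(sizes):
    """Return (from_word, to_word, delta) for every one-step reduction
    in word size that makes the compressed size grow."""
    steps = []
    # Walk from the largest word size down to the smallest.
    for i in range(len(WORD_SIZES) - 1, 0, -1):
        delta = sizes[i - 1] - sizes[i]
        if delta > 0:
            steps.append((WORD_SIZES[i], WORD_SIZES[i - 1], delta))
    return steps

for fmt, sizes in SIZES.items():
    spread = max(sizes) - min(sizes)
    print(fmt, "spread:", spread, "bytes; growth steps:", growth_when_reducing(sizes))
```

For 7z the growth steps are 13 to 12 (+251 bytes) and 12 to 11 (+27 bytes); for ZIP there are none, and its total spread across all word sizes is 15 bytes against 304 bytes for 7z.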

[1] See HWiNFO32.INI from attachment HWiNFO32-v4.20-1960.zip
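For anyone who wants to repeat the measurements, the following is a rough sketch rather than the exact procedure used for the table. It assumes a `7z` executable on the PATH, assumes the documented switches `-m0=PPMd:mem=...:o=...` (7z) and `-mm=PPMd -mmem=... -mo=...` (zip) behave the same in the installed version, and reads the "Packed Size" field from `7z l -slt`. The function names are hypothetical.

```python
import os
import re
import subprocess
import tempfile

def build_add_command(fmt, word_size, archive, source, exe="7z"):
    """Build the 7-Zip command line for one PPMd run (fmt is '7z' or 'zip')."""
    if fmt == "7z":
        # Word size is the PPMd model order (o), dictionary size is mem.
        method = [f"-m0=PPMd:mem=1m:o={word_size}", "-ms=on"]
    elif fmt == "zip":
        method = ["-mm=PPMd", "-mmem=1m", f"-mo={word_size}"]
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return [exe, "a", f"-t{fmt}", "-mx=9", *method, archive, source]

def packed_size(archive, exe="7z"):
    """Return the first 'Packed Size' value from the technical listing, or None."""
    out = subprocess.run([exe, "l", "-slt", archive],
                         capture_output=True, text=True, check=True).stdout
    match = re.search(r"^Packed Size = (\d+)", out, re.MULTILINE)
    return int(match.group(1)) if match else None

def measure(source, word_sizes=range(7, 14)):
    """Compress source once per format and word size; return {fmt: [sizes]}."""
    results = {"7z": [], "zip": []}
    with tempfile.TemporaryDirectory() as tmp:
        for fmt in results:
            for order in word_sizes:
                # A fresh archive name per run, so 'a' never updates an old archive.
                archive = os.path.join(tmp, f"test_o{order}.{fmt}")
                subprocess.run(build_add_command(fmt, order, archive, source),
                               capture_output=True, check=True)
                results[fmt].append(packed_size(archive))
    return results

if __name__ == "__main__":
    print(measure("HWiNFO32.INI"))
```

Results may differ from the table if the installed 7-Zip version is not 9.22 beta.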

1 Attachments

Discussion

  • Igor Pavlov

    Igor Pavlov - 2015-02-20

    1) The two formats use different variants of PPMd:
    7z uses PPMd var.H
    zip uses PPMd var.I

    2) The dictionary size for PPMd is the size of its memory buffer. A 1 MB buffer in PPMd can overflow even with a 256 KB file.
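    Point 2 can be illustrated with a toy count. This is not PPMd's actual data structure or memory layout, only a sketch under the assumption that a context model keeps some state for every distinct context up to the chosen order; the function name and the sample data are invented for illustration.

```python
def distinct_contexts(data: bytes, max_order: int) -> dict:
    """Count the distinct byte sequences of each length 1..max_order in data."""
    counts = {}
    for order in range(1, max_order + 1):
        seen = {data[i:i + order] for i in range(len(data) - order + 1)}
        counts[order] = len(seen)
    return counts

# Hypothetical INI-like sample, not the attached HWiNFO32.INI.
sample = b"".join(b"Sensor%d=%d\n" % (n, n % 2) for n in range(200))
per_order = distinct_contexts(sample, 8)
print(len(sample), "input bytes,", sum(per_order.values()), "distinct contexts")
```

    The total number of contexts can exceed the number of input bytes, and it grows with the order, so in this toy model the memory needed is not bounded by the file size alone.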

     
  • Igor Pavlov

    Igor Pavlov - 2015-02-20
    • status: open --> open-rejected
     
  • quanta

    quanta - 2015-02-20

    Did you even read the report before rejecting it? First of all, it is 7-Zip that insists on using a 256kB dictionary/buffer when 1MB is chosen in 7-Zip. If the dictionary overflows, that is 7-Zip's fault for forcing a wrong dictionary size in the first place. Secondly, the original file size is 15463 bytes, not 256kB, so the overflow you describe simply cannot happen with the supplied attachment and word sizes. Thirdly, you failed to address the unusual increase in compressed size in the 7z archive when the word size goes down from 13 to 11 but not when it goes down from 11 to 7, even though all of these runs use the same PPMd version.

    PPMd version issue aside, it seems the encoding process could be improved so that decreasing the word size always decreases the compressed size, unless the word size is too small. However, the way this bug was rejected raises the question of whether there is any plan to actually fix it, or whether pretending it was never filed is the preferred response.

     
  • Igor Pavlov

    Igor Pavlov - 2015-02-20

    I don't see any bug.
    Compression ratio can be higher or lower for different values of "Word size" option.

    Dictionary size is OK. For such a small file, 7-Zip reduces the dictionary size from 1 MB to 256 KB.

     
