
Should all parts of a multipart ZIP be exactly the same size (except the last part)?

2022-03-29
2023-11-22
  • Andrey Klimkin

    Andrey Klimkin - 2022-03-29

    I'm compressing a folder with roughly half a million files and folders, about 1 terabyte in total. Yikes!
    I asked 7-Zip to produce a multipart archive and set the volume size to 5 GB.
    The first 4 parts are all the same size, 5 368 709 120 bytes, but starting from the 5th they are all different: from 3 460 to 5 357 thousand bytes.

    Is this a sign of a failed compression procedure? Should I worry about it? Will I be able to decompress the resulting archive?
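
    The check described above (every part except the last should match the configured volume size) can be sketched in a few lines of Python. This is only an illustration: `check_volume_sizes` and the part file names in the usage note are hypothetical, and the sizes come from `os.stat`, not from 7-Zip itself.

```python
import os

# 5 * 1024**3 = 5 368 709 120 bytes, the size reported for the first four parts.
VOLUME_SIZE = 5 * 1024**3


def check_volume_sizes(paths, volume_size):
    """Return (path, size) for every part except the last whose size differs.

    `paths` must be in volume order; the last part may legitimately be shorter.
    """
    sizes = [(p, os.stat(p).st_size) for p in paths]
    return [(p, s) for p, s in sizes[:-1] if s != volume_size]
```

    For example, `check_volume_sizes(["backup.zip.001", "backup.zip.002", "backup.zip.003"], VOLUME_SIZE)` (hypothetical names) returns an empty list when only the final part is short.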

     
  • Igor Pavlov

    Igor Pavlov - 2022-03-29

    It looks like some kind of failure.
    The ZIP encoding code in 7-Zip is pretty complicated because it supports complex things like multithreading, rewriting for incompressible data, internal data write caching and some other things. So maybe some things do not work as expected in rare cases, if your problem report is correct.
    Further investigation of the problem is required.

    Are you using the latest 7-Zip, 21.07?
    What exact mode and settings do you use?
    What is the size of the largest file in the data set?

     

    Last edit: Igor Pavlov 2022-03-29
  • Andrey Klimkin

    Andrey Klimkin - 2022-03-29

    Thanks, Igor, for the prompt response.
    It now looks to me like an incorrect file size display issue in Windows 10 rather than a 7-Zip issue.
    After I copied the files mentioned above to another volume, they all magically became the "right" size, both at the destination and at the source :-o
    It does not matter which program I use to list the directory (Windows Explorer, Total Commander or 7-Zip itself): they all report the same incorrect size (screenshots attached). Could it be a bug in the underlying file system API?
    I was unable to find anything similar to this behavior on the internet. There are some discussions about incorrect folder size calculations under Win10 because of long pathnames, but this looks like a different issue to me.

     
  • Andrey Klimkin

    Andrey Klimkin - 2022-03-30

    It seems to me that 7-Zip does not close (release) the file handle of the current part when that part reaches its designated size and 7-Zip moves on to the next part. In fact, the file handles of all parts of a multivolume archive stay open until it has finished processing all the data. Until then, Windows for some reason cannot show the correct file size of each part.

    Can anyone confirm or deny this theory? Has anyone else seen this sort of behavior?

     
  • Igor Pavlov

    Igor Pavlov - 2022-03-30

    7-Zip can rewrite any data in any volume, so 7-Zip needs to keep all volumes open.
    For example, the zip local header for each file contains the compressed size, so 7-Zip rewrites the zip header after compressing the file data.
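
    The header rewrite described above can be illustrated with a minimal Python sketch. It is not 7-Zip's code: `write_entry` is a hypothetical helper that writes the 30-byte fixed part of a ZIP local file header with zeroed CRC and sizes, compresses the data, then seeks back and writes the real values. The field layout follows the ZIP format; the zeroed timestamp fields are a simplification.

```python
import io
import struct
import zlib

# Fixed part of a ZIP local file header: signature, version needed, flags,
# method, mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length (30 bytes in total).
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")
SIGNATURE = 0x04034B50
FLAG_UTF8 = 0x0800  # bit 11: file name is UTF-8
METHOD_DEFLATE = 8


def write_entry(out, name, data):
    """Write one deflated entry to a seekable stream and return the header offset."""
    name_bytes = name.encode("utf-8")
    header_pos = out.tell()
    # First pass: CRC and sizes are unknown, so write zeros as placeholders.
    out.write(LOCAL_HEADER.pack(SIGNATURE, 20, FLAG_UTF8, METHOD_DEFLATE,
                                0, 0, 0, 0, 0, len(name_bytes), 0))
    out.write(name_bytes)
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)  # raw deflate stream
    compressed = compressor.compress(data) + compressor.flush()
    out.write(compressed)
    end_pos = out.tell()
    # Second pass: seek back and rewrite the header with the real values.
    out.seek(header_pos)
    out.write(LOCAL_HEADER.pack(SIGNATURE, 20, FLAG_UTF8, METHOD_DEFLATE,
                                0, 0, zlib.crc32(data), len(compressed),
                                len(data), len(name_bytes), 0))
    out.seek(end_pos)  # continue appending after the compressed data
    return header_pos
```

    In a multivolume archive the header may sit in an earlier volume than the one where the compressed data ends, so the seek back can cross a volume boundary. That matches the reason given above for keeping all volumes open.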

     
  • Andrey Klimkin

    Andrey Klimkin - 2022-03-30

    OK, thanks for the comment. That explains the issue.
    UPD (for those interested in technical details):
    Why Windows reports wrong size for open files
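
    One way to observe the effect is to compare the size taken from a directory listing with the size obtained by querying the file directly. The sketch below is an illustration only: `listed_vs_queried_size` is a hypothetical helper. On Windows, `os.DirEntry.stat()` normally uses the data returned by the directory listing, while `os.stat()` queries the file itself. That the two can differ while another process holds the file open is an assumption based on this thread, not something the code proves.

```python
import os


def listed_vs_queried_size(path):
    """Return (size from the directory listing, size from a direct query).

    The first value is None if the entry is not found in its directory.
    Names are compared exactly, so pass the name as it is stored on disk.
    """
    directory, name = os.path.split(os.path.abspath(path))
    listed = None
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name == name:
                # On Windows this usually comes from the listing data.
                listed = entry.stat().st_size
                break
    queried = os.stat(path).st_size  # asks the file system about the file itself
    return listed, queried
```

    Running it against a part that 7-Zip is still writing and getting two different numbers would be consistent with the explanation above. For a closed file both values should match.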

     

    Last edit: Andrey Klimkin 2022-03-31
  • luciferr

    luciferr - 2023-11-22

    No, this is not necessarily a sign of a failed compression procedure. It is possible that the first four parts are simply more compressible than the later parts. 7-Zip will try to compress each part as much as possible, and the size of each part will depend on how much it can be compressed.

     

    Last edit: luciferr 2023-12-02
