#339 Add a less restrictive Integrity check option, which will pass certain archives as "Ok"

open
nobody
5
2019-11-09
2019-11-06
No

Problem: Some tools create ZIP archives in unusual ways, such as leaving extra unused space throughout the files in the archive. 7-Zip's "t" integrity check fails on these zips (and they also fail to extract properly), whereas other tools pass them and extract them just fine.

Solution: The introduction of a less restrictive integrity check option would be fantastic and leave existing functionality of "t" untouched. Perhaps "tl" ('l' for less).

Details:
When distributing game data to thousands of clients, we wish to minimize the bits that transfer to them. Steam has tools to facilitate this, but additional compression on our side is desirable: we want to keep certain assets compressed because this leads to faster load and stream times during gameplay. It isn't feasible to keep all assets uncompressed, as SteamPipe would recommend.

So we intend to keep our assets packed and compressed in ZIP files. However, we're working on new mechanisms where the zip archives change only what is minimally needed between two builds. This sometimes means leaving unused space after a file in an archive. All header data is still valid, though: size fields are still correct and offsets still point to valid data. There are just extra bytes left over (in the case of shrinking an asset).

The 'tl' less restrictive integrity check would simply validate integrity based on information provided in the headers, e.g. sizes and offsets. It would NOT do extra checks such as "does the entire file size add up, more or less, to what the headers indicate?" That would still be left to "t".
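To make the request concrete, here is a minimal Python sketch of what such a header-driven check could look like. It is only an illustration and not how 7-Zip works: `lenient_check` and `entry_ok` are hypothetical names, sizes and CRCs are taken from the central directory, only the "stored" and "deflate" methods are handled, and Zip64, encryption and multi-disk archives are ignored.

```python
import struct
import zlib

EOCD_SIG = b"PK\x05\x06"  # end of central directory record
CDH_SIG = b"PK\x01\x02"   # central directory file header
LFH_SIG = b"PK\x03\x04"   # local file header


def entry_ok(data, lfh, method, crc, csize, usize):
    """Validate one entry using only header-provided offsets and sizes."""
    if data[lfh:lfh + 4] != LFH_SIG:
        return False
    # Name/extra lengths come from the local header; they locate the payload.
    l_nlen, l_elen = struct.unpack_from("<HH", data, lfh + 26)
    start = lfh + 30 + l_nlen + l_elen
    payload = data[start:start + csize]
    if len(payload) != csize:
        return False
    try:
        if method == 0:        # stored
            raw = payload
        elif method == 8:      # raw deflate stream
            raw = zlib.decompress(payload, -15)
        else:                  # anything else is out of scope for this sketch
            return False
    except zlib.error:
        return False
    return len(raw) == usize and zlib.crc32(raw) == crc


def lenient_check(data):
    """Return the names of entries that fail; an empty list means all OK.

    Bytes that no header points at (slack space) are never inspected.
    """
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("end of central directory record not found")
    total, _cd_size, pos = struct.unpack_from("<HII", data, eocd + 10)
    failed = []
    for _ in range(total):
        if data[pos:pos + 4] != CDH_SIG:
            raise ValueError("central directory header expected at %d" % pos)
        method, crc, csize, usize, nlen, elen, clen = struct.unpack_from(
            "<H4xIIIHHH", data, pos + 10)
        (lfh,) = struct.unpack_from("<I", data, pos + 42)
        name = data[pos + 46:pos + 46 + nlen].decode("utf-8", "replace")
        pos += 46 + nlen + elen + clen
        if not entry_ok(data, lfh, method, crc, csize, usize):
            failed.append(name)
    return failed
```

Calling `lenient_check(open("some.zip", "rb").read())` (the file name is a placeholder) returns the failing entry names. Because each entry is located purely through its offset and size fields, leftover bytes after a shrunken asset do not affect the result.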

Discussion

  • Igor Pavlov

    Igor Pavlov - 2019-11-06

    You must create a zip example that is good for you and that fails with 7-Zip.
    Then we can discuss it.

     
  • Michael K Saucedo

    Of course. The one that says "extraSpace" is the one 7-Zip reports errors on, yet other zip programs report 'all OK' (such as IZArc and the Windows 10 built-in zip).
    The only difference in the extraSpace one is that a file called Client.json has been truncated from 2 KB to 1 byte (yet I keep all 2 KB there and just update the size fields).
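For readers who want to reproduce an archive like the "extraSpace" one, the following Python sketch shrinks a stored (uncompressed) entry in place and leaves the old bytes behind as slack. It is a guess at the kind of edit described above, not the actual pak writer: `shrink_stored_entry` is a hypothetical helper, it also updates the CRC-32 (which has to change along with the size fields), and it assumes no Zip64 fields and no data descriptors.

```python
import struct
import zlib


def shrink_stored_entry(archive, name, new_payload):
    """Overwrite a stored entry with a shorter payload without moving anything.

    `archive` is a bytearray holding the whole ZIP file. Its length does not
    change; the tail of the old payload stays in the file as unused space.
    """
    raw_name = name.encode("utf-8")
    eocd = archive.rfind(b"PK\x05\x06")
    if eocd < 0:
        raise ValueError("end of central directory record not found")
    total, _cd_size, pos = struct.unpack_from("<HII", archive, eocd + 10)
    for _ in range(total):
        (method,) = struct.unpack_from("<H", archive, pos + 10)
        (old_size,) = struct.unpack_from("<I", archive, pos + 20)
        nlen, elen, clen = struct.unpack_from("<HHH", archive, pos + 28)
        (lfh,) = struct.unpack_from("<I", archive, pos + 42)
        if archive[pos + 46:pos + 46 + nlen] == raw_name:
            if method != 0 or len(new_payload) > old_size:
                raise ValueError("only stored entries can shrink in place")
            l_nlen, l_elen = struct.unpack_from("<HH", archive, lfh + 26)
            start = lfh + 30 + l_nlen + l_elen
            # Same-length slice assignment: the archive keeps its size.
            archive[start:start + len(new_payload)] = new_payload
            # CRC-32, compressed size, uncompressed size (equal when stored).
            fields = struct.pack("<III", zlib.crc32(new_payload),
                                 len(new_payload), len(new_payload))
            archive[lfh + 14:lfh + 26] = fields   # local file header
            archive[pos + 16:pos + 28] = fields   # central directory header
            return
        pos += 46 + nlen + elen + clen
    raise KeyError(name)
```

After such an edit, every offset in the central directory is still correct, so a reader that locates entries purely through the headers should see a 1-byte Client.json followed by bytes that nothing refers to.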

     

    Last edit: Michael K Saucedo 2019-11-06
  • Igor Pavlov

    Igor Pavlov - 2019-11-06

    Latest 7-Zip 19.02 works OK with such archives:

    Characteristics = Unsorted_CD
    

    The unsorted Central Directory (Unsorted_CD) is probably the problem with previous 7-Zip versions.
    If you store the Central Directory in the same order as the local headers, then old 7-Zip will also work with such archives.

     

    Last edit: Igor Pavlov 2019-11-06
  • Michael K Saucedo

    ahh, the alpha release? Didn't think to check that. Thanks

     
  • Michael K Saucedo

    Found a "bug" (or at least a "weird behavior") in the alpha. I created the same two zips, except in the second one I simply deleted Client.json. IZArc shows the file as gone, yet 7-Zip shows it as present. It seems 7-Zip is using the local file headers to know what's present, when it should be exclusively using the CDR (central directory record). Any reason for this? I'd hate to switch to IZArc.

     
  • Igor Pavlov

    Igor Pavlov - 2019-11-07

    Local headers are more important than Central headers.
    7-Zip reads Local headers only if it sees contradictions between first file of archive and central header.
    That is your case.

     
    • Michael K Saucedo

      So you're saying I might be able to work around it if I make a constant first occurring local File header (plus its payload), and then also always have this the first entry in the CDR?

       
      • Igor Pavlov

        Igor Pavlov - 2019-11-08

        7-Zip works so:
        It reads first local header and central directory.
        Then it compares them.
        If they are OK, then it shows central directory.
        If they are not OK, then it reads all local headers.

        Latest 7-Zip 19.02 allows unsorted central directory.
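The steps above can be sketched in Python as follows. This is only an illustration of the behavior as described in this post, not 7-Zip's actual code. In particular, the consistency test (some central directory entry must point at the first local header and carry the same name) and the naive signature scan are assumptions, and all function names are hypothetical.

```python
import struct

LFH_SIG = b"PK\x03\x04"   # local file header
EOCD_SIG = b"PK\x05\x06"  # end of central directory record


def central_entries(data):
    """Return [(name, local_header_offset)] from the central directory."""
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("end of central directory record not found")
    total, _cd_size, pos = struct.unpack_from("<HII", data, eocd + 10)
    entries = []
    for _ in range(total):
        nlen, elen, clen = struct.unpack_from("<HHH", data, pos + 28)
        (offset,) = struct.unpack_from("<I", data, pos + 42)
        name = data[pos + 46:pos + 46 + nlen].decode("utf-8", "replace")
        entries.append((name, offset))
        pos += 46 + nlen + elen + clen
    return entries


def local_name(data, offset):
    """Return the file name in the local header at `offset`, or None."""
    if data[offset:offset + 4] != LFH_SIG:
        return None
    (nlen,) = struct.unpack_from("<H", data, offset + 26)
    return data[offset + 30:offset + 30 + nlen].decode("utf-8", "replace")


def scan_local_names(data):
    """Naive fallback: list every local header found by signature search."""
    names = []
    pos = data.find(LFH_SIG)
    while pos >= 0:
        names.append(local_name(data, pos))
        pos = data.find(LFH_SIG, pos + 4)
    return names


def list_archive(data):
    """Return (source, names), where source is 'central' or 'local'."""
    entries = central_entries(data)
    first = data.find(LFH_SIG)
    first_name = local_name(data, first) if first >= 0 else None
    # Assumed comparison: the first local header must be described by
    # some central directory entry with the same offset and name.
    consistent = any(off == first and name == first_name
                     for name, off in entries)
    if consistent:
        return "central", [name for name, _ in entries]
    return "local", scan_local_names(data)
```

Under these assumptions, an archive whose first file was removed from the central directory but left in place on disk falls into the second branch, and the removed file shows up again in the listing. That would match the behavior with Client.json reported earlier in this thread.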

         
        • Michael K Saucedo

          OK, it sounds like with some code mods on our pak-writing side we might get around that behavior.
          The last issue I see is with 0-byte files. Yes, I agree, they shouldn't be there, but 7-Zip gives a data error on them, whereas others (IZArc) pass.

           

          Last edit: Michael K Saucedo 2019-11-08
          • Igor Pavlov

            Igor Pavlov - 2019-11-09

            1) Check with other zip tools also.
            2) Try to create the smallest example of such an archive. Maybe you can remove other files from the archive.

             
