Problem: With the ZIP format, some tools let you create an archive in unusual ways, such as leaving extra unused space throughout the files in the archive. When running a 7-Zip "t" integrity check, these zips fail (and they also fail to extract properly), whereas other tools pass them and extract them just fine.
Solution: A less restrictive integrity check option would be fantastic, and it would leave the existing functionality of "t" untouched. Perhaps "tl" ('l' for less).
Details:
When distributing game data to thousands of clients, we wish to minimize the bits that transfer to them. Steam has tools to facilitate this, although additional compression on our side is desirable, as we wish to keep certain assets compressed since this leads to faster load and stream times during gameplay. It isn't feasible to keep all assets uncompressed as SteamPipe would recommend.
So we intend to keep our assets packed and compressed in ZIP files. Yet we're working on new mechanisms where the zip archives only change what's minimally needed between two builds. This sometimes means leaving unused space after a file in an archive. All header data is still valid though: size fields are still correct and offsets still point to the valid data; there are just extra bytes present (in the case of shrinking an asset).
The 'tl' less restrictive integrity check would simply validate integrity based on information provided in the headers, e.g. sizes and offsets. It would NOT do extra checks such as "does the entire file size add up, more or less, to what the headers indicate?" That would still be left to "t".
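To illustrate the kind of header-driven check meant here, below is a minimal sketch using Python's standard `zipfile` module. It is only an analogy, not a proposal for how 7-Zip should implement "tl": `testzip()` walks the central directory, follows each entry's local header offset, and verifies the CRC-32 of the data, and as far as I know it does not care about unused bytes between entries. The helper name `lenient_test` and the sample file contents are made up for the example.

```python
import io
import zipfile


def lenient_test(data: bytes):
    """Return the name of the first entry whose CRC check fails, or None."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # testzip() reads every entry listed in the central directory
        # and checks its CRC-32 against the stored value.
        return zf.testzip()


# Build a small, ordinary archive in memory and test it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("Client.json", '{"id": 1}')
    zf.writestr("Other.txt", "hello")

result = lenient_test(buf.getvalue())
print("first bad entry:", result)
```

A result of `None` means every entry named in the central directory was read and matched its CRC.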
Please create an example zip that is good for you but fails with 7-Zip.
Then we can discuss it.
Of course. The one that says "extraSpace" is the one 7-zip reports errors on, yet other zip programs report 'all OK' (such as IZArc, and win10 built-in zip).
The only difference in the extraSpace one is that a file called Client.json has been truncated from 2 KB to 1 byte (yet I keep all 2 KB there and just update the size fields).
Last edit: Michael K Saucedo 2019-11-06
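For anyone who wants to reproduce an archive of this shape, here is a rough sketch that writes one by hand. It is not the actual pak writer: it assumes stored (uncompressed) entries for simplicity, and the second file name, the payloads, and the helper name `build_zip_with_slack` are invented for the example. Client.json holds 1 byte of data followed by 2047 unused bytes that no header accounts for.

```python
import io
import struct
import zipfile
import zlib

DOS_DATE = 0x21  # 1980-01-01, an arbitrary valid DOS date


def build_zip_with_slack(entries):
    """Build a ZIP from (name, data, slack) tuples, storing data uncompressed.

    `slack` zero bytes are left after each entry's data; the size fields
    only describe the real data.
    """
    body = bytearray()
    central = bytearray()
    for name, data, slack in entries:
        offset = len(body)
        crc = zlib.crc32(data)
        # Local file header (30 bytes), then name, data, and unused slack.
        body += struct.pack("<IHHHHHIIIHH", 0x04034B50, 20, 0, 0, 0, DOS_DATE,
                            crc, len(data), len(data), len(name), 0)
        body += name + data + b"\0" * slack
        # Central directory header (46 bytes) pointing back at `offset`.
        central += struct.pack("<IHHHHHHIIIHHHHHII", 0x02014B50, 20, 20, 0, 0,
                               0, DOS_DATE, crc, len(data), len(data),
                               len(name), 0, 0, 0, 0, 0, offset)
        central += name
    # End of central directory record (22 bytes).
    end = struct.pack("<IHHHHIIH", 0x06054B50, 0, 0, len(entries),
                      len(entries), len(central), len(body), 0)
    return bytes(body + central + end)


archive = build_zip_with_slack([
    (b"Client.json", b"x", 2047),   # 1 byte of data, 2047 bytes of slack
    (b"Other.txt", b"hello", 0),
])

# Python's zipfile reads it back via the central directory.
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    first_bad = zf.testzip()
    client = zf.read("Client.json")
print(first_bad, client)
```

Writing `archive` to disk gives a small test file that can be fed to different zip tools for comparison.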
Latest 7-Zip 19.02 works OK with such archives.
Probably the unsorted Central Directory (Unsorted_CD) is the problem with previous 7-Zip versions.
If you store the Central Directory in the same order as the local headers, then old 7-Zip will also work with such archives.
Last edit: Igor Pavlov 2019-11-06
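A quick way to check whether an archive's central directory follows the order of the local headers is sketched below, again with Python's `zipfile`. The function name is made up; it relies on `infolist()` returning entries in central directory order and on `header_offset` being each entry's local header position.

```python
import io
import zipfile


def central_directory_is_sorted(data: bytes) -> bool:
    """True if central directory entries appear in local-header order."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        offsets = [info.header_offset for info in zf.infolist()]
    return offsets == sorted(offsets)


# An archive written by zipfile itself is sorted.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "first")
    zf.writestr("b.txt", "second")

is_sorted = central_directory_is_sorted(buf.getvalue())
print(is_sorted)
```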
ahh, the alpha release? Didn't think to check that. Thanks
Found a "bug" (or at least, a "weird behavior") in the alpha. Created the same two zips, except in the second one I simply deleted Client.json. IZArc shows the file as gone, yet 7-Zip shows it as present. It seems 7-Zip is using the local file header to know what's present, when it should be exclusively using the CDR. Any reason for this? I'd hate to switch to IZArc.
Local headers are more important than Central headers.
7-Zip reads Local headers only if it sees contradictions between first file of archive and central header.
That is your case.
So you're saying I might be able to work around it if I make a constant first occurring local File header (plus its payload), and then also always have this the first entry in the CDR?
7-Zip works like this:
It reads the first local header and the central directory.
Then it compares them.
If they are OK, then it shows central directory.
If they are not OK, then it reads all local headers.
Latest 7-Zip 19.02 allows unsorted central directory.
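The steps above can be sketched roughly as follows. This is not 7-Zip's code: the `Header` type, the function names, and especially what "compares them" means (here: the central directory must contain an entry identical to the first local header) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Header:
    name: str
    offset: int  # position of the local header in the archive
    size: int


def headers_agree(first_local: Header, central: List[Header]) -> bool:
    # Assumption: look up the central entry pointing at the first local
    # header (the central directory may be unsorted) and compare fields.
    match = next((c for c in central if c.offset == first_local.offset), None)
    return match is not None and match == first_local


def list_entries(first_local: Header, central: List[Header],
                 scan_local: Callable[[], List[Header]]) -> List[Header]:
    if headers_agree(first_local, central):
        return central       # consistent: show the central directory
    return scan_local()      # contradiction: read all local headers


local = [Header("Client.json", 0, 1), Header("Other.txt", 2089, 5)]

# Client.json removed from the central directory only: the first local
# header has no match, so all local headers are read and it reappears.
shown = [h.name for h in list_entries(local[0], [local[1]], lambda: local)]

# With a constant first entry kept in both places, the central directory
# is trusted and the deleted file stays hidden.
anchor = Header("Anchor", 0, 1)
local2 = [anchor, Header("Client.json", 37, 1), Header("Other.txt", 2126, 5)]
central2 = [anchor, local2[2]]
shown2 = [h.name for h in list_entries(anchor, central2, lambda: local2)]
print(shown, shown2)
```

The second case corresponds to the workaround suggested earlier in the thread: keep a fixed first local header and a matching entry in the central directory.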
OK, sounds like perhaps with some code mods on our pak writing side, we might get around that behavior.
The last issue I see is with 0-byte files. Yes, I agree, they shouldn't be there, but 7-Zip gives a data error on them, whereas others (IZArc) pass.
Last edit: Michael K Saucedo 2019-11-08
1) Check with other zip tools as well.
2) Try to create the smallest example of such an archive. Maybe you can remove other files from the archive.