#46 Support extraction from ZIP files if the "total number of disks" value in the "zip64 end of central dir locator" equals 0

open
nobody
None
5
2020-03-19
2020-03-17
No

I recently found out that p7zip is unable to extract large ZIP files that originate from Microsoft's OneDrive web client. Using the latest (16.02) version, extraction results in the following error:

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,4 CPUs Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (506E3),ASM,AES-NI)

Scanning the drive for archives:
1 file, 6154566547 bytes (5870 MiB)

Extracting archive: kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip

ERRORS:
Headers Error
Unconfirmed start of archive


WARNINGS:
There are data after the end of archive

--
Path = kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip
Type = zip
ERRORS:
Headers Error
Unconfirmed start of archive
WARNINGS:
There are data after the end of archive
Physical Size = 4330182775
Tail Size = 1824383772

ERROR: CRC Failed : kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f/afd3f61a-5e0e-11ea-ab97-40b0341fbf5f/08.wav

Sub items Errors: 1

Archives with Errors: 1

Warnings: 1

Open Errors: 1

Sub items Errors: 1

After some investigating, it turns out that the problem is caused by the “total number of disks” field in the “zip64 end of central dir locator” of these files. The OneDrive web client sets this value to 0 (zero), whereas p7zip expects a value of 1. After changing the offending values with a Hex editor, affected files extract normally in p7zip.
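The check and the hex-editor fix described above can be sketched in Python. This is a minimal illustration, not p7zip's logic: the helper names (`read_tail`, `find_zip64_locator`, `patch_total_disks`) are hypothetical, and the code assumes the locator sits directly before the last end-of-central-directory record, as the ZIP format lays it out. A ZIP comment that happens to contain the record signature would mislead the search.

```python
import os
import struct

EOCD_SIG = b"PK\x05\x06"     # end of central directory record
LOCATOR_SIG = b"PK\x06\x07"  # zip64 end of central dir locator
LOCATOR_FORMAT = "<4sIQI"    # signature, disk with zip64 EOCD, its offset, total number of disks
LOCATOR_SIZE = struct.calcsize(LOCATOR_FORMAT)  # 20 bytes


def read_tail(path, max_bytes=65536 + 22 + LOCATOR_SIZE):
    """Read the end of a file: enough for the EOCD record, a maximum-length comment and the locator."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - max_bytes))
        return f.read()


def find_zip64_locator(data):
    """Return (position, disk_with_eocd64, eocd64_offset, total_disks), or None if absent."""
    eocd_pos = data.rfind(EOCD_SIG)
    if eocd_pos < LOCATOR_SIZE:
        return None
    loc_pos = eocd_pos - LOCATOR_SIZE
    sig, disk, offset, total = struct.unpack_from(LOCATOR_FORMAT, data, loc_pos)
    if sig != LOCATOR_SIG:
        return None
    return loc_pos, disk, offset, total


def patch_total_disks(data, value=1):
    """Return a copy of data with the "total number of disks" field set to value."""
    found = find_zip64_locator(data)
    if found is None:
        return bytes(data)
    patched = bytearray(data)
    # The field is the last 4 bytes of the 20-byte locator.
    struct.pack_into("<I", patched, found[0] + 16, value)
    return bytes(patched)
```

Calling `find_zip64_locator(read_tail("some-archive.zip"))` on an affected file should report 0 as the last element of the tuple. The patch function works on an in-memory copy of the tail only; writing the corrected bytes back to the file is left out on purpose, so try it on a copy of the archive.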

I've posted a detailed report on the problem here:

https://www.bitsgalore.org/2020/03/11/does-microsoft-onedrive-export-large-ZIP-files-that-are-corrupt

Would it be possible to change the current behavior in order to allow such files to be extracted? If needed I can provide a sample file that demonstrates the problem.

BTW I'm fully aware the problem is primarily caused by faulty creator applications, but as these files appear to be widespread, it would be really helpful if OS tools like p7zip were able to deal with them.

Thanks in advance!

Discussion

  • Igor Pavlov

    Igor Pavlov - 2020-03-18

    7-Zip can try different ways to open a zip archive if there is an error in the headers.
    Please try the latest 7-Zip 20.00 alpha via WINE.

    You can also try to create a small example archive for debugging and upload it here. You can use some big files filled with zeros so that the archive compresses well.

     
    • Johan van der Knijff

      Hi Igor,

      Thanks for getting back to this. I gave the 20.00 alpha version a try in WINE, and it is able to deal with this file without any problems; see the output below:

      7-Zip 20.00 alpha (x64) : Copyright (c) 1999-2020 Igor Pavlov : 2020-02-06
      
      Scanning the drive for archives:
      1 file, 5368710074 bytes (5121 MiB)
      
      Extracting archive: onedrive-zip-test-zeros.zip
      --         
      Path = onedrive-zip-test-zeros.zip
      Type = zip
      Physical Size = 5368710074
      64-bit = +
      Characteristics = Local Central Zip64
      
      Everything is Ok                           
      
      Files: 5
      Size:       5368709120
      Compressed: 5368710074
      

      It would be really useful if the changes in the 20.00 alpha version could also be incorporated into this p7zip port.

      I've also created an openly-licensed test file, which you can get from this link:

      https://zenodo.org/record/3715394/files/onedrive-zip-test-zeros.zip?download=1

      Note that this is a 5 GB ZIP archive, even though the files inside it only contain null bytes. Apparently OneDrive only uses ZIP as a container format without actually compressing anything, which is a bit weird. (If I compress those files locally I end up with a 6 MB ZIP.)
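The size difference can be reproduced on a small scale with Python's standard `zipfile` module. This sketch assumes that OneDrive writes its entries with the "stored" (no compression) method, which the near-identical Size and Compressed figures above suggest but do not prove. The helper name `zip_size` and the 1 MiB payload are illustrative only.

```python
import io
import zipfile


def zip_size(payload, method):
    """Return the size in bytes of an in-memory ZIP holding payload as a single entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("zeros.bin", payload)
    return len(buf.getvalue())


payload = bytes(1024 * 1024)  # 1 MiB of null bytes

stored = zip_size(payload, zipfile.ZIP_STORED)      # container only, no compression
deflated = zip_size(payload, zipfile.ZIP_DEFLATED)  # ordinary Deflate compression

print("stored:  ", stored, "bytes")
print("deflated:", deflated, "bytes")
```

The stored archive comes out slightly larger than the payload because of the ZIP headers, while the deflated one shrinks to a small fraction of it.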

       
  • Igor Pavlov

    Igor Pavlov - 2020-03-18

    Please compress the big zip archive with 7-Zip and then upload the resulting small archive here.

     
    • Johan van der Knijff

      Sure, here it is!

       
      • Igor Pavlov

        Igor Pavlov - 2020-03-19

        Thanks!

        I'll change the 7-Zip code so that 7-Zip ignores that minor error.
        7-Zip will then open such archives faster than 20.00 alpha does.

         
        • Johan van der Knijff

          Fantastic, thanks for that!

           
  • Igor Pavlov

    Igor Pavlov - 2020-03-18
    Characteristics = Local Central Zip64
    

    is not 100% good.
    It means that the fast open operation from the Central Directory was aborted and 7-Zip used the slow open operation from the local headers.
    But it's still good that 7-Zip opens the archive fully.

     
