
#1431 7-Zip can't extract files in InstallShield 2/3 SFX ZIP archives

open
nobody
None
5
2014-08-27
2014-08-21
quanta
No

When trying to extract some InstallShield 2/3 self-extracting archives (e.g. EAX2DEMO.EXE[1]), 7-Zip reports that it cannot open the file as an archive. After unpacking the file with PE Explorer and processing the resulting .EOF file, 7-Zip still cannot open the file as an archive. In comparison, other archivers give the following responses:

  • PKUNZIP 2.50 for DOS shows that the installer package contains a .ZIP archive with a file named INST32I.EX and is able to extract the contents without error. PKUNZIP's test feature does not report any error either.
  • WinRAR is able to extract 1 file named readme.txt without error.
  • SecureZIP 9 reports 1 error/warning and is unable to process the archive any further. SecureZIP does not report the error type.
  • WinZip 11 reports that the archive's start of central directory cannot be found and is unable to process the archive. It also claims Zip file corruption and lists a file transfer error as a possible cause. After the first error message, WinZip claims the file is not a Zip archive.

Opening the installer package with a hex editor shows that the package contains 11 .ZIP archives with 1 file each, but PKUNZIP 2.50 only finds the first archive, for a total of 1 file, while WinRAR only finds the last archive, for a total of 1 file. After manually extracting all the .ZIP archives inside the installer package, 7-Zip still reports the same error message for the extracted .ZIP archives as before. Other archivers give the following responses:

  • PKUNZIP 2.50 for DOS shows that each .ZIP archive contains 1 file, and is able to extract the contents without error. PKUNZIP's test feature does not report any error either.
  • WinRAR is able to extract the files inside the .ZIP archives without error.
  • SecureZIP 9: When testing a .ZIP archive, SecureZIP reports failure for the file in the archive, and incorrectly lists 0 files as tested. When trying to extract the file inside the archive, SecureZIP fails to extract it, and reports that it could not find the segment of the archive containing the file.
  • WinZip 11: When testing the .ZIP archive or trying to extract the file contained in it, WinZip skips the file and returns the error message 'The general purpose flags stored in the local header for this file are not the same as the general purpose flags stored in the central header'.

To verify the true cause(s) of the errors, the .ZIP archives extracted from the installer package were checked with SecureZIP 9. SecureZIP 9's properties dialogue shows that SecureZIP found the central directory of the file, but not the local header of the file. PKUNZIP 2.50 for DOS only shows the contents of the central directory for any archive, so its results cannot be trusted.

To find the definite cause(s), the .ZIP archives were opened with a hex editor. Following the .ZIP File Format Specification shows that the general purpose bit flag of the file is set to 0002h (bit 1 set) in the local header, but to 0000h in the central directory. After manually editing the general purpose bit flag with a hex editor, WinZip is able to extract and test the archive contents without error, but the error messages (if any) of the other archivers, including 7-Zip, remain the same.

Further investigation shows that within each .ZIP archive, the 'disk number start' field of the file record in the central directory is set to 0001h. Setting the field to 0000h no longer triggers the error in 7-Zip and SecureZIP, even after undoing the change to the general purpose bit flag (except that in WinZip the general purpose flags mismatch error returns). To confirm that this field is the culprit, it was then set to 0002h: the errors in SecureZIP 9 and 7-Zip return, while WinZip 11 cannot open the archive and claims it did not find the end-of-central-dir signature at the end of the central directory.

The proper values for the disk number start field are not fully documented in the .ZIP File Format Specification (including the Info-ZIP annotated version), which only explicitly defines the value 65535 for archives in ZIP64 format. The tested archives are not in ZIP64 format, so that value does not apply.
However, each extracted .ZIP archive contains an end of central directory record, which includes 'number of this disk' field with value of 0000h, and 'number of the disk with the start of the central directory' field with value of 0000h. Therefore, by following those two field values in the end of central directory record, 7-Zip should not be able to locate the local header of the file specified in the central directory within the same archive.
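
The comparison between the end of central directory record and the per-file 'disk number start' field can be sketched in a few lines of Python. This is only an illustration, not 7-Zip code: the function name `disk_number_conflicts` is hypothetical, and it assumes `data` holds one standalone .ZIP archive (already cut out of the installer) that is not in ZIP64 format.

```python
import struct

EOCD_SIG = b"PK\x05\x06"  # end of central directory record
CDH_SIG = b"PK\x01\x02"   # central directory file header

def disk_number_conflicts(data: bytes):
    """Return (name, disk_start, this_disk) for entries that claim another disk."""
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("end of central directory record not found")
    # Fields after the signature: number of this disk, disk with start of
    # central directory, entries on this disk, total entries, size, offset.
    this_disk, _cd_disk, _, total, _, cd_offset = struct.unpack_from(
        "<HHHHII", data, eocd + 4)
    conflicts = []
    pos = cd_offset
    for _ in range(total):
        if data[pos:pos + 4] != CDH_SIG:
            raise ValueError("bad central directory header")
        # Name/extra/comment lengths sit at offset 28, disk number start at 34.
        name_len, extra_len, comment_len, disk_start = struct.unpack_from(
            "<HHHH", data, pos + 28)
        name = data[pos + 46:pos + 46 + name_len].decode("cp437")
        if disk_start != this_disk:
            conflicts.append((name, disk_start, this_disk))
        pos += 46 + name_len + extra_len + comment_len
    return conflicts
```

For the archives described here, such a check would be expected to list the single file of each archive with a disk number start of 1 against a 'number of this disk' of 0.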

Other than the inconsistent general purpose bit flag values between headers and the wrong disk number start field values (wrong according to the disk number fields in the end of central directory records), the .ZIP archives contain no errors in the local headers or central directories, no errors in the extracted archive contents (before and after patching the .ZIP archives), and no errors in the end of central directory records.

Suggested fixes for handling the above errors are as follows:

-When opening a file:
--If the file is an executable, the executable reader module should always be loaded, and:
---If it contains contents that indicate exactly 1 archive exists in the file, it is treated as a self-extracting archive, and the file should be opened with the associated archive viewer.
---If it contains contents that indicate more than 1 archive exists in the file, it is treated as a self-extracting archive, and each archive is treated as a folder inside the file in the archive viewer. When opening an embedded archive, do not assume all archives have the same file format, and load the proper archive viewer for each archive type.
---Otherwise, open the file with the generic executable reader.
--Try to identify whether the file contains an archive, and:
---If it contains contents that indicate exactly 1 archive exists in the file, it is treated as an archive, and the file should be opened with the associated archive viewer.
---If it contains contents that indicate more than 1 archive exists in the file, each archive is treated as a folder inside the file in the archive viewer. When opening an embedded archive, do not assume all archives have the same file format, and load the proper archive viewer for each archive type.
---Otherwise, return a message that the file is not an archive.
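
As a rough illustration of the "more than 1 archive" case, the Python sketch below scans a file for every end of central directory record and derives where each embedded .ZIP archive starts and ends. The function name `find_embedded_zips` is hypothetical, and the sketch assumes that each central directory sits directly before its end record, that offsets are relative to the start of the embedded archive, and that no archive uses ZIP64.

```python
import struct

EOCD_SIG = b"PK\x05\x06"  # end of central directory record
CDH_SIG = b"PK\x01\x02"   # central directory file header

def find_embedded_zips(data: bytes):
    """Return (start, end) byte spans of .ZIP archives found inside data."""
    spans = []
    pos = data.find(EOCD_SIG)
    while pos >= 0:
        if pos + 22 <= len(data):
            # Central directory size, its offset, and the comment length.
            cd_size, cd_offset, comment_len = struct.unpack_from(
                "<IIH", data, pos + 12)
            cd_pos = pos - cd_size              # where the directory begins
            start = cd_pos - cd_offset          # where the archive begins
            end = pos + 22 + comment_len
            looks_valid = cd_size == 0 or data[cd_pos:cd_pos + 4] == CDH_SIG
            if start >= 0 and end <= len(data) and looks_valid:
                spans.append((start, end))
        pos = data.find(EOCD_SIG, pos + 4)
    return spans
```

A viewer could present each returned span as one folder. For the installer in this report, 11 spans would be expected if the assumptions above hold for its archives.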

When interpreting a .ZIP archive, 7-Zip should read both the local headers and the central directory to catch inconsistent header values. In this case, different general purpose bit flag values are set in the local header and the central directory entry of the file in the archive. Although the general purpose bit flag discrepancy does not prevent proper processing of the archives in this case, it can prevent other archive readers (e.g. WinZip 11) from processing them, so a warning should be issued without preventing archive extraction.
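
A minimal sketch of such a non-fatal check, in Python, is shown below. It is not how 7-Zip is implemented: the function name `flag_mismatch_warnings` and the wording of the warnings are made up for illustration, and `data` is assumed to be one standalone, non-ZIP64 .ZIP archive.

```python
import struct

EOCD_SIG = b"PK\x05\x06"  # end of central directory record
CDH_SIG = b"PK\x01\x02"   # central directory file header
LFH_SIG = b"PK\x03\x04"   # local file header

def flag_mismatch_warnings(data: bytes):
    """Return warning strings; extraction is never blocked by this check."""
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("end of central directory record not found")
    total, _, cd_offset = struct.unpack_from("<HII", data, eocd + 10)
    warnings = []
    pos = cd_offset
    for _ in range(total):
        if data[pos:pos + 4] != CDH_SIG:
            raise ValueError("bad central directory header")
        (central_flags,) = struct.unpack_from("<H", data, pos + 8)
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", data, pos + 28)
        (local_offset,) = struct.unpack_from("<I", data, pos + 42)
        name = data[pos + 46:pos + 46 + name_len].decode("cp437")
        if data[local_offset:local_offset + 4] != LFH_SIG:
            warnings.append(f"{name}: no local header at offset {local_offset}")
        else:
            # The flags are at offset 6 in the local file header.
            (local_flags,) = struct.unpack_from("<H", data, local_offset + 6)
            if local_flags != central_flags:
                warnings.append(
                    f"{name}: general purpose flags differ "
                    f"(local {local_flags:04X}h, central {central_flags:04X}h)")
        pos += 46 + name_len + extra_len + comment_len
    return warnings
```

Returning a list instead of raising keeps the mismatch informational, so the caller can show the warnings and still extract the file.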

In addition, 7-Zip needs the ability to extract contents from .ZIP archives that have wrong or different disk numbers, either for the archives themselves or for individual files specified in the central directories. Options include ignoring the disk number fields in the central directories, or choosing a different file that may have the matching disk number. Whatever the options are, end users should be allowed to make the choice manually.

For a more permanent solution, 7-Zip needs to include more comprehensive and interactive error handlers to process archives that may be corrupted or have wrong headers. The interactive approach allows the source of possible errors to be traced more efficiently, which can lead to improved support for archive types and fewer wrong error reports. If one particular choice does not remedy the error, the user should be offered a chance to retry with other choices, skip the affected section if possible, or abort further processing. The comprehensive approach means that instead of just stating errors, 7-Zip also lists the reasoning that led it to decide which types of errors exist in a file. In the case of mismatching headers, 7-Zip lists the values of both headers and suggests valid value range(s), which can be modified temporarily if the user chooses to; if there is a different disk number, the user can see which other files 7-Zip needs in order to further process split or spanned archives.
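
The "ignore the disk number fields" option could look roughly like the Python sketch below, which mirrors the manual hex edit described earlier but works on an in-memory copy. The function name `ignore_disk_numbers` is hypothetical. It is assumed to be called only after the user has chosen this option, on one standalone, non-ZIP64 archive whose end of central directory record says disk 0.

```python
import struct

EOCD_SIG = b"PK\x05\x06"  # end of central directory record
CDH_SIG = b"PK\x01\x02"   # central directory file header

def ignore_disk_numbers(data: bytes) -> bytes:
    """Return a copy in which every entry's disk number start is reset to 0."""
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("end of central directory record not found")
    this_disk, cd_disk, _, total, _, cd_offset = struct.unpack_from(
        "<HHHHII", data, eocd + 4)
    if this_disk != 0 or cd_disk != 0:
        # The archive itself claims to be split, so leave it untouched.
        raise ValueError("archive claims to span several disks")
    patched = bytearray(data)  # the original bytes are never modified
    pos = cd_offset
    for _ in range(total):
        if patched[pos:pos + 4] != CDH_SIG:
            raise ValueError("bad central directory header")
        name_len, extra_len, comment_len = struct.unpack_from(
            "<HHH", patched, pos + 28)
        struct.pack_into("<H", patched, pos + 34, 0)  # disk number start
        pos += 46 + name_len + extra_len + comment_len
    return bytes(patched)
```

Working on a copy keeps the choice temporary: if it does not help, the user can retry with another option against the unmodified file.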

As for support of InstallShield 2/3 SFX ZIP archives, source code for such an extractor is already available in STIX[2] from Veit Kannegieser (in German).

[1] ftp://pds.tcrc.edu.tw/Hardware/multimedia/Creative/SB_Live/LiveWare3.0/eax2demo.exe
[2] http://kannegieser.net/veit/quelle/stix_src.arj
