7zip 21.07 and 19.00 process same zip archive differently.

Ariman
2022-02-06
2022-02-12
  • Ariman

    Ariman - 2022-02-06

    Hi.

    I have a zip file that seems to be somewhat non-standard. I'm no expert on the format, but it looks like its entries carry the Unix flag while using DOS backslashes as the path delimiter.

    7zip 19.00 processes this file without any issues:

    7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21
    
    Scanning the drive for archives:
    1 file, 162127 bytes (159 KiB)
    
    Listing archive: i4j.zip
    
    --
    Path = i4j.zip
    Type = zip
    Physical Size = 162127
    
       Date      Time    Attr         Size   Compressed  Name
    ------------------- ----- ------------ ------------  ------------------------
    2010-06-08 09:44:16 D....            0            0  samples\
    2010-06-08 09:44:16 D....            0            0  samples\customCode\
    2010-06-08 09:44:16 D....            0            0  samples\customCode\media\
    2010-06-08 09:44:16 .....           60           60  samples\customCode\media\README.txt
    2010-06-08 09:44:16 D....            0            0  samples\customCode\src\
    2010-06-08 09:44:16 .....         2213         2213  samples\customCode\src\SampleScreen.java
    2010-06-08 09:44:16 .....         2636         2636  samples\customCode\src\SampleActionBeanInfo.java
    

    but 21.07 cannot handle it properly and does not recognize the path delimiters:

    7-Zip 21.07 (x64) : Copyright (c) 1999-2021 Igor Pavlov : 2021-12-26
    
    Scanning the drive for archives:
    1 file, 162127 bytes (159 KiB)
    
    Listing archive: i4j.zip
    
    --
    Path = i4j.zip
    Type = zip
    Physical Size = 162127
    
       Date      Time    Attr         Size   Compressed  Name
    ------------------- ----- ------------ ------------  ------------------------
    2010-06-08 09:44:16 D....            0            0  samples_
    2010-06-08 09:44:16 D....            0            0  samples_customCode_
    2010-06-08 09:44:16 D....            0            0  samples_customCode_media_
    2010-06-08 09:44:16 .....           60           60  samples_customCode_media_README.txt
    2010-06-08 09:44:16 D....            0            0  samples_customCode_src_
    2010-06-08 09:44:16 .....         2213         2213  samples_customCode_src_SampleScreen.java
    2010-06-08 09:44:16 .....         2636         2636  samples_customCode_src_SampleActionBeanInfo.java
    

    Is it possible to revert the behavior back to how it was in 19.00?
    I've attached a sample file in case it is needed.

     
  • Vladimir Surguchev

    I think that is the same problem -- https://sourceforge.net/p/sevenzip/bugs/2312/
    At least an adapted arclite (https://forum.farmanager.com/viewtopic.php?p=169196#p169196)
    opens your file without problem with 7z.dll 21.07.

     
  • Igor Pavlov

    Igor Pavlov - 2022-02-07

    Linux supports backslashes in names.
    So it's allowed to create a zip archive with such files in Linux.
    And 7-Zip supports extracting such files in Linux and Windows.
    When we extract such a file in Windows, we don't want to create a wrong directory structure.
    It's a rare case where a backslash in a zip with the Linux mark represents a path separator.

     

    Last edit: Igor Pavlov 2022-02-07
  • Ariman

    Ariman - 2022-02-07

    As you can see in the listing, 21.07 does not recognize directories in the file path. It treats every entry as a file with one long name that contains the entire path. 7zip 19.00 and WinRAR process such archives properly.

    This file was created by Install4j installer. It stores installer content in similar zips.

     
  • Igor Pavlov

    Igor Pavlov - 2022-02-07

    That zip was created by incorrect code.
    So that zip archive looks like a correct archive created in Linux that contains backslashes in file names.
    If we extract such a file in Linux, it will not create a dir folder. It creates just one file named dir\file. So 7-Zip in Windows also creates just one file, without a dir folder, for such an entry. So 7-Zip is consistent on any system that is used for extraction.

    Write to the developers of the zip software that was used to create that zip archive and ask them to fix their code.

     

    Last edit: Igor Pavlov 2022-02-07
  • Ariman

    Ariman - 2022-02-07

    On Linux, backslash is a valid character in file names, so unzip has no reason not to use it as such. But on Windows it is a path delimiter and cannot be used in names, so it is more logical to treat it as a path delimiter there. Most Windows software treats both slashes as equivalent delimiters.
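
The two naming rules can be compared without touching any archive. A minimal sketch using Python's standard `posixpath` and `ntpath` modules, which apply POSIX and Windows path rules on any host; the entry name is copied from the listing at the top of the thread:

```python
import ntpath
import posixpath

# Entry name as stored in the archive from the first post.
name = "samples\\customCode\\media\\README.txt"

# POSIX rules: backslash is an ordinary character, so there is no directory part.
posix_parts = posixpath.split(name)

# Windows rules: backslash separates components, and so does a forward slash.
windows_parts = ntpath.split(name)
windows_parts_fwd = ntpath.split(name.replace("\\", "/"))

print(posix_parts)
print(windows_parts)
print(windows_parts_fwd)
```

Under POSIX rules the whole string is one file name; under Windows rules the last component is `README.txt` whichever slash is used.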

    Besides, if you insist on using the path as the file name, why does 7zip replace the backslash with some strange Unicode symbol? In console output it looks like an underscore, but it is not one. See the attached image. The Consolas font does not even have a glyph for it. It doesn't look like an intended replacement but more like a bug.

     
  • Igor Pavlov

    Igor Pavlov - 2022-02-07

    Think about this situation:
    a user in Linux creates a file like this:

    name1\name2.txt
    

    Then the user compresses that file to zip and uploads the zip to a site.
    Then two users extract that zip archive:
    user-1 extracts the zip in Linux,
    user-2 extracts the zip in Windows.
    User-1 in Linux correctly sees the original file name1\name2.txt.
    But the user in Windows can see an unexpected directory name1.
    So the Windows user will get the directory name1 at extraction, while the original source has no directories at all.
    To avoid such inconsistency, 7-Zip replaces the backslash character with the Unicode WSL replacement character for backslash.
    And in WSL you will see the original \ character.
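
The replacement described above can be sketched in a few lines of Python. This is an illustration only, not 7-Zip's code: it assumes the replacement character is U+F05C, following the WSL convention of mapping a character that Windows forbids in file names to the private-use code point 0xF000 plus its ASCII code. The function names are hypothetical.

```python
# Assumption: WSL-style mapping into the Unicode private-use area,
# 0xF000 + ASCII code, so '\\' (0x5C) becomes U+F05C.
WSL_OFFSET = 0xF000
BACKSLASH_REPLACEMENT = chr(WSL_OFFSET + ord("\\"))

def to_windows_name(unix_name: str) -> str:
    """Keep a Unix file name as a single Windows file name by hiding its backslashes."""
    return unix_name.replace("\\", BACKSLASH_REPLACEMENT)

def to_unix_name(windows_name: str) -> str:
    """Reverse the mapping, the way a WSL view of the file would show it."""
    return windows_name.replace(BACKSLASH_REPLACEMENT, "\\")

stored = to_windows_name("name1\\name2.txt")
restored = to_unix_name(stored)
print(ascii(stored))
print(restored)
```

A private-use code point has no standard glyph, which would be consistent with the missing glyph in Consolas mentioned earlier in the thread.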

    But such incorrect zip archives are rare cases.
    And the best solution is to fix the wrong software that was used to create such incorrect zip files.
    Think of it this way: 7-Zip can show that an archive has errors, and that allows other zip software developers to keep better compatibility with zip standards.

     

    Last edit: Igor Pavlov 2022-02-07
  • Ariman

    Ariman - 2022-02-07

    think about such situation:
    user in linux creates such file:

    If somebody makes an archive like this manually, I don't think they intend it to be multi-platform.

    But such incorrect zip archives are rare cases.
    And best solution to solve the problem is to fix wrong software that was used to create such incorrect zip files.

    The archive that I've uploaded is not the result of a bug in software. It was deliberately created this way.

    As I've mentioned earlier, it is from an Install4j installer. Install4j packs installable files in similar zips and attaches them to the bootstrapper. Since this installer was made for Windows, all paths inside have backslashes as the path delimiter (I don't know why they set the Unix flag in the zip).

     
  • Vladimir Surguchev

    That is definitely a bad archive, but:
    1) such archives are not a rare case; archives that contain a real '\' as part of a file name are much rarer.
    2) there are a lot of such archives already, so the new behaviour is not the best solution.
    3) all the problem archives that I have seen have HostOS=MSDOS in LocalFileHeader ('PK\x03\x04') and HostOS=Unix in CentralFileHeader ('PK\x01\x02')

    Maybe it makes sense to keep the Unix purism only for files where LocalFileHeader has HostOS=Unix?
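
The mismatch described in point 3 can be checked with a short Python sketch. It assumes that the host OS of the central header is the high byte of "version made by" (exposed by `zipfile` as `create_system`) and that the host OS of the local header is the high byte of "version needed to extract". `host_os_report` is a hypothetical helper, and the sample archive is built in memory rather than taken from the attached file.

```python
import io
import struct
import zipfile

def host_os_report(f):
    """Return (stored name, central host OS, local host OS) for each entry.

    `f` is a seekable binary file object. Host codes: 0 = MS-DOS/FAT, 3 = Unix.
    """
    with zipfile.ZipFile(f) as zf:
        infos = zf.infolist()
    rows = []
    for info in infos:
        # Local header layout: 4-byte signature, then 2-byte "version needed".
        f.seek(info.header_offset)
        signature, version_needed = struct.unpack("<4sH", f.read(6))
        if signature != b"PK\x03\x04":
            raise ValueError("local file header not found")
        rows.append((info.orig_filename, info.create_system, version_needed >> 8))
    return rows

# Sample archive: Python's zipfile writes the Unix mark (3) only to the
# central header and leaves the high byte of the local field at 0.
sample = io.BytesIO()
entry = zipfile.ZipInfo("placeholder")
entry.filename = "samples\\README.txt"  # set after construction to keep the backslash
entry.create_system = 3
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr(entry, b"hello")

report = host_os_report(sample)
for name, central, local in report:
    print(name, "central:", central, "local:", local)
```

An entry that reports 3 in the central header and 0 in the local header matches the pattern described in point 3.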

     

    Last edit: Vladimir Surguchev 2022-02-08
  • Igor Pavlov

    Igor Pavlov - 2022-02-08

    Archive that I've uploaded is not a result of bug in software. It was deliberately created this way.

    It's bug in creation software.
    ZIP specification:

           4.4.17.1 All slashes MUST be forward slashes '/' as opposed to
           backwards slashes '\' for compatibility with Amiga
           and UNIX file systems etc.
    

    The problem is that some zip software ignores some specification rules. And the developers of the buggy software do not know that they create incorrect archives.
    Now 7-Zip highlights that problem, so other zip developers can see the problem cases and fix their software.
    That is why strict checks in 7-Zip are good. They allow other zip software developers to make more correct software.
    Please report that problem to the Install4j developers, if Install4j was used to create that zip. And check the latest version of that software. Maybe they have already fixed it.

     
  • Ariman

    Ariman - 2022-02-08

    There is nothing to report. It is not a bug. They do it on purpose. Install4j does not intend for these zips to be extracted the regular way. They are part of the installer blob.
    I've created a small tool to extract the .zips from the installer .exe file and wanted to use 7zip to open the zips and extract individual files. But I was unable to do so with the new version.

    I don't agree with you about following the specification strictly in this case. I doubt there are many files where a backslash is really intended to be part of the file name. But your code, your rules; there is no point in arguing more about it. Thanks for the response.
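
For a tool like the one described above, one possible workaround is to rewrite each extracted zip with forward slashes before handing it to 7-Zip. A minimal Python sketch under that assumption; `normalize_separators` is a hypothetical helper, `orig_filename` is the stored name as kept by CPython's `ZipInfo`, and the sample archive is built in memory.

```python
import io
import zipfile

def normalize_separators(src, dst):
    """Copy the archive in `src` to `dst`, storing every backslash in a name as '/'."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            fixed = zipfile.ZipInfo(info.orig_filename.replace("\\", "/"), info.date_time)
            fixed.compress_type = info.compress_type
            fixed.external_attr = info.external_attr
            zout.writestr(fixed, zin.read(info))

# In-memory sample that mimics the listing: backslashes as separators.
broken = io.BytesIO()
with zipfile.ZipFile(broken, "w") as zf:
    for stored_name, data in (("samples\\", b""), ("samples\\README.txt", b"hello")):
        entry = zipfile.ZipInfo("placeholder", (2010, 6, 8, 9, 44, 16))
        entry.filename = stored_name  # set after construction to keep the backslash
        zf.writestr(entry, data)

repaired = io.BytesIO()
normalize_separators(broken, repaired)
with zipfile.ZipFile(repaired) as zf:
    repaired_names = zf.namelist()
    repaired_data = zf.read("samples/README.txt")
print(repaired_names)
```

This treats every backslash as a separator, so it would also split a genuine Unix name such as name1\name2.txt. It is only appropriate when the archive is known to come from a tool that writes Windows-style paths.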

     
  • Igor Pavlov

    Igor Pavlov - 2022-02-08

    So you think that Install4j breaks zip specification rules by design?
    I don't think so.
    I suppose that the Install4j developers thought that their solution was correct, because other zip software doesn't show any error message.
    Now that 7-Zip highlights that problem, they can change their mind and make the code more compatible with zip specification rules.

    If some software uses a format (like ZIP) for internal purposes (internal software for encoding and decoding), its developers can change any header field of that format; for example, they can swap the order of the compressed_size and uncompressed_size fields. It will break compatibility. But most developers do not do it. They try to follow standards. And when someone shows that they do not follow a standard, they fix the problem code.

     
  • Igor Pavlov

    Igor Pavlov - 2022-02-08

    The questions for the Install4j developers are these:
    Why do they use backslash, while all other zip programs use slash as the directory separator?
    Why does their code break zip specification rules and create archives that differ from the zip archives created by all other zip programs?
    What is the reason?
    Why didn't the same reasons apply in any other zip program?

    I suppose that they just didn't know that there is a problem in their code. So once they know about the problem, they can just fix it.

     

    Last edit: Igor Pavlov 2022-02-08
  • Ariman

    Ariman - 2022-02-08

    But it is not a zip program. They don't create zip files for user consumption. It is an installer creation tool, similar to NSIS and Inno.
    An Install4j installer file is not a standard zip sfx. They pack installable files in one or more zips, then compress the zips with lzma and attach them as a PE overlay. Viewed from the outside, the installer .exe file shows no signs of the zip format.

    I guess they took the zip format without any changes for simplicity. But I doubt that they had any intent to fix compatibility issues with standard zip tools.

    I think they chose backslashes here on purpose because this particular zip is from an installer made for Windows. Install4j is a multi-platform installer; they pack programs for all platforms. My guess is that they use OS-dependent path delimiters in each case.

     
    • MichaIng

      MichaIng - 2022-02-12

      But it is not a zip program. They don't create zip files for user consumption. It is installer creation tool, similar to NSIS and Inno.

      If it's not a zip file, why do they give it a zip extension, and why do you want 7-Zip to be able to handle it as a zip file? There is a reason why standards exist: to be robust and failsafe across different platforms. The more you bend a standard, the more it loses its purpose. There are other cases where software uses a standard archive format for its own purposes, like OVA, which is actually a tar archive, but it is intentionally given a different extension so that it is not bound to the standard, since it is intended to be imported by virtualizers only. If Install4j intentionally creates non-standard zip archives, it could give them a different file extension to make clear that they are not intended to be opened by archivers, which again shows that 7-Zip does everything exactly right here.

       
      • Ariman

        Ariman - 2022-02-12

        If it's not a zip file, why do they give it a zip extension

        It is not a standalone file, so it does not have an extension. If you attach an sfx module to a standard zip file, it will lose its extension, but it is still a zip file.

        why do you want 7-Zip to be able to handle it as zip file

        Because, apart from the backslashes, it is a zip file.

         
  • Igor Pavlov

    Igor Pavlov - 2022-02-08

    Try to contact them and report that problem.

     
