#67 Failure to properly handle archives with backward slashes under Linux

v1.0 (example)
closed-invalid
nobody
8
2022-04-15
2022-04-02
No

unzip doesn't properly handle archives with backward slashes under Linux.

Example archives:

https://github.com/felixrieseberg/macintosh.js/releases/download/v1.1.0/macintosh.js-win32-ia32-1.1.0.zip

https://sourceforge.net/projects/screenruler/files/v.0.8.1/ScreenRuler-v.0.8.1-Portable.zip/download

If you try to unpack any of them, you'll get funny results.

Discussion

  • Steven Schweda

    Steven Schweda - 2022-04-02

    unzip [...]

    As always, a program version number would be helpful, but it probably
    doesn't matter much in this case.

      unzip -v
    

    [...] doesn't properly handle archives with backward slashes under Linux.

    I just ran into this problem in a Ford/Microsoft SYNC software update
    kit a few days ago.

    If you try to unpack any of them, you'll get funny results.

    That depends on your sense of humor.

    One could argue about what "properly" means in this case. Such
    archives do not conform to the zip archive standard, which requires a
    forward slash ("/"), not a backward slash ("\") as the path separator.
    See section "4.4.17 file name" in:

      https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
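
For readers who want to check an archive against that rule, here is a minimal Python sketch (not part of UnZip) that lists member names containing a backslash. The function names `backslash_names` and `check_archive` are made up for illustration. It is only meaningful on a Unix-like system; as far as I know, Python's `zipfile` converts backslashes itself when reading on Windows.

```python
import zipfile


def backslash_names(names):
    """Return the member names that contain a backslash."""
    return [name for name in names if "\\" in name]


def check_archive(path):
    """List members of the zip archive at `path` whose names contain a backslash."""
    with zipfile.ZipFile(path) as archive:
        return backslash_names(archive.namelist())
```

An empty list from `check_archive("some.zip")` would mean no member name contains a backslash.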
    

    It would be interesting to know exactly how such non-compliant
    archives are created. (Our Zip program does it properly when it's used
    on Windows, for example.)

    The (Apple-supplied) Archive Utility app on Mac behaves the same way
    with such non-compliant archives. (Ford tells its victims with Macs to
    download and use a (free) third-party app, "The Unarchiver", which works
    for Ford's non-compliant software update kits. And to disable Safari's
    default behavior, which is to extract the contents of a zip archive
    automatically when one is downloaded, using built-in tools, which don't
    compensate for a non-compliant archive. It's a mess.)

    The current behavior of UnZip (or Apple's Archive Utility app) makes
    some sense, because "\" is a legal (albeit unpopular) character in a
    file name on a Unix-like file system, so it would be an error always to
    convert "\" to "/" on any Unix-like system. Both programs comply with
    the standard. The program which made these non-compliant archives did
    not.

    One possible solution would be to enhance UnZip to convert "\" to "/"
    in the (path) names of archive members (in a non-compliant archive) when
    the archive member is marked with a source-file-or-operating-system like
    MS-DOS, or another system where "\" is the normal directory separator
    character. (See section 4.4.2.2 in the APPNOTE.) There would need to
    be a command-line option to enable/disable this behavior.
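
The rule proposed above could be sketched in Python as follows. This is an illustration of the idea, not UnZip's actual code. The set of host-system codes is my reading of APPNOTE section 4.4.2.2, and the names `BACKSLASH_SYSTEMS`, `member_path`, and the `convert` switch (standing in for the command-line option) are hypothetical.

```python
# Host-system codes ("version made by", APPNOTE 4.4.2.2) where "\" is
# assumed to be the normal directory separator:
#   0 = MS-DOS and OS/2 (FAT), 6 = OS/2 HPFS, 10 = Windows NTFS, 14 = VFAT
BACKSLASH_SYSTEMS = {0, 6, 10, 14}


def member_path(name, create_system, convert=True):
    """Return the path to use for an archive member.

    Backslashes become forward slashes only when conversion is enabled
    and the member claims to come from a backslash-separator system.
    Names from other systems (e.g. 3 = UNIX) are left untouched, since
    "\\" is a legal file-name character there.
    """
    if convert and create_system in BACKSLASH_SYSTEMS:
        return name.replace("\\", "/")
    return name
```

With Python's `zipfile`, the host-system code of a member is available as `ZipInfo.create_system`, so `member_path(info.filename, info.create_system)` would apply the rule to one member.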

    I don't see any quick-and-easy work-around. I might write a script
    which renames a file with "\" in its name (and creates the required
    directories).
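
Such a rename script might look roughly like the Python sketch below. It assumes an extraction tool left files with literal backslashes in their names directly in one directory; the function name `fix_backslash_names` is hypothetical. It skips names with "." or ".." components, names ending in a backslash, and targets that already exist, rather than guessing.

```python
import os


def fix_backslash_names(root):
    """Move files named like 'dir\\sub\\file' in `root` to root/dir/sub/file.

    Returns a list of (old_path, new_path) pairs for the files moved.
    """
    moved = []
    for name in sorted(os.listdir(root)):
        if "\\" not in name or name.endswith("\\"):
            continue
        parts = name.split("\\")
        # Refuse empty, "." or ".." components so nothing escapes `root`.
        if any(part in ("", ".", "..") for part in parts):
            continue
        source = os.path.join(root, name)
        target = os.path.join(root, *parts)
        if not os.path.isfile(source) or os.path.exists(target):
            continue
        # Create the required directories, then move the file into place.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        os.rename(source, target)
        moved.append((source, target))
    return moved
```

Calling `fix_backslash_names(".")` in the extraction directory would process only the entries directly in that directory.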

    You might also complain to the people who distribute the
    non-compliant archives.

     
    • Artem S. Tashkinov

      As always, a program version number would be helpful, but it probably doesn't matter much in this case.

      unzip-6.0-53.fc35.x86_64 : https://src.fedoraproject.org/rpms/unzip/tree/rawhide

      One possible solution would be to enhance UnZip to convert "\" to "/" in the (path) names of archive members

      In my 25 years of using UNIX OSes I've not seen a single filename containing backslashes. I'd just go ahead and convert \ in filenames to / automatically, but I'm not a C programmer.

      You might also complain to the people who distribute the non-compliant archives.

      It's a lost cause. Lots of these archives are not maintained, their creators wouldn't bother.

      I've "solved" the problem for myself by using WinRAR under Wine. Nothing native under Fedora (Linux) can properly unpack such archives.

       

      Last edit: Artem S. Tashkinov 2022-04-03
  • Steven Schweda

    Steven Schweda - 2022-04-03

    The current behavior of UnZip (or Apple's Archive Utility app) makes
    some sense, [...]

    More than I had realized. I apologize. Initially, I was too lazy to
    do the extraction; I just looked at a listing ("unzip -l"), and saw the
    backslashes. "Trust no one," I always say. So, today, I actually did
    the work, and noticed that UnZip apparently already handles this
    situation, which my tired, old brain had forgotten. Around here, on a
    Mac, for example:

    proa$ mkdir SR
    proa$ cd SR
    proa$ unzip6 ../ScreenRuler-v.0.8.1-Portable.zip
    Archive:  ../ScreenRuler-v.0.8.1-Portable.zip
    warning:  ../ScreenRuler-v.0.8.1-Portable.zip appears to use backslashes as path separators
      inflating: ScreenRuler-v.0.8.1-Portable/LICENSE.txt  
      inflating: ScreenRuler-v.0.8.1-Portable/README.html  
      inflating: ScreenRuler-v.0.8.1-Portable/screenruler.exe  
      inflating: ScreenRuler-v.0.8.1-Portable/screenruler.exe.config  
      inflating: ScreenRuler-v.0.8.1-Portable/de/screenruler.resources.dll  
      inflating: ScreenRuler-v.0.8.1-Portable/es/screenruler.resources.dll  
      inflating: ScreenRuler-v.0.8.1-Portable/fi/screenruler.resources.dll  
    [...]
    
    proa$ ls -l
    total 0
    drwxr-xr-x  20 sms  staff  640 Apr  3 14:20 ScreenRuler-v.0.8.1-Portable
    

    The rest of the directory hierarchy looks fine, too.

    That "unzip6" executable was built from the original UnZip 6.00
    source. The "/usr/bin/unzip" which Apple ships with the OS works the
    same.

    It's similar for your other example (although you don't get the
    warning until the first backslash gets processed, and that's later in
    this case):

    proa$ mkdir MAC
    proa$ cd MAC
    proa$ unzip6 ../macintosh.js-win32-ia32-1.1.0.zip
    Archive:  ../macintosh.js-win32-ia32-1.1.0.zip
      inflating: chrome_100_percent.pak  
      inflating: chrome_200_percent.pak  
    [...]
      inflating: vulkan-1.dll            
    warning:  ../macintosh.js-win32-ia32-1.1.0.zip appears to use backslashes as
     path separators
      inflating: locales/am.pak          
      inflating: locales/ar.pak          
    [...]
    

    If you try to unpack any of them, you'll get funny results.

    I think that these results are pretty funny, but perhaps it would be
    helpful if you provided a serious problem report, showing exactly what
    you did, and exactly what happened when you did it.

    In my 25 years of using UNIX OSes [...]

    All that experience, and "you'll get funny results" is your idea of a
    problem description?

     
  • Artem S. Tashkinov

    I need to apologize.

    unzip under Fedora 35 unpacks both archives with zero issues

    The issue is Midnight Commander, which tries to parse unzip -l output and fails spectacularly: https://midnight-commander.org/ticket/4238

    Still, I've identified a bug:

    $ unzip macintosh.js-win32-ia32-1.1.0.zip 'swiftshader\libEGL.dll'
    Archive: macintosh.js-win32-ia32-1.1.0.zip
    caution: filename not matched: swiftshader\libEGL.dll

    $ unzip macintosh.js-win32-ia32-1.1.0.zip 'swiftshader/libEGL.dll'
    Archive: macintosh.js-win32-ia32-1.1.0.zip
    caution: filename not matched: swiftshader/libEGL.dll

    1. So, there's a bug: files in archive directories cannot be extracted by mask.
    2. There's a feature request: replace \ with / in the output.
     

    Last edit: Artem S. Tashkinov 2022-04-05
  • Steven Schweda

    Steven Schweda - 2022-04-05

    I need to apologize.

    Ok.

    unzip under Fedora 35 unpacks both archives with zero issues

    I can believe that.

    1. So, there's a bug: files in archive directories cannot be extracted
      by mask.

    Or, it's not a bug in UnZip, but you didn't treat "\" as a special
    character? Try "\\" instead of "\"? Using your other (smaller) example
    archive:

    proa$ ln -s ScreenRuler-v.0.8.1-Portable.zip SR.zip
    
    proa$ unzip -l SR.zip 'ScreenRuler-v.0.8.1-Portable\README.html'
    Archive:  SR.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
    ---------                     -------
            0                     0 files
    
    proa$ unzip -l SR.zip 'ScreenRuler-v.0.8.1-Portable\\README.html'
    Archive:  SR.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
         5578  01-31-2021 21:47   ScreenRuler-v.0.8.1-Portable\README.html
    ---------                     -------
         5578                     1 file
    

    It works for actual extraction, too:

    proa$ unzip SR.zip 'ScreenRuler-v.0.8.1-Portable\README.html'
    Archive:  SR.zip
    caution: filename not matched:  ScreenRuler-v.0.8.1-Portable\README.html
    
    proa$ unzip SR.zip 'ScreenRuler-v.0.8.1-Portable\\README.html'
    Archive:  SR.zip
    warning:  SR.zip appears to use backslashes as path separators
      inflating: ScreenRuler-v.0.8.1-Portable/README.html  
    

    I wouldn't bet that the UnZip documentation mentions this, but it's
    pretty standard behavior on "UNIX OSes". Your apostrophes get your
    backslash past the shell, but UnZip still needs to deal with it, and
    it's special to UnZip, too. Another (apostrophe-free) possibility:

    proa$ unzip -l SR.zip  ScreenRuler-v.0.8.1-Portable\\\\README.html 
    Archive:  SR.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
         5578  01-31-2021 21:47   ScreenRuler-v.0.8.1-Portable\README.html
    ---------                     -------
         5578                     1 file
    

    As usual, many things are possible.
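
When the command is built by a program rather than typed, the two layers of quoting shown above can be kept apart. The Python sketch below doubles each backslash for UnZip's pattern matcher, as the transcripts demonstrate, and leaves the shell layer to `shlex`. The helper name `unzip_pattern_for` is hypothetical, and wildcard characters in the name are deliberately not handled.

```python
import shlex


def unzip_pattern_for(member_name):
    """Double each backslash so UnZip's matcher treats it as a literal."""
    return member_name.replace("\\", "\\\\")


# The member name as stored in the archive (one backslash).
name = "ScreenRuler-v.0.8.1-Portable\\README.html"

# An argument list needs no shell quoting when passed to subprocess.run().
argv = ["unzip", "-l", "SR.zip", unzip_pattern_for(name)]

# shlex.join() shows the equivalent command line for a POSIX shell.
print(shlex.join(argv))
# prints: unzip -l SR.zip 'ScreenRuler-v.0.8.1-Portable\\README.html'
```

The printed line is the same as the single-quoted command in the transcript above.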

    2. There's a feature request: replace \ with / in the output.

    In which "the output"? A path in the (non-standard-compliant)
    archive might include "\" separators, but a name in the local
    (UNIX-like) file system will include "/" separators. On a VMS system,
    for example, the in-archive names are the same, but names in the local
    file system look different again:

    its $ create SR.zip /symlink = ScreenRuler-v.0.8.1-Portable.zip 
    
    its $ unzip6l -l SR.zip ScreenRuler-v.0.8.1-Portable\\README.html
    Archive:  ITS$DKA0:[SMS.IZ.test_sf67_backslash]SR.zip;1
      Length      Date    Time    Name
    ---------  ---------- -----   ----
         5578  01-31-2021 21:47   ScreenRuler-v.0.8.1-Portable\README.html
    ---------                     -------
         5578                     1 file
    
    its $ unzip6l SR.zip ScreenRuler-v.0.8.1-Portable\\README.html
    Archive:  ITS$DKA0:[SMS.IZ.test_sf67_backslash]SR.zip;1
    warning:  ITS$DKA0:[SMS.IZ.test_sf67_backslash]SR.zip;1 appears to use
     backslashes as path separators
      inflating: [.ScreenRuler-v^.0^.8^.1-Portable]README.html  
    

    What, exactly, don't you like about what UnZip does? And what,
    exactly, would you like it to do, instead?

     

    Last edit: Steven Schweda 2022-04-06
  • Steven Schweda

    Steven Schweda - 2022-04-09

    https://midnight-commander.org/ticket/4238

    It's an upstream bug [...]

    Not on our tributary. It might be helpful if someone explained the
    actual situation in that other forum.

     
  • Steven Schweda

    Steven Schweda - 2022-04-15
    • status: open --> closed-invalid
     
