
#2463 Cyrillic names inside zip

Status: open
Owner: nobody
Priority: 5
Updated: 2024-06-16
Created: 2024-04-24
Private: No

There are two zip archives. Filenames in the first one are displayed normally, but the second (newer) one has problems with Cyrillic symbols.
The good one: https://github.com/Pr-Mex/vanessa-automation/releases/download/1.2.041.1/vanessa-automation.1.2.041.1.zip
The problematic one: https://github.com/Pr-Mex/vanessa-automation/releases/download/1.2.041.15/vanessa-automation.1.2.041.15.zip
WinRAR and Total Commander's built-in archiver display the filenames normally.
I tried on 3 different machines: win7 and two win10.
Tested on 7-Zip 23.01 x64, 7-Zip 24.04 x64, WinRAR 7.00 x64
I can't tell whether this is 7-Zip's fault or something wrong with the PC (missing fonts, etc.).

1 Attachments

Discussion

  • Artem Sharipov

    Artem Sharipov - 2024-04-24

    WinRAR example was not attached.

     
  • Artem Sharipov

    Artem Sharipov - 2024-04-24

    Maybe the archive itself was assembled somehow incorrectly?

     
  • Igor Pavlov

    Igor Pavlov - 2024-04-24

    archive-1:

    Characteristics: NTFS up
    Host OS: FAT
    

    up means UTF-8 paths.
    So 7-Zip uses UTF-8.

    archive-2:

    Characteristics: UT:MAC:3 ux : Descriptor
    Host OS: Unix
    

    7-Zip expects names to be UTF-8 when Host OS is Unix, because UTF-8 is the main encoding on Linux now.
    If you unpack such an archive on Linux, it will use the UTF-8 encoding.
    If you unpack such an archive on Windows with 7-Zip, 7-Zip tries to be compatible with Linux and also uses the UTF-8 encoding. But the archive doesn't actually use UTF-8; it uses the DOS encoding instead. So you see incorrect characters.
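
To see why the names come out wrong, here is a minimal Python sketch (an illustration only, not 7-Zip's code). It encodes a Cyrillic name as CP866, the code page identified later in this thread, and then decodes the same bytes as UTF-8. The sample name and the helper `decode_name` are made up for the example.

```python
# Illustration only: the same stored bytes read under two different encodings.
SAMPLE_NAME = "Пример.txt"  # hypothetical file name, not taken from the archive


def decode_name(raw: bytes, encoding: str) -> str:
    """Decode raw file-name bytes, replacing bytes that are invalid in `encoding`."""
    return raw.decode(encoding, errors="replace")


raw = SAMPLE_NAME.encode("cp866")      # bytes as a DOS-encoding archiver would store them
as_utf8 = decode_name(raw, "utf-8")    # what a reader assuming UTF-8 gets
as_cp866 = decode_name(raw, "cp866")   # what a reader assuming CP866 gets

print(as_cp866 == SAMPLE_NAME)  # True
print(as_utf8 == SAMPLE_NAME)   # False
```

The bytes themselves are intact; only the assumption about their encoding differs between the two readers.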

    Other zip programs on Windows do not try to be so compatible with Linux archives, and they may always use the DOS encoding for such zip archives. That is why other zip programs can show correct names on Windows for that archive, but they will fail for some other zip archives created on Linux.

    A good solution is to ask the creators of that archive to change the software (or settings) used to create the zip file.

     

    Last edit: Igor Pavlov 2024-04-24
  • Artem Sharipov

    Artem Sharipov - 2024-04-24

    Thanks!

     
  • pikachu

    pikachu - 2024-05-16

    Same here.

    In version 19.00 the archive looks normal. In 24.05 and 23.01 it looks bad.

    We can't "ask" the creators to change the settings for the zip file; it is produced by an automated system (a government procurement website).

    If it cannot be fixed, we will have to roll back to the old version :(

     
    • Igor Pavlov

      Igor Pavlov - 2024-05-16

      Any website has administrators and developers. And they can try to fix the problem.
      I suppose the problem will be fixed at that website over time. You can try to help them to fix the problem sooner.

       
  • unxed

    unxed - 2024-05-27

    You can use 7-Zip with this patch:
    https://sourceforge.net/p/sevenzip/bugs/2473/?page=1#96ae

    to extract such an archive:

    7zz -mcp=866 x ./vanessa-automation.1.2.041.15.zip
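
If patching 7-Zip is not an option, the names can also be re-decoded outside of it. The sketch below is not the patch above. It assumes Python's standard `zipfile` module, which decodes names as CP437 when the archive's UTF-8 flag (general-purpose bit 11) is not set, and it assumes the stored names really are CP866. `fix_name` and `list_fixed_names` are hypothetical helper names.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def fix_name(name: str, flag_bits: int, encoding: str = "cp866") -> str:
    """Undo zipfile's CP437 decoding and decode the raw bytes as `encoding`."""
    if flag_bits & UTF8_FLAG:
        return name  # zipfile already decoded this name as UTF-8
    return name.encode("cp437").decode(encoding)


def list_fixed_names(archive, encoding: str = "cp866") -> list:
    """Return member names of `archive` (a path or file object), re-decoded."""
    with zipfile.ZipFile(archive) as zf:
        return [fix_name(i.filename, i.flag_bits, encoding) for i in zf.infolist()]
```

Calling `list_fixed_names("vanessa-automation.1.2.041.15.zip")` should then list the members under their intended names, provided the CP866 assumption holds. Extracting would additionally require writing each member out under its repaired name. On Python 3.11 and later, passing `metadata_encoding="cp866"` to `zipfile.ZipFile` is likely a simpler route.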

     

    Last edit: unxed 2024-05-27
  • unxed

    unxed - 2024-06-16

    So, I've done some investigation, and here's what I found. The tar.exe that comes with Git for Windows is bsdtar, which uses libarchive. When creating archives, libarchive always sets the "operating system on which the archive was created" field to UNIX, even if the library is built and running on Windows. Consequently, on Windows we end up with an archive whose names are encoded in code page 866 (the OEM code page of the Windows console on Russian-locale systems) while the operating system value is UNIX. Many archivers therefore do not expect code page 866 in this context.
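
The mismatch described here can be checked directly, because every entry in a zip central directory records the creating system and a flag saying whether the name is UTF-8. Below is a minimal sketch using Python's standard `zipfile` module. `describe_entries` is a hypothetical helper, and the `HOST_OS` table covers only the two values that appear in this thread (0 = MS-DOS/FAT and 3 = Unix in the ZIP specification).

```python
import zipfile

HOST_OS = {0: "FAT", 3: "Unix"}  # partial table of "version made by" host values
UTF8_FLAG = 0x800                # general-purpose bit 11: the name is UTF-8


def describe_entries(archive):
    """Return (name, host OS, has UTF-8 flag) for each member of `archive`."""
    with zipfile.ZipFile(archive) as zf:
        return [
            (
                info.filename,
                # Fall back to the raw number for hosts not in the table.
                HOST_OS.get(info.create_system, str(info.create_system)),
                bool(info.flag_bits & UTF8_FLAG),
            )
            for info in zf.infolist()
        ]
```

An entry reported as `Unix` without the UTF-8 flag is the combination listed for archive-2 earlier in this thread. Note that for such entries `zipfile` returns the name decoded as CP437, so the name itself may look garbled in the output.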

    I made a PR to libarchive to fix this issue:
    https://github.com/libarchive/libarchive/pull/2240

    See also:
    https://github.com/Pr-Mex/vanessa-automation/issues/2128

     
