#409 Some Japanese file names are garbled with v.21.06

open
nobody
None
5
2022-12-06
2021-12-02
No

When extracting a compressed file using v.21.06, some Japanese file names are garbled.
When the same file is extracted with v19.00, the names are not garbled. Extracting the file with the standard OS functions also gives correct names.
Operating environment: Windows 10 21H1 64 bit, Windows Server 2016

2 Attachments

Discussion

  • Igor Pavlov

    Igor Pavlov - 2021-12-02

    If a zip archive lists Unix as the host OS in its properties, the new 7-Zip expects UTF-8 encoding instead of OEM (DOS) encoding.
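
To check which case an archive falls into, the two header fields involved can be inspected with Python's standard `zipfile` module. This is a minimal sketch, not 7-Zip's actual logic: `describe_entries` is a hypothetical helper name, the file names are made up, and the helper only reports the host OS field and the UTF-8 name flag (general-purpose bit 11) for each entry.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general-purpose bit 11: name is stored as UTF-8
HOST_NAMES = {0: "FAT", 3: "Unix"}  # only the two host values discussed here


def describe_entries(zip_bytes):
    """Return (name, host OS, has UTF-8 flag) for each entry in a zip archive."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            host = HOST_NAMES.get(info.create_system, str(info.create_system))
            has_flag = bool(info.flag_bits & UTF8_NAME_FLAG)
            rows.append((info.filename, host, has_flag))
    return rows


# Build a small in-memory archive to demonstrate (hypothetical file names).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in ("plain.txt", "資料.txt"):
        entry = zipfile.ZipInfo(name)
        entry.create_system = 3  # mark the entry as created on Unix
        zf.writestr(entry, b"demo")

for row in describe_entries(buf.getvalue()):
    print(row)
```

An entry that reports a Unix host without the UTF-8 flag is the kind of archive where the reader has to guess the name encoding.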

     
  • Masakazu OIWA

    Masakazu OIWA - 2021-12-03

    I would like to extract compressed files received from our customers without garbled characters as before.
    Is it possible to choose the encoding when extracting?
    Or is there any other good way?
    If not, I will unfortunately have to continue using v19.00.

     
  • Igor Pavlov

    Igor Pavlov - 2021-12-03

    Currently, only the command-line version lets you set the code page to OEM when extracting such archives:

    7z x a.zip -mcp=1
    

    1 : CP_OEMCP, the OEM code page
    0 : CP_ACP, the system default Windows ANSI code page

     

    Last edit: Igor Pavlov 2021-12-03
    • Astha

      Astha - 2022-05-18

      I'm also facing an issue with Chinese file names.

      I compressed a doc file with a Chinese name using the default macOS compress option, then used that zip file as input to this library through an Android application.
      The issue is that when the file is extracted, its name is distorted/garbled.

      I also compressed the same file a second time with a macOS application named 'Keka' and used that zip file as input to this library. This time the file name was correct.

      How can I handle cases like the first one?
      In the real world, users will be uploading the zip files, so I have no control over how they are compressed.

      Let me know if you need more details on this.
      I'm really stuck

       
  • Masakazu OIWA

    Masakazu OIWA - 2021-12-10

    It is difficult for end users to use the command line.
    Although the language is different, a case similar to this one seems to have been posted on the forum (Filenames with èòàìù; Created: 2021-11-26, Updated: 2021-11-27).
    I hope that in the next version, the same encoding method as in v.19.00 will be available in the GUI options, as I'm sure other users would like it.

     
    • AmigoJack

      AmigoJack - 2021-12-22
       
    • Astha

      Astha - 2022-05-18

      How did you handle it? I'm also facing a similar issue.

       
  • Astha

    Astha - 2022-05-18

    Hey, did you find any solution for this?
    I'm also stuck with the same issue.

     
  • hol213

    hol213 - 2022-11-11

    Hello,

    We have colleagues in Japan using 7-zip and they are reporting issues with corrupted characters.

    Please can I check if there is a GUI solution to this issue, or the ETA on a fix?

    They are running '7-Zip 22.01 (x64 edition)'

     

    Last edit: hol213 2022-11-11
    • Igor Pavlov

      Igor Pavlov - 2022-11-11

      Please describe the full situation:
      What program was used to create the archive?
      What program was used to extract the archive?
      Open the archive in 7-Zip, select the file inside, and press the Info button to see its properties.

       
  • hol213

    hol213 - 2022-11-15

    Thanks Igor for your help, I am awaiting clarification from my colleague in Japan.

     
  • hol213

    hol213 - 2022-11-16

    Hello Igor,

    Could you please clarify your question 'What program was used to extract archive?'
    - We are using 7-Zip 22.01 (x64) 2022-07-15 to extract the archive
    - Are you referring to the decompression method?

     
    • hol213

      hol213 - 2022-11-22

      Hello Igor,

      I have opened the 7-zip file, here is the Info:

      Name: RSC TIC 11îÄÄæù┐.docx
      Folder: -
      Size: 2 246 687
      Packed Size: 2 097 008
      Modified: 2022-10-26 17:31:00
      Attributes: A
      Encrypted: -
      CRC: 8B868A94
      Method: Deflate
      Host OS: FAT
      Version: 20
      Volume Index: 0
      Offset: 0
      ------------------------:
      Name: RSC_TICæµ7ë±Äæù┐\
      Size: 2 305 630
      Packed Size: 2 148 737
      Folders: 0
      Files: 3
      CRC: AA4183E8
      ------------------------:
      Path: C:\Users\xxxx\Downloads\RSC_TIC第7回資料.zip
      Type: zip
      Physical Size: 2 149 285
      ------------------------:
      ------------------------:
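
The garbled names in this listing are consistent with Shift-JIS (code page 932) name bytes being displayed through the US OEM code page 437. That is an inference from the listing and the archive path, not something confirmed in the thread. Under that assumption, a minimal Python sketch shows the mapping in both directions; `redecode` is a hypothetical helper name.

```python
def redecode(garbled, shown_as="cp437", actual="cp932"):
    """Undo a wrong decode: recover the raw bytes, then decode them as `actual`."""
    return garbled.encode(shown_as).decode(actual)


# Raw name bytes as a Shift-JIS writer would store them for "第7回資料".
raw = "第7回資料".encode("cp932")

garbled = raw.decode("cp437")  # what a reader assuming code page 437 displays
print(garbled)
print(redecode(garbled))       # round-trips back to the original name
```

The same round trip only works when every raw byte survives the wrong decode, which holds for code page 437 because it assigns a character to all 256 byte values.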

       

      Last edit: hol213 2022-11-22
      • Igor Pavlov

        Igor Pavlov - 2022-11-22

        What program was used to create that zip archive?

         
        • hol213

          hol213 - 2022-11-23

          We are still waiting to hear back from the sender.

           
          • hol213

            hol213 - 2022-11-24

            We have two different people sending zip files:

            // This one has issues //
            Details for the archive including the string OPPO are as below.
            Operating system: Mac OS Big Sur
            OS version: 11.7.1
            Archiving software: ZIPANGFULL
            Archiving software version: 1.1

            Additional to request - at remote site
            Extraction software: The Unarchiver
            Extraction software version: 4.3.5

            //This one works fine //
            OS: Windows10
            OS ver: 21H2
            Archive soft: Zip. (Zip mail)
            Archive soft ver: 16.1.7.0 (ZipMail for MS Outlook)

             

            Last edit: hol213 2022-11-24
            • hol213

              hol213 - 2022-12-05

              Hello @Igor,

              Is there any news on this thread please?

              Many thanks

               
              • Igor Pavlov

                Igor Pavlov - 2022-12-05

                So 7-Zip shows an incorrect name for that archive?
                But does the built-in zip program in Windows extract that archive with the correct name?

                 
                • hol213

                  hol213 - 2022-12-06

                  Thanks Igor,

                  Yes 7-zip shows an incorrect name and sub folders for that archive.

                  I have tested on the example zip files provided to me from my colleague in Japan and the character issue happens on both Windows' built in extraction and in 7-zip.

                  I have asked my colleague in Japan to confirm on any other files received with the same issue.

                   
  • Masakazu OIWA

    Masakazu OIWA - 2022-11-21

    Hello Igor,
    Sorry it took so long.

    Here are the details of the application used to create the compressed files we received from our customer.


    OS: Amazon Linux2
    Application: ZipArchive
    Version: 1.15.6


    When a file compressed under the above conditions is extracted with v.21.06 or later, the file names are garbled.

     
    • Igor Pavlov

      Igor Pavlov - 2022-11-21

      Please attach any example of such zip archive.

       
  • Masakazu OIWA

    Masakazu OIWA - 2022-12-06

    Hello Igor,
    Sorry for the delay.

    I'm sorry that I cannot send the files that cause the problem, because they are customer-supplied data.

    7-Zip shows an incorrect name for that archive.
    But the built-in zip program in Windows extracts that archive with the correct name.

    It seems that file names containing Japanese half-width kana cause the problem.
    7-Zip v.19 causes no problems.

     
    • Igor Pavlov

      Igor Pavlov - 2022-12-06

      There are different ways to implement encoding in zip software.
      The latest 7-Zip version uses a scheme that can decode zip archives that were encoded on Linux for extraction on Linux by the zip program.
      However, some software encodes files on Linux but only cares about decoding on Windows. That case is a problem if the software uses unusual values in the headers.
      The problem can be fixed in the program that creates the archive: it can use UTF-8 encoding instead of the old DOS (OEM) schemes.
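
As one illustration of the fix on the creating side, Python's standard `zipfile` module stores a non-ASCII name as UTF-8 and sets the UTF-8 name flag (general-purpose bit 11) on its own. This is only a sketch and is not the software discussed in this thread; the file name is a made-up example using half-width kana.

```python
import io
import zipfile

name = "ﾃｽﾄ.txt"  # hypothetical file name in half-width kana

# Write an archive in memory; zipfile picks UTF-8 for non-ASCII names.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(name, b"demo")

# Read it back and check the flag recorded for the entry.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    info = zf.infolist()[0]
    utf8_flag = bool(info.flag_bits & 0x800)

# The raw archive bytes contain the name in its UTF-8 form.
stored_as_utf8 = name.encode("utf-8") in buf.getvalue()
print(info.filename, utf8_flag, stored_as_utf8)
```

With the flag set, a reader no longer needs to guess the code page from the host OS field.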

       
