Some Japanese file names are garbled with v.21.06
When extracting a compressed file using v.21.06, some Japanese file names are garbled.
When the same file is extracted with v19.00, the file names are not garbled. Extracting the file with the standard OS functions also produces correct file names.
Operating environment: Windows 10 21H1 64 bit, Windows Server 2016
If a zip archive reports Unix as the host OS in its properties, newer versions of 7-Zip expect the file names to be in UTF-8 encoding instead of OEM (DOS) encoding.
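To see which case a given archive falls into, the two header fields involved (the host OS and the UTF-8 name flag, general purpose bit 11) can be read with Python's standard `zipfile` module. This is only an inspection sketch, not how 7-Zip itself makes the decision; `describe_entries` and the sample file name are made up for illustration.

```python
import io
import zipfile

# Host OS codes from the zip "version made by" field (a partial list).
HOST_OS_NAMES = {0: "FAT", 3: "Unix", 10: "NTFS", 19: "Darwin"}
UTF8_FLAG = 0x800  # general purpose bit 11: file name is stored as UTF-8


def describe_entries(archive):
    """Return (name, host OS, utf8 flag set?) for every entry in a zip."""
    with zipfile.ZipFile(archive) as zf:
        return [
            (
                info.filename,
                HOST_OS_NAMES.get(info.create_system, str(info.create_system)),
                bool(info.flag_bits & UTF8_FLAG),
            )
            for info in zf.infolist()
        ]


# Build a small in-memory archive that claims Unix as host OS.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    entry = zipfile.ZipInfo("資料.txt")
    entry.create_system = 3  # 3 = Unix
    zf.writestr(entry, b"demo")
buf.seek(0)

rows = describe_entries(buf)
for row in rows:
    print(row)
```

An entry that shows "Unix" with the UTF-8 flag not set, and whose name is not actually UTF-8, is the kind of archive described above.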
I would like to extract compressed files received from our customers without garbled characters as before.
Is it possible to choose the encoding when extracting?
Or is there any other good way?
If not, I will unfortunately have to continue using v19.00.
Currently, only the command-line version lets you set the code page to OEM when extracting such archives:
1 : CP_OEMCP is oem code page
0 : CP_ACP is the system default Windows ANSI code page
Last edit: Igor Pavlov 2021-12-03
I'm also facing an issue with Chinese file names.
I compressed a doc file with a Chinese name using the macOS default compress option and used that zip file as input to this library through an Android application.
When the file is extracted, its name comes out distorted/garbled.
I also compressed the file a second time with a macOS application named 'Keka' and used that zip file as input to this library; this time the file name is correct.
How can I handle cases like the first one?
In the real world, users will be uploading the zip files, so I have no control over how they are compressed.
Let me know if you need more details on this.
I'm really stuck
It is difficult for end users to use the command line.
Although the language is different, a case similar to this one seems to have been posted on the forum (Filenames with èòàìù; Created: 2021-11-26, Updated: 2021-11-27).
I hope that in the next version, the same encoding method as in v.19.00 will be available in the GUI options, as I'm sure other users would like it.
Welcome to the internet - please use links: https://sourceforge.net/p/sevenzip/discussion/45797/thread/dddbd68b23/
How did you handle it? I'm also facing a similar issue.
Hey, you found any solution for this?
I'm also stuck with the same issue
Hello,
We have colleagues in Japan using 7-zip and they are reporting issues with corrupted characters.
Please can I check if there is a GUI solution to this issue, or the ETA on a fix?
They are running '7-Zip 22.01 (x64 edition)'
Last edit: hol213 2022-11-11
Describe the full situation:
What program was used to create archive?
What program was used to extract archive?
Open the archive in 7-Zip, select the file inside and press the Info button for properties.
Thanks Igor for your help, I am awaiting clarification from my colleague in Japan.
Hello Igor,
Please can we clarify on your question 'What program was used to extract archive?'
- We are using 7-Zip 22.01 (x64) 2022-07-15 to extract the archive
- Are you referring to the decompression method?
Hello Igor,
I have opened the 7-zip file, here is the Info:
Name: RSC TIC 11îÄÄæù┐.docx
Folder: -
Size: 2 246 687
Packed Size: 2 097 008
Modified: 2022-10-26 17:31:00
Attributes: A
Encrypted: -
CRC: 8B868A94
Method: Deflate
Host OS: FAT
Version: 20
Volume Index: 0
Offset: 0
------------------------:
Name: RSC_TICæµ7ë±Äæù┐\
Size: 2 305 630
Packed Size: 2 148 737
Folders: 0
Files: 3
CRC: AA4183E8
------------------------:
Path: C:\Users\xxxx\Downloads\RSC_TIC第7回資料.zip
Type: zip
Physical Size: 2 149 285
------------------------:
------------------------:
Last edit: hol213 2022-11-22
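For what it's worth, the garbled names in the Info output above look like Shift-JIS (CP932) bytes displayed through a Western OEM code page. A minimal Python sketch, assuming the raw name bytes were shown as CP437 (CP850 maps these particular bytes to the same characters) and that the archiver wrote CP932; `redecode` is a hypothetical helper, not part of 7-Zip:

```python
def redecode(name, shown_as="cp437", real="cp932"):
    """Undo a wrong decode: recover the raw bytes, then decode them again."""
    return name.encode(shown_as).decode(real)


folder = redecode("RSC_TICæµ7ë±Äæù┐")
print(folder)  # RSC_TIC第7回資料 (matches the archive's own file name)

# The same round trip applied to the file entry from the Info output.
print(redecode("RSC TIC 11îÄÄæù┐.docx"))
```

The recovered folder name agrees with the name of the zip file itself, which suggests the names inside are stored in the Japanese code page without the UTF-8 flag, so any extractor running under a non-Japanese OEM code page would show them garbled.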
What program was used to create that zip archive?
We are still waiting to hear back from the sender.
We have two different people sending zip files:
// This one has issues //
Details for the archive including the string OPPO are as below.
Operating system: Mac OS Big Sur
OS version: 11.7.1
Archiving software: ZIPANGFULL
Archiving software version: 1.1
Additional to request - at remote site
Extraction software: The Unarchiver
Extraction software version: 4.3.5
//This one works fine //
OS: Windows10
OS ver: 21H2
Archive soft: Zip. (Zip mail)
Archive soft ver: 16.1.7.0 (ZipMail for MS Outlook)
Last edit: hol213 2022-11-24
Hello @Igor,
Is there any news on this thread please?
Many thanks
So 7-Zip shows an incorrect name for that archive?
But does the internal zip program in Windows extract that archive with the correct name?
Thanks Igor,
Yes 7-zip shows an incorrect name and sub folders for that archive.
I have tested on the example zip files provided to me from my colleague in Japan and the character issue happens on both Windows' built in extraction and in 7-zip.
I have asked my colleague in Japan to confirm on any other files received with the same issue.
Hello Igor,
Sorry it took so long.
Here are the details of the application used to create the compressed files we received from our customer.
OS: Amazon Linux2
Application: ZipArchive
Version: 1.15.6
When a file compressed under the above conditions is decompressed with v.21.06 or later, garbled characters are generated.
Please attach an example of such a zip archive.
Hello Igor,
Sorry for the delay.
I'm sorry, but I cannot send the files that cause the problem because they are customer-supplied data.
7-Zip shows an incorrect name for that archive.
But the internal zip program in Windows extracts that archive with the correct name.
It seems that the Japanese half-width kana file names cause problems.
7-zip v.19 causes no problems.
There are different ways to implement file name encoding in zip software.
The latest 7-Zip version uses a scheme that allows it to decode zip archives that were created on Linux and are meant to be extracted on Linux by the zip program.
But some software creates archives on Linux while caring only about decoding on Windows. That causes a problem if the software writes unusual values in the headers.
That problem can be fixed in the program that creates the archive. It can use UTF-8 encoding instead of the old DOS (OEM) schemes.
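As an illustration of the creating side, here is a small sketch using Python's standard `zipfile` module, which stores non-ASCII names as UTF-8 and sets the UTF-8 flag (general purpose bit 11) by itself. It assumes the sender is able to build the archive with Python; other archivers have their own options for this. `write_utf8_zip` and the sample name are made up for the example.

```python
import io
import zipfile


def write_utf8_zip(files):
    """Build a zip in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            # zipfile encodes non-ASCII names as UTF-8 and sets bit 11.
            zf.writestr(name, data)
    return buf.getvalue()


# A half-width kana name, the kind reported as problematic in this thread.
archive = write_utf8_zip({"ﾃｽﾄ.txt": b"hello"})

# Read it back and confirm the name and the UTF-8 flag.
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    names = zf.namelist()
    utf8_flag_set = all(info.flag_bits & 0x800 for info in zf.infolist())

print(names, utf8_flag_set)
```

With the flag set, an extractor does not have to guess the code page from the host OS field.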