Hi,
p7zip doesn't extract zip files that were made on
Japanese Windows properly.
How to reproduce
1. Set your locale to en_US.UTF-8.
2. Get this file and extract it.
$ wget http://www.geocities.jp/ep3797/snapshot/tmp/test-japanese.tar.bz2
3. Install IPAMonaPGothic.ttf (a Japanese font).
$ cd test-japanese/
$ mkdir -p ~/.fonts
$ cp IPAMonaGothic.ttf ~/.fonts
$ cd ~/.fonts
$ fc-cache -f .
$ cd -
4. Extract ja-folder.zip with p7zip.
$ 7za x ja-folder.zip
=> You'll see a broken filename.
(Check the left image of result.png)
cf. Extract ja-folder.zip with 7z.exe (Windows version).
$ cd 7-Zip/
$ wine 7z.exe x ja-folder.zip
=> You'll see a proper filename.
(Check the right image of result.png)
Logged In: YES
user_id=336051
> p7zip doesn't extract zip files that were made on
> Japanese Windows properly.
Have you tried "unzip" (the common Unix command for
extracting zip archives)?
If you do, you will also get broken filenames.
> ja-folder.zip
I have tried to extract the files from ja-folder.zip
on Windows XP SP2 (French edition)
with
- built-in unzip feature of Windows
- winrar
- the GUI of 7-zip
As expected, all these programs create broken filenames.
The zip format can only store filenames as arrays of bytes.
Each array of bytes is encoded in the current code page of
the Windows system that created the archive
(see
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp).
As the zip format does not store the code page in the archive,
no program can guess which code page should be used.
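To make the point concrete, here is a minimal Python sketch of how the same stored bytes read under two different code pages. The filename is a hypothetical example (it is not the one inside ja-folder.zip), and cp850 is only an assumed stand-in for "a Western European code page"; the code page an extractor actually falls back to depends on the system.

```python
# Hypothetical filename used only for illustration.
name = "日本語.txt"

# Bytes as a Japanese Windows (code page 932) would hand them to the archiver.
raw = name.encode("cp932")


def decode_as(data: bytes, codepage: str) -> str:
    """Interpret the stored filename bytes under the given code page."""
    return data.decode(codepage)


right = decode_as(raw, "cp932")  # same code page as the creating system
wrong = decode_as(raw, "cp850")  # assumed Western European OEM code page

print(raw.hex())  # the only thing the archive really stores
print(right)      # readable Japanese name
print(wrong)      # same bytes, different code page: broken name
```

Nothing in the byte string itself says which of the two readings is the intended one, which is why the extractor can only use whatever code page the local system is configured for.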
So, if you want to exchange files whose filenames are
not ASCII (English letters),
you should really use another format.
You can use the "RAR" format or, better, the 7z format, which
store filenames
in Unicode (this encoding can represent all the characters
of the world,
including Japanese, without the need for a code page).
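As a rough illustration of the difference, the following Python sketch checks which encodings can represent a given filename at all. Both filenames are hypothetical, and `can_encode` is a helper made up for this example; UTF-8 and UTF-16 are used here simply as examples of Unicode encodings.

```python
def can_encode(name: str, encoding: str) -> bool:
    """Return True if every character of name exists in the encoding."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical filenames, for illustration only.
japanese = "日本語.txt"
mixed = "日本語_été.txt"  # Japanese and French in one name

for enc in ("cp1252", "cp932", "utf-16-le", "utf-8"):
    print(enc, can_encode(japanese, enc), can_encode(mixed, enc))

# A Unicode encoding round-trips the name on any system, whatever its locale.
assert mixed.encode("utf-16-le").decode("utf-16-le") == mixed
```

A legacy code page only covers the characters of its own region, so the reader must know which one was used; a Unicode encoding covers all of them and needs no such side information.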
> $ wine 7z.exe x ja-folder.zip
> => You'll see a proper filename.
I think that works because you have configured wine to use the
right code page.
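If you happen to know the code page the archive was created with, the names can be re-decoded by hand. Below is a minimal sketch using Python's standard zipfile module as the reader, not p7zip. It assumes that zipfile decodes names as cp437 when general-purpose flag bit 11 (names stored as UTF-8, which newer archivers may set) is absent; `recover_name` is a hypothetical helper and the filename is made up.

```python
def recover_name(listed_name: str, flag_bits: int, codepage: str) -> str:
    """Re-decode a name that was decoded with the cp437 fallback.

    listed_name and flag_bits correspond to ZipInfo.filename and
    ZipInfo.flag_bits; codepage is the code page you believe was used.
    """
    if flag_bits & 0x800:
        # Bit 11 set: the name was stored as UTF-8 and is already correct.
        return listed_name
    # Undo the cp437 guess to get the raw bytes back, then decode them
    # with the code page of the system that created the archive.
    return listed_name.encode("cp437").decode(codepage)


# Simulate the listing of a cp932 name stored without the UTF-8 flag.
original = "日本語.txt"  # hypothetical filename
listed = original.encode("cp932").decode("cp437")

print(listed)                            # broken name
print(recover_name(listed, 0, "cp932"))  # readable again
```

With a real archive you would iterate over `zipfile.ZipFile(path).infolist()` and pass each entry's `filename` and `flag_bits`. This only works because the code page is supplied from outside; the archive itself still does not tell you which one to use.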
p7zip, like all other unzip programs, cannot help you here.
So, please give up this very old format.