#29 Encoding detection for ZIP archives

open
nobody
None
5
2010-07-16
2010-07-16
No

There is a major problem with ZIP archives that contain non-Latin file names. The names are stored in different encodings depending on the system where the archive was created. For Russian, for example, Windows archivers use CP1251, old Linux systems use KOI8-R, and most modern Linux desktops use UTF-8. This makes it impossible to transfer such archives between systems without garbling the file names. Exchange between Windows and Linux is especially problematic.
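To illustrate the problem, here is a minimal Python sketch. The file name and the `decode_as` helper are made up for this example and have nothing to do with p7zip internals. It shows that one Russian name becomes three different byte sequences, and that reading the bytes with the wrong encoding gives garbage instead of an error.

```python
# Hypothetical sample name, chosen only for illustration.
name = "Привет.txt"

# The same name as it would be stored under each encoding.
encoded = {codec: name.encode(codec) for codec in ("cp1251", "koi8_r", "utf-8")}


def decode_as(raw: bytes, codec: str) -> str:
    """Decode raw file-name bytes, replacing anything that cannot be decoded."""
    return raw.decode(codec, errors="replace")


# A name written on Windows (CP1251) and read on an old Linux system (KOI8-R).
garbled = decode_as(encoded["cp1251"], "koi8_r")

if __name__ == "__main__":
    for codec, raw in encoded.items():
        print(codec, raw.hex(" "))
    print("original:", name)
    print("garbled: ", garbled)
```

Since both CP1251 and KOI8-R map nearly every byte to some character, the wrong decoding does not fail, so the archiver cannot notice the mismatch without some kind of detection.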

The RusXMMS project (http://RusXMMS.sf.net) aims to solve this problem. It provides libraries that auto-detect and fix unknown encodings, together with a set of patches that add support for these libraries to several applications. There is a patch against the latest version of p7zip. It would be really great to see it upstream.
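To give a rough idea of what auto-detection means here, below is a deliberately naive Python sketch. It is not the RusXMMS algorithm and does not use its libraries; the `guess_encoding` function and its scoring rule are assumptions made up for this example. It first tries strict UTF-8 and otherwise picks the candidate encoding under which the name contains the most lowercase Cyrillic letters, on the assumption that real file names are mostly lowercase.

```python
def count_lower_cyrillic(text: str) -> int:
    """Count characters in the basic lowercase Russian range (U+0430..U+044F)."""
    return sum(1 for ch in text if "\u0430" <= ch <= "\u044f")


def guess_encoding(raw: bytes, candidates=("cp1251", "koi8_r")) -> str:
    """Naive guess of the encoding of a file name (illustrative heuristic only)."""
    # Valid UTF-8 is unlikely to occur by accident in legacy 8-bit text.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Otherwise prefer the candidate that yields the most lowercase Cyrillic.
    return max(
        candidates,
        key=lambda codec: count_lower_cyrillic(raw.decode(codec, errors="replace")),
    )


if __name__ == "__main__":
    sample = "Привет.txt"  # hypothetical file name
    for codec in ("cp1251", "koi8_r", "utf-8"):
        print(codec, "->", guess_encoding(sample.encode(codec)))
```

A heuristic this simple will misfire on short or all-uppercase names, which is a good reason to rely on a well-tested library instead.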

RusXMMS has a long history, is well tested, and is included in several major Linux distributions, including Debian/Ubuntu and openSUSE.

Discussion