#11 Does not convert non-ASCII filenames consistently with files

open
5
2006-08-09
2006-08-09
Naoaki Okazaki
No

On a VFAT filesystem the "long" filenames are stored in UCS-2 or
possibly UTF-16. On a Linux system the VFS converts them to and from a
multibyte encoding specified by a mount option (iocharset) or ISO 8859-1
by default (this is the wrong default, but it's a bit late to change
it). easyh10 converts these filenames back to UCS-2 as if they are in
the current locale's character encoding. If this does not match the
encoding for the mount, the round-trip corrupts non-ASCII characters.
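
To illustrate the failure mode, here is a minimal Python sketch of the
round trip. It is not easyh10's code: round_trip and the sample
filename are made up for this illustration, and the charset names are
Python codec names rather than kernel iocharset values.

```python
def round_trip(name: str, mount_charset: str, locale_charset: str) -> str:
    """Simulate reading a VFAT long filename through a mismatched locale.

    Hypothetical helper for illustration only.
    """
    # The kernel converts the on-disk name to bytes in the mount's charset.
    raw = name.encode(mount_charset)
    # The application then interprets those bytes in the locale's charset.
    # Undecodable bytes are replaced with U+FFFD instead of raising.
    return raw.decode(locale_charset, errors="replace")


name = "café.mp3"  # made-up sample filename
matching = round_trip(name, "utf-8", "utf-8")          # encodings agree
as_latin1 = round_trip(name, "utf-8", "iso8859-1")     # UTF-8 mount, Latin-1 locale
as_utf8 = round_trip(name, "iso8859-1", "utf-8")       # Latin-1 mount, UTF-8 locale

print(matching)
print(as_latin1)
print(as_utf8)
```

Only the first call returns the original name. In the other two cases
the non-ASCII character comes back as different characters or as a
replacement character, which is what would then be written back as
UCS-2.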

easyh10 should check the filename encoding for the directory that it
indexes, and then:
(1) treat all filenames under that directory as using that encoding
and/or (2) warn if it doesn't match the locale's encoding
* Originally submitted by Ben Hutchings via Benjamin.
