From: Emmanuel E. <emm...@en...> - 2010-05-25 17:17:14
On 16/05/2010 17:34, Emmanuel Engelhart wrote:
> Hi,
>
> First of all, I'm really a newbie with ICU, and I'm sorry if the answer
> to my question is trivial.
>
> I work on software which runs on GNU/Linux and Windows.
>
> I use the UnaccentTransliterator given as an example here:
> http://icu-project.org/repos/icu/icu/trunk/source/samples/translit/unaccent.cpp
>
> My problem is that the code works on my GNU/Linux computer, but not on
> Windows.
>
> My test word is "zürich".
>
> My code is:
>
>   printStringInHexadecimal(text.c_str());
>   unicodeAccentedString = UnicodeString(text.c_str());
>   printStringInHexadecimal(unicodeAccentedString);
>   unaccent.transliterate(unicodeAccentedString);
>   printStringInHexadecimal(unicodeAccentedString);
>   text.clear();
>   unicodeAccentedString.toUTF8String(text);
>   printStringInHexadecimal(text.c_str());
>
> On GNU/Linux I get what I want:
>
>   z 0xc3 0xbc r i c h
>   z 0xfc r i c h
>   z u r i c h
>   z u r i c h
>
> But on Windows:
>
>   z 0xc3 0xbc r i c h
>   z 0xc3 0xbc r i c h
>   z A 0xbc r i c h
>   z A 0xc2 0xbc r i c h
>
> It seems that the normalizer and/or the transliterator depend on the
> locale/OS, but I can't find out what is wrong.
>
> Any idea?
>
> Emmanuel Engelhart

I have fixed the issue by specifying the default converter with
ucnv_setDefaultName(), so that the char* input is read as UTF-8 rather
than in the platform's default code page. My code:

  UErrorCode status = U_ZERO_ERROR;
  Transliterator *trans = Transliterator::createInstance(
      "Lower; NFD; [:M:] remove; NFC", UTRANS_FORWARD, status);
  // Must run before the UnicodeString(const char*) constructor below,
  // which converts using the default converter.
  ucnv_setDefaultName("UTF-8");
  UnicodeString ustring = UnicodeString(text.c_str());
  trans->transliterate(ustring);
  text.clear();
  ustring.toUTF8String(text);
  return text;

Regards
Emmanuel
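
For anyone hitting the same symptom, here is a small sketch of what presumably went wrong on Windows. It uses only the C++ standard library, not ICU. The assumption is that the Windows default converter was a single-byte, Latin-1-like code page (for example Windows-1252), so each of the two UTF-8 bytes of "ü" became its own character. All function names are hypothetical, and unaccentStandIn() is only a stand-in for the ICU transliterator that handles the two characters involved here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using CodePoints = std::vector<std::uint32_t>;

// Latin-1 style decoding: each byte becomes the code point with the same value.
// Assumption: this approximates the Windows default converter in the report.
CodePoints decodeLatin1(const std::string& bytes) {
    CodePoints out;
    for (unsigned char c : bytes) out.push_back(c);
    return out;
}

// Minimal UTF-8 decoder for 1- and 2-byte sequences only (enough for "zürich").
// Anything else is replaced by U+FFFD.
CodePoints decodeUtf8(const std::string& bytes) {
    CodePoints out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(bytes[i]);
        if (c < 0x80) {
            out.push_back(c);
        } else if ((c & 0xE0) == 0xC0 && i + 1 < bytes.size()) {
            unsigned char c2 = static_cast<unsigned char>(bytes[++i]);
            out.push_back(((c & 0x1Fu) << 6) | (c2 & 0x3Fu));
        } else {
            out.push_back(0xFFFD);
        }
    }
    return out;
}

// Minimal UTF-8 encoder for code points up to U+07FF.
std::string encodeUtf8(const CodePoints& cps) {
    std::string out;
    for (std::uint32_t cp : cps) {
        assert(cp <= 0x7FF);  // this sketch does not cover 3- and 4-byte forms
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Hypothetical stand-in for the accent-stripping transliterator:
// U+00FC (ü) -> u and U+00C3 (Ã) -> A; everything else is left alone.
CodePoints unaccentStandIn(CodePoints cps) {
    for (std::uint32_t& cp : cps) {
        if (cp == 0x00FC) cp = 'u';
        else if (cp == 0x00C3) cp = 'A';
    }
    return cps;
}
```

Feeding the bytes z 0xc3 0xbc r i c h through decodeUtf8() gives U+00FC, which unaccents to "zurich". Feeding them through decodeLatin1() gives U+00C3 and U+00BC; the first loses its tilde, the second has no accent to strip, and re-encoding yields z A 0xc2 0xbc r i c h, the same bytes as in the Windows output above. Besides changing the default converter, another option that should avoid the dependency is to build the string with UnicodeString::fromUTF8(), which always reads its input as UTF-8.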