From: Franky <lie...@pa...> - 2004-05-26 20:28:33
Hi all,

Lately there have been some character encoding problems with UTF-8 languages, but also with Korean, Russian, etc. Therefore I have a suggestion:

- convert all language files to UTF-8
- use only UTF-8 as the character set in the generated HTML files
- change all calls to htmlspecialchars and htmlentities so that they include the third parameter (charset), set to UTF-8

(and now the discussion issues):

- use the mb_* PHP functions (see http://be.php.net/manual/en/ref.mbstring.php), or use the mbstring.func_overload parameter in php.ini (fewer code changes required)
- set the internal encoding to UTF-8 (mbstring.internal_encoding)
- set transparent encoding translation for incoming HTTP requests (mbstring.encoding_translation), or use mb_convert_encoding
- use mb_http_output() to force the HTML output to the UTF-8 character set

==> for these to work, the mbstring functions need to be available in the PHP installation, of course; otherwise we can do something like tikiwiki (define our own mb_* functions, but they only created one, and it's difficult to create these functions)
==> using iconv may also be a possibility

Of course, more subtle changes are required for webmail, where you need to define the character encoding in HTML mails, but I hope that will be easier to do ... if needed.

In all of this, I could use the assistance of somebody with a little more knowledge of character sets, so he/she can confirm whether what I'm saying is correct ...

Franky
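
For reference, the php.ini route mentioned above (the option needing fewer code changes) might look roughly like the excerpt below. This is only a sketch: the directive names are the mbstring and core settings documented in the PHP manual of that era, but the chosen values, and the decision to overload only the string functions, are assumptions for illustration and not a tested configuration for this project.

```ini
; Hypothetical php.ini excerpt; the values are assumptions, not a tested setup.

; Send "charset=UTF-8" in the default Content-Type header.
default_charset = "UTF-8"

; Route output through mbstring so that mbstring.http_output is applied.
output_handler = mb_output_handler

[mbstring]
; Replace the single-byte string functions (strlen, substr, ...) with their
; mb_* counterparts. Bit flags: 1 = mail(), 2 = string functions, 4 = regex
; functions, 7 = all of them.
mbstring.func_overload = 2

; Encoding used internally by the mb_* functions.
mbstring.internal_encoding = UTF-8

; Transparently convert incoming request data to the internal encoding.
mbstring.encoding_translation = On
mbstring.http_input = auto

; Encoding of the generated output.
mbstring.http_output = UTF-8
```

One caveat with the overload approach: once the string functions are overloaded, strlen() and friends count characters instead of bytes, so any code that handles binary data with them would need to be checked separately.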