From: Yi-an H. <yia...@gm...> - 2006-03-08 06:52:58
Hi lang2,

For your list of problematic files, there are a few different cases.

Case 1, digraph, arabic, hebrew, mbyte, farsi, usr_24: foreign alphabets

utf-8 is designed to incorporate characters from different languages; enc-cn/cp936/gb2312/gb18030 are not, so in a few special cases some characters cannot be converted correctly, and unfortunately there is no easy fix. In my opinion, utf-8 is the better form for translations, at least for those documents that discuss digraphs and a few foreign languages. Likely, people who only read gb encodings will not care about these documents anyway, but we never know for sure...

Adding the "-c" parameter to iconv works around the problem: iconv then succeeds by dropping these characters instead of translating them. But it is better for us to leave question marks in place of the unfixable characters. So for gb translations some manual work is needed, but I would still argue that the utf-8 versions should be the primary ones.

Case 2, pi_netrw, pattern: small nuisance

pi_netrw.txt: the name of a French author near the end of the file
pattern.txt: "a" with a hat on top of it (actually, reverting to the previous version fixes the problem, but I wonder whether people reading utf-8 may feel the current way is cleaner)

Case 3, various, index: incorrect characters are in use and they should be fixed. netbeans used to have this problem too, with two unusual dashes (-); a recent update fixed it, thanks.

I can upload the fixes for cases 2 and 3.

Willlis
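
For anyone who wants to find the characters in question before doing the manual work, here is a minimal sketch in Python. It is only an illustration, not the procedure used for the translations: Python's built-in "gb2312" codec stands in for iconv, and the helper names to_gb and unconvertible are made up for this example. errors="replace" leaves a question mark for each unconvertible character, while errors="ignore" drops it, much like iconv with "-c".

```python
def to_gb(text, encoding="gb2312"):
    """Encode text, leaving '?' wherever the target encoding has no character.

    Hypothetical helper for illustration; "gb2312" is Python's built-in codec,
    used here as a stand-in for an iconv conversion.
    """
    return text.encode(encoding, errors="replace")


def unconvertible(text, encoding="gb2312"):
    """Return the distinct characters that cannot be encoded, in order of first appearance."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


if __name__ == "__main__":
    sample = "digraph for \u05d0 (Hebrew alef)"
    print(to_gb(sample))          # the alef becomes a question mark
    print(unconvertible(sample))  # lists the characters needing manual attention
```

Running unconvertible over each help file would give a per-file list of the characters that need a manual decision for the gb versions.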