From: Dasn <da...@us...> - 2005-11-18 21:24:21
Hi, list!

I just found this thread via the Web, and I'm sorry for coming back so late.

> > > FYI, there's a known glitch: some files worked on by dasn (i.e.
> > > if_ole.txt, gui_w32.txt, os_msdos.txt, etc.) couldn't be properly
> > > viewed under the GBK locale. I dunno how to fix it so far. It might
> > > have something to do with the (incorrect/corrupted) utf-8 conversion.
> > hmmm. Can you provide a full list of files? What is the symptom? Can
> > you read them at all?
> >
> gui_w16.txt, gui_w32.txt, if_ole.txt, intro.txt, map.txt,
> os_dos.txt, os_msdos.txt, os_win32.txt, sponsor.txt,
> uganda.txt, windows.txt

The encoding problem is interesting. In a Win32 environment (I do my translations under Win2K), if you edit files from our repository DIRECTLY with gvim, you get nothing but junk characters. It seems that gvim lacks the information it needs to recognize the file encoding, as the 'fileencoding' option is empty.

The trickiest part comes when you edit the files with "notepad.exe": everything looks fine in Notepad. I then "Save As..." the file in UTF-8 format and open it again with gvim. Now Vim can recognize the file encoding and automatically converts it to cp936. The files I committed were formatted like this.

According to *help-translated*:

[quote]
Help files must use latin1 or utf-8 encoding. Vim assumes the encoding is utf-8 when finding non-ASCII characters in the first line. Thus you must translate the header with "For Vim version".
[/quote]

That means our files are treated as either latin1 or utf-8 when opened as [help] files, so they look fine in the help screen. But what if we open them directly? (I used to work this way :)

So first I tried adding "enc=utf8" to the modelines of each file I translated, but as lang2 said, that is not recommended, and Vim can recognize UTF-8 encoding automatically. (I just wondered how? All of our UTF-8 files look wrong to me.) Then I tried using Notepad as mentioned above.

I apologize that the files I committed were not tested on other platforms, which has probably bugged all my friends here. I am so sorry. Please forgive me.

Recently, I have had to use "fileencodings=utf8" in _vimrc to force Vim to treat all kinds of files as UTF-8. The new 'cmdline.txt' was generated this way; I wonder whether it will be okay on other platforms.

-Dasn
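
A possible explanation for the Notepad behaviour described in this message, offered as an assumption rather than something confirmed in the thread: Notepad's UTF-8 "Save As" prepends a byte order mark (the bytes EF BB BF), and Vim's 'fileencodings' detection can recognize such a mark through its "ucs-bom" entry. The Python sketch below checks whether a file starts with that mark. The function name `has_utf8_bom` is hypothetical, and file names are taken from the command line.

```python
import codecs
import sys


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 byte order mark."""
    with open(path, "rb") as f:
        # codecs.BOM_UTF8 is the three-byte sequence EF BB BF
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


if __name__ == "__main__":
    # Report the BOM status of every file named on the command line
    for path in sys.argv[1:]:
        status = "BOM" if has_utf8_bom(path) else "no BOM"
        print(f"{path}: {status}")
```

Running it over the committed help files would show which of them went through the Notepad round trip, assuming the BOM explanation holds.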
From: Alecs K. <al...@pe...> - 2005-11-20 06:38:03
On Sat, Nov 19, 2005 at 05:23:34AM +0800, Dasn wrote:
> Hi, list!
> I just found this thread via Web and was sorry for coming back that
> late.

Welcome back.

> 'fileencoding' option is null. The most tricky stuff comes when you edit
> the files with "notepad.exe", all things look fine in Notepad, then I
> "save as..." the file in UTF-8 format. Open it again with gvim, now, Vim
> can recognize the file encoding and auto convert to cp936. The files I
> had committed were formatted like this.

Yeah, this is indeed what causes the problem: utf8->cp936->utf8. As we have discussed in this thread, cp936 is in fact gbk, which is sometimes unfriendly to our euc-cn (aka gb2312) users when a file contains a gbk-only character that does not exist in the gb2312 encoding.

We want vimcdoc to be as encoding-independent as possible. Users can use euc-cn/gb2312, gbk/cp936, or utf8, and they don't have to know about the underlying conversion that Vim itself does when displaying the help file. We make this transparent to users, so they don't have to bother doing any conversion themselves. They just :help and see.

This means we translators have to
1) edit our files natively in utf-8. We don't manually do any conversion
   whatsoever.
2) make sure the utf8-encoded file _can_ be successfully converted to
   gb2312, since gb2312 is the smallest subset.

> Recently, I have to use "fileencodings=utf8" in _vimrc to force Vim to
> treat all kinds of files as UTF-8 encoding.

I think fileencodings=utf8 is the right way to go. FYI, it does not treat _all_ files as utf-8. It just treats utf8 (which includes ascii) files as utf8.

Well, this only ensures condition 1) above. If you write some 'evil' characters, the file still cannot be converted to gb2312.

I wonder if there are similar tools on Windoze like this:
    $ iconv -f utf-8 -t euc-cn file > /dev/null
which can help you check the validity.

> The newly 'cmdline.txt' was
> generated this way, I wonder whether it'll be okay for other platforms.

Checked, ok.

--
Alecs King
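
For translators without iconv, the same validity check can be approximated in Python, which runs on Windows as well. This is only a sketch under stated assumptions: the function names `find_non_gb2312` and `check_file` are hypothetical, and Python's built-in "gb2312" codec is used as a stand-in for iconv's euc-cn, so the two tools may disagree on a handful of edge-case characters.

```python
import sys


def find_non_gb2312(text):
    """Return (line, column, char) for every character gb2312 cannot encode."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            try:
                ch.encode("gb2312")
            except UnicodeEncodeError:
                problems.append((line_no, col, ch))
    return problems


def check_file(path):
    """Read `path` as UTF-8 and list the characters that block conversion."""
    # Opening with encoding="utf-8" raises UnicodeDecodeError if the file
    # is not valid UTF-8, which covers condition 1) from the message above.
    with open(path, encoding="utf-8") as f:
        return find_non_gb2312(f.read())


if __name__ == "__main__":
    failed = False
    for path in sys.argv[1:]:
        for line_no, col, ch in check_file(path):
            failed = True
            print(f"{path}:{line_no}:{col}: U+{ord(ch):04X} is not in gb2312")
    if failed:
        sys.exit(1)
```

Unlike the iconv one-liner, this reports the position of each offending character instead of stopping at the first failure, which may make the 'evil' characters easier to find and replace.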
From: James He <ic...@gm...> - 2005-11-20 09:27:06
On 11/20/05, Alecs King <al...@pe...> wrote:
> This means we translators have to
> 1) edit our files natively in utf-8. We don't manually do any conversion
>    whatsoever.
> 2) make sure the utf8-encoded file _can_ be successfully converted to
>    gb2312, since gb2312 is the smallest subset.
>
> I think fileencodings=utf8 is the right way to go. FYI, it does not
> treat _all_ files as utf-8. It just treats utf8 (which includes ascii)
> files as utf8.
>
> Well, this only ensures condition 1) above. If you write some 'evil'
> characters, the file still cannot be converted to gb2312.
>
> I wonder if there are similar tools on Windoze like this:
>     $ iconv -f utf-8 -t euc-cn file > /dev/null
> which can help you check the validity.

Should we add this guideline to data/guides.txt? :-)

--
Best regards,
James He
From: Alecs K. <al...@gm...> - 2005-11-22 07:35:11
On Sun, Nov 20, 2005 at 04:26:49AM -0500, James He wrote:
> [snip]
> Should we add this guide line to data/guides.txt? :-)

Hmm. They are mostly already in guides.txt (Comp. Rule 8). I think many people have their own custom enc/fenc/fencs settings. :set fencs=utf8 hurts almost no one, but it's better to leave that to people themselves.

--
Alecs King