From: Alecs K. <al...@pe...> - 2005-06-27 12:17:47

> > FYI, there's a known glitch: some files worked on by dasn (e.g.
> > if_ole.txt, gui_w32.txt, os_msdos.txt, etc.) couldn't be properly
> > viewed under the GBK locale. I don't know how to fix it so far. It
> > might have something to do with the (incorrect/corrupted) utf-8
> > conversion.

> hmmm. Can you provide a full list of files? What is the symptom? Can
> you read them at all?

gui_w16.txt, gui_w32.txt, if_ole.txt, intro.txt, map.txt,
os_dos.txt, os_msdos.txt, os_win32.txt, sponsor.txt,
uganda.txt, windows.txt

i.e.

:h intro@cn (under the GBK locale, aka in Vim, enc=euc-cn) shows nothing
but malformed characters.

It's OK when enc=utf-8, though.

Other files like usr_*.txt don't have this problem.

-- 
Alecs King
From: Wenzhi L. <wen...@gm...> - 2005-06-28 08:38:25

wandys,

Tested on Slackware last night and the installation went fine (as root). I
did see the problem you had, though, and on more files. Here is my list:

usr_02.txt
sponsor.txt
uganda.txt
usr_10.txt
intro.txt
pattern.txt
map.txt
windows.txt
mbyte.txt
gui_w16.txt
gui_w32.txt
if_ole.txt
os_dos.txt
os_msdos.txt
os_win32.txt

I think the actual encoding of the files is (probably) correct. We just
need to help Vim detect it. I will look into it, but it will take time.

For everyone else, if you are still on the list and listening, please try
the tarball/exe file yourself and play around with it. The more problems
we find, the better.

Thanks,
lang2

On 6/27/05, Alecs King <al...@pe...> wrote:
> > > FYI, there's a known glitch: some files worked on by dasn (e.g.
> > > if_ole.txt, gui_w32.txt, os_msdos.txt, etc.) couldn't be properly
> > > viewed under the GBK locale. I don't know how to fix it so far. It
> > > might have something to do with the (incorrect/corrupted) utf-8
> > > conversion.
>
> > hmmm. Can you provide a full list of files? What is the symptom? Can
> > you read them at all?
>
> gui_w16.txt, gui_w32.txt, if_ole.txt, intro.txt, map.txt,
> os_dos.txt, os_msdos.txt, os_win32.txt, sponsor.txt,
> uganda.txt, windows.txt
>
> i.e.
>
> :h intro@cn (under the GBK locale, aka in Vim, enc=euc-cn) shows nothing
> but malformed characters.
>
> It's OK when enc=utf-8, though.
>
> Other files like usr_*.txt don't have this problem.
>
> --
> Alecs King
>
> _______________________________________________
> Vimcdoc-translate mailing list
> Vim...@li...
> https://lists.sourceforge.net/lists/listinfo/vimcdoc-translate
From: Alecs K. <al...@pe...> - 2005-06-28 22:49:34

On Tue, Jun 28, 2005 at 09:38:21AM +0100, Wenzhi Liang wrote:
> Tested on Slackware last night and the installation went fine (as root). I
> did see the problem you had, though, and on more files. Here is my list:
> usr_02.txt
> sponsor.txt
> uganda.txt
> usr_10.txt
> intro.txt
> pattern.txt
> map.txt
> windows.txt
> mbyte.txt
> gui_w16.txt
> gui_w32.txt
> if_ole.txt
> os_dos.txt
> os_msdos.txt
> os_win32.txt
> I think the actual encoding of the files is (probably) correct. We just
> need to help Vim detect it. I will look into it, but it will take time.

After playing around with gdb and Vim, I found that the cause is indeed
in the files themselves.

> > i.e.
> >
> > :h intro@cn (under the GBK locale, aka in Vim, enc=euc-cn) shows nothing
> > but malformed characters.

I _was_ kinda misleading here. The truth is that euc-cn is not GBK but
GB2312. Some of our docs contain some 'evil' characters that have been
converted from GBK. But these characters are GBK-only [1] and cannot be
successfully converted to euc-cn (aka GB2312). Hence the problem.

I followed the execution of Vim and replaced all those 'evil' characters
with euc-cn friendly ones. Most of them are GBK punctuation, some are
traditional Chinese characters, and the rest are unknown invalid chars.

All changes are committed to CVS. I'm about to release a 0.8.0-rc1, which
you can test to see whether this problem remains and/or there are any
other problems.

When doing translation, remember to run the following command

    $ iconv -f utf-8 -t euc-cn file.txt >/dev/null

to check if your doc is euc-cn friendly.

[1] One exception is pattern.txt, whose 'evil' chars come not from the
    GBK conversion but from the original English doc. They are not evil
    in themselves but are evil to euc-cn.

-- 
Alecs King
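
The iconv check in the message above reports that a file has a problem, but a translator usually also wants to know where the offending characters are. The following is only a minimal sketch of that idea, not part of the vimcdoc tooling: it assumes Python's built-in `euc_cn` codec (an alias for `gb2312`) is a close enough stand-in for iconv's euc-cn, which may not hold for every edge case, and the names `find_unencodable`, `check_file` and `main` are hypothetical.

```python
import sys


def find_unencodable(text, encoding="euc_cn"):
    """Return (line, column, char) for each character `encoding` cannot represent.

    Line and column numbers are 1-based. The default codec, euc_cn, is
    Python's alias for gb2312 (an assumption standing in for iconv's euc-cn).
    """
    problems = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                problems.append((line_no, col_no, ch))
    return problems


def check_file(path, encoding="euc_cn"):
    """Read a UTF-8 file and list the characters that `encoding` cannot hold."""
    with open(path, encoding="utf-8") as handle:
        return find_unencodable(handle.read(), encoding)


def main(paths):
    """Print one line per offending character; return 1 if any were found."""
    status = 0
    for path in paths:
        for line_no, col_no, ch in check_file(path):
            print(f"{path}:{line_no}:{col_no}: U+{ord(ch):04X} is not euc-cn friendly")
            status = 1
    return status
```

To use it as a command, one could end the script with `sys.exit(main(sys.argv[1:]))` and pass the translated help files as arguments. Passing `encoding="gbk"` to `find_unencodable` shows whether a character is GBK-only or invalid in both encodings.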
From: Wenzhi L. <wen...@gm...> - 2005-06-29 08:39:06

Excellent! Thanks wandys,

On 6/28/05, Alecs King <al...@pe...> wrote:
> > > i.e.
> > >
> > > :h intro@cn (under the GBK locale, aka in Vim, enc=euc-cn) shows nothing
> > > but malformed characters.
>
> I _was_ kinda misleading here. The truth is that euc-cn is not GBK but
> GB2312. Some of our docs contain some 'evil' characters that have been
> converted from GBK. But these characters are GBK-only [1] and cannot be
> successfully converted to euc-cn (aka GB2312). Hence the problem.

That explains why on Slackware, if I set LANG to zh_CN.gbk, things are OK.
Is that a limitation of Vim, though?

> All changes are committed to CVS. I'm about to release a 0.8.0-rc1, which
> you can test to see whether this problem remains and/or there are any
> other problems.

Will test the new tarball.

> When doing translation, remember to run the following command
>
>     $ iconv -f utf-8 -t euc-cn file.txt >/dev/null
>
> to check if your doc is euc-cn friendly.

Can you add this to the guide.txt? Thanks

Great job!
lang2
From: Alecs K. <al...@pe...> - 2005-06-29 16:14:47

On Wed, Jun 29, 2005 at 09:38:56AM +0100, Wenzhi Liang wrote:
> > I _was_ kinda misleading here. The truth is that euc-cn is not GBK but
> > GB2312. Some of our docs contain some 'evil' characters that have been
> > converted from GBK. But these characters are GBK-only [1] and cannot be
> > successfully converted to euc-cn (aka GB2312). Hence the problem.
>
> That explains why on Slackware, if I set LANG to zh_CN.gbk, things are OK.
> Is that a limitation of Vim, though?

Kind of. Vim doesn't have a 'gbk' value for the 'enc' option right now.
Until it does, and until everyone is using utf-8, we are better off
keeping our files euc-cn friendly.

> > When doing translation, remember to run the following command
> >
> >     $ iconv -f utf-8 -t euc-cn file.txt >/dev/null
> >
> > to check if your doc is euc-cn friendly.
> Can you add this to the guide.txt? Thanks

Will do.

-- 
Alecs King
From: Carlos Z.F. L. <car...@us...> - 2005-06-30 15:30:35

On Wed, Jun 29, 2005 at 09:38:56AM +0100, Wenzhi Liang wrote:
> Excellent! Thanks wandys,
>
> On 6/28/05, Alecs King <al...@pe...> wrote:
> > > > i.e.
> > > >
> > > > :h intro@cn (under the GBK locale, aka in Vim, enc=euc-cn) shows nothing
> > > > but malformed characters.
> >
> > I _was_ kinda misleading here. The truth is that euc-cn is not GBK but
> > GB2312. Some of our docs contain some 'evil' characters that have been
> > converted from GBK. But these characters are GBK-only [1] and cannot be
> > successfully converted to euc-cn (aka GB2312). Hence the problem.
>
> That explains why on Slackware, if I set LANG to zh_CN.gbk, things are OK.
> Is that a limitation of Vim, though?
>

In a GBK environment, users can get GBK support in Vim with "enc=cp936".
It works on both *nix and Windows. But if you search the Internet, most
documents and tutorials suggest only "euc-cn".

-- 
Best Regards,
Carlos
From: Alecs K. <al...@pe...> - 2005-06-30 17:33:00

On Fri, Jul 01, 2005 at 03:30:26AM +1200, Carlos Z.F. Liu wrote:
> In a GBK environment, users can get GBK support in Vim with "enc=cp936".
> It works on both *nix and Windows. But if you search the Internet, most
> documents and tutorials suggest only "euc-cn".

At least in mbyte.txt:

>2 cp936          simplified Chinese (Windows only)
>2 euc-cn         simplified Chinese (Unix only)
>...
>2 2byte-{name}   Unix: any double-byte encoding (Vim specific name)

So I think the right/safe/official way to use GBK on Unix is to set
enc=2byte-gbk. Vim will strip the leading '2byte-' and then pass 'gbk' to
iconv_open(). Tested. Worked fine.

Anyway, too many people are using euc-cn, so it's better to always keep
our docs euc-cn friendly.

-- 
Alecs King
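
The difference between the 'enc' values discussed in this thread can be illustrated with a short sketch. It is not Vim's code: `conversion_name` is a hypothetical helper that only mimics the prefix stripping described in the message above, and Python's codec registry is assumed here as a stand-in for `iconv_open()`, so the accepted names and character coverage may not match iconv exactly.

```python
import codecs

# Vim-specific prefix for double-byte encodings, as quoted from mbyte.txt.
VIM_DOUBLE_BYTE_PREFIX = "2byte-"


def conversion_name(vim_encoding):
    """Drop a leading '2byte-' so the rest can be handed to a converter."""
    name = vim_encoding.lower()
    if name.startswith(VIM_DOUBLE_BYTE_PREFIX):
        return name[len(VIM_DOUBLE_BYTE_PREFIX):]
    return name


def can_represent(text, vim_encoding):
    """Report whether `text` survives conversion to the given encoding.

    codecs.lookup() plays the role of iconv_open() here (an assumption).
    """
    codec = codecs.lookup(conversion_name(vim_encoding))
    try:
        codec.encode(text)
    except UnicodeEncodeError:
        return False
    return True
```

With a traditional character such as 學, which GBK contains and GB2312 does not, `can_represent` is expected to be true for `2byte-gbk` and `cp936` and false for `euc-cn`, which matches the behaviour reported earlier in the thread.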