From: Wenzhi L. <wen...@gm...> - 2006-03-07 23:31:44
|
Hello all,

I am having some problems with the new files Willis checked in. When I run 'iconv -f utf-8 -t euc-cn' on some of them, it fails. I fixed one file, but the other files seem to include so-called 'digraph' characters like 'á'. iconv couldn't convert them.

wandys, I remember we had this problem before and you fixed it. If you are still around, can you take a look at this?

There is a workaround for this: add 'enc=utf-8' to the status line. But it is not recommended.

Any input appreciated.

Oh. The list of problematic files:
usr_24
index.txt
various.txt
pattern.txt
digraph
mbyte.txt
arabic.txt
farsi.txt
hebrew.txt
pi_netrw.txt

thanks,

lang2 |
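Before running the conversion, it can help to list exactly which characters will be rejected. The following is a minimal sketch, not part of the project's tooling: it assumes Python's built-in `gb2312` codec is a fair stand-in for iconv's `euc-cn` target, and the helper name `find_unencodable` is hypothetical.

```python
def find_unencodable(text, encoding="gb2312"):
    """Return (line_no, column, char) for every character the encoding rejects."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                # This character has no representation in the target encoding.
                problems.append((line_no, col, ch))
    return problems


if __name__ == "__main__":
    # Illustrative sample text, not taken from the real help files.
    sample = "作者: Jérôme\n普通的一行\n"
    for line_no, col, ch in find_unencodable(sample):
        print(f"line {line_no}, column {col}: {ch!r} (U+{ord(ch):04X})")
```

Running something like this over each file in the list would show whether the failures come from a handful of names or from whole tables of foreign characters.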
From: <thu...@gm...> - 2006-03-08 00:26:52
|
There are names like "Jérôme Augé".
I think there is no way except to convert these names into two-byte characters by hand.
Or we have to accept utf8 encoding.
We have to show our respect to these people, don't we?

--
Wang Honglei
thu...@gm...
http://www.cumt.org/ |
From: <thu...@gm...> - 2006-03-08 01:49:31
|
Unfortunately, it seems that there is no "ô" in the GB2312 encoding. There are "ê", "é" and "ǒ", though.

thunderw

--
Wang Honglei
thu...@gm...
http://www.cumt.org/ |
From: Alecs K. <al...@pe...> - 2006-03-08 10:01:12
|
On Tue, Mar 07, 2006 at 11:31:40PM +0000, Wenzhi Liang wrote:
> wandys, I remember we had this problem before and you fixed it. If you
> are still around, can you take a look at this?

The problem I 'fixed' falls under cases 2 & 3 as Willis stated. (Welcome aboard. Huge job you did. Thanks.)

There's no easy and elegant fix for case 1. It may not be very polite to have people's names misspelled, as Thunder said in another mail.

> There is a work around for this: add 'enc=utf-8' to the status line.

This doesn't affect utf-8 users, since they are already set up to use utf-8 in the first place. And it does no good for a regular euc-cn user, since s/he normally cannot view utf-8 in his/her euc-cn environment (i.e. in a terminal running a euc-cn locale).

So there are several options for us:

1) We don't care about euc-cn any more. utf-8 only, simple and clear.

2) We care about euc-cn by sacrificing utf-8 quality (change/remove questionable chars).

3) With the utf-8 versions being good and untouched, we start maintaining and providing separate euc-cn versions (iconv -c).

4) Patch Vim to make it act like iconv -c when it's doing conversion.

5) ???

Nooooone is perfect.

--
Alecs King |
From: Wenzhi L. <wen...@gm...> - 2006-03-08 23:01:32
|
Hi wandys,

Long time no see. :-)

On 08/03/06, Alecs King <al...@pe...> wrote:
> On Tue, Mar 07, 2006 at 11:31:40PM +0000, Wenzhi Liang wrote:
> > wandys, I remember we had this problem before and you fixed it. If you
> > are still around, can you take a look at this?
>
> The problem i 'fixed' falls in the case 2 & 3 as Willis stated. (welcome
> aboard. huge job you did. thanks).
>
> There's no easy and elegant fix for case 1. It may be not very polite
> to have people's names misspelled as Thunder said in another mail.

I agree with thunder too.

> > There is a work around for this: add 'enc=utf-8' to the status line.
>
> This doesnt affect utf-8 users since they already set to use utf8 in the
> first place. And this does no good for a regular euc-cn user since s/he
> normally cannot view utf-8 in his/her euc-cn environment (ie. in a
> euc-cn locale'd terminal).

Well, that's not entirely true. When I said it was a workaround, I had tested with the GUI version (I don't use vim on the console that often). And you are right about the console side. However, if the console cannot display utf8 characters at all, then it is not a problem with the translation. So the workaround is still a workaround.

> So there are several options for us:
>
> 1) We dont care about euc-cn any more. utf-8 only, simple and clear.

Hmmm. How do we achieve that? I tried setting LANG=zh_CN.UTF-8 and vim still can't display usr_24.txt.

> 2) We care about euc-cn by sacrificing utf-8 quality (change/remove
> questionable chars).

I don't think that's a good idea.

> 3) With utf-8 versions being good and untouched, we start maintaining
> and providing separate euc-cn versions (iconv -c).

I think 'iconv -c' simply drops the characters that it has problems with. Right? Then is this not the same as 2?

> 4) Patch Vim to make it act like iconv -c when it's doing conversion.

I don't think Vim's maintainer would be happy with this. Same reason as 3.

> Nooooone is perfect.

That seems true. If I understand correctly, it is because iconv doesn't support gbk?

lang2 |
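On the question of what dropping characters actually does to a file, here is a rough sketch that contrasts silently discarding unencodable characters (comparable in spirit to `iconv -c`) with substituting a visible placeholder. It assumes Python's `gb2312` codec approximates the euc-cn target; `to_legacy` and its `placeholder` parameter are hypothetical names used only for illustration.

```python
def to_legacy(text, encoding="gb2312", placeholder=None):
    """Encode text, dropping unencodable characters or swapping in a placeholder."""
    if placeholder is None:
        # Comparable to `iconv -c`: silently drop what cannot be represented.
        return text.encode(encoding, errors="ignore")
    pieces = []
    for ch in text:
        try:
            pieces.append(ch.encode(encoding))
        except UnicodeEncodeError:
            # Leave a visible marker so readers can tell something was lost.
            pieces.append(placeholder.encode(encoding))
    return b"".join(pieces)


if __name__ == "__main__":
    name = "Jérôme"
    print(to_legacy(name))                    # characters silently removed
    print(to_legacy(name, placeholder="??"))  # loss is visible in the output
```

Either way the utf-8 source is left alone; only the derived copy loses information, which is the difference between maintaining a separate converted version and editing the originals.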
From: Alecs K. <al...@pe...> - 2006-03-09 04:31:11
|
On Wed, Mar 08, 2006 at 11:01:26PM +0000, Wenzhi Liang wrote:
> So the work around is still a work around.

For GUI Vims in a euc-cn locale (or :enc) that can still show utf-8, yes. But I would suggest they use utf-8 in the first place.

> > So there are several options for us:

These options are all (bad) workarounds too.

> > 1) We dont care about euc-cn any more. utf-8 only, simple and clear.

Which means all the problems are NOT problems any more, since euc-cn is no longer a concern. We only support utf-8 and people must use utf-8 to view our docs. This of course will piss off euc-cn users.

> hmmm. How do we achieve that? I tried to set LANG=zh_CN.UTF-8 and
> vim still can't display usr_24.txt

That's weird. utf-8 files can't be viewed in a utf-8 environment? What is your :enc? I tested gvim with :enc=utf-8. No problems at all.

> > 2) We care about euc-cn by sacrificing utf-8 quality (change/remove
> > questionable chars).
>
> Don't think that's a good idea.

Ack.

> > 3) With utf-8 versions being good and untouched, we start maintaining
                                          ^^^^^^^^^
> > and providing separate euc-cn versions (iconv -c).
                  ^^^^^^^^
> I think 'iconv -c' simply drop the character that it has problem with.

We can change it to something else like ??

> Then is this not the same as 2?

No. The utf-8 version remains good and utf-8 users are as happy as ever. And for euc-cn users, we provide a compromised version to avoid junk files. This is IMHO by far the most feasible workaround if we still care about euc-cn people. We can detect the locale/:enc, or let the user specify it, in our install script. Or we can directly provide two tarballs and let users download and install whichever suits them.

> > 4) Patch Vim to make it act like iconv -c when it's doing conversion.
>
> Don't think Vim's maintainer would be happy with this. Same reason as 3.

Pretty hard. Even if he did accept it, which is unlikely, it might be a long time before people are running the patched version, if ever.

> If I understand correctly, it is because iconv doesn't support gbk?

No, it's not. iconv has no problem and it does its job _pretty_ well in its own right. The problem lies in the gbk/gb2312 encodings themselves: they simply don't have those foreign characters, so the chars can't be converted to, or correctly displayed in, a gb locale/:enc.

utf-8 rox. Ken Thompson kicks ass. But we live in a messy world.

--
Alecs King |
From: Wenzhi L. <wen...@gm...> - 2006-03-09 23:40:24
|
wandys,

> > hmmm. How do we achieve that? I tried to set LANG=zh_CN.UTF-8 and
> > vim still can't display usr_24.txt
>
> Thats weird. utf-8 files cant be viewed in the utf-8 environment? Your
> :enc is? I tested gvim with :enc=utf-8. No problems at all.

It turns out that (on my system at least) vim ignores the $LANG env var and sets the 'enc' option based on $LC_ALL. So if I have $LC_ALL=zh_CN.UTF-8 it is OK. So there are two ways to do this:
1) in the shell, set LC_ALL=zh_CN.UTF-8, which might not be ideal for some users, me included.
2) in vim, set 'enc' to utf-8.

This actually sounds OK to me. The euc-cn users will not miss out on much, because it is only a small portion of the whole translation that won't be easy to view. If we can fix two more (as Willis suggested?), it looks like a good compromise.

But I think this is easier for the GUI user and not so much for the console user? What do you think, wandys?

> > Then is this not the same as 2?
>
> No. The utf-8 version remains good and utf-8 users are happy as ever.
> And for euc-cn users, we provide a compromised version to avoid junk
> files. This is IMHO by far the most possible workaround if we still
> consider euc-cn people. We can detect or let user specify locale/:enc
> in our install script. Or, we can directly provide two tarballs to let
> users choose to download and install whichever suits them.

This sounds OK too and it would definitely please both crowds. But it doesn't please me. :-) It just feels strange to have two releases for one thing. But then again, who am I to ask to be pleased? Pondering...

So right now, I'd lean towards an all-utf solution.

Any more input?

lang2 |
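One possible explanation for $LANG appearing to be ignored is the standard POSIX precedence among locale variables: a non-empty LC_ALL overrides LC_CTYPE, which in turn overrides LANG. The sketch below only models that rule; it is an assumption that Vim's start-up follows it on a given system, and the function names are hypothetical.

```python
def effective_ctype_locale(env):
    """Return the locale name that governs character handling, per POSIX precedence."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset or empty variables are skipped
            return value
    return "C"  # POSIX default when nothing is set


def looks_utf8(locale_name):
    """True if the codeset part of a locale name such as zh_CN.UTF-8 is UTF-8."""
    codeset = locale_name.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"


if __name__ == "__main__":
    # Hypothetical environment: LANG asks for UTF-8 but LC_ALL says otherwise.
    env = {"LANG": "zh_CN.UTF-8", "LC_ALL": "zh_CN.GB2312"}
    chosen = effective_ctype_locale(env)
    print(chosen, looks_utf8(chosen))
```

If LC_ALL or LC_CTYPE is already exported by a login script, changing LANG alone would have no visible effect, so checking all three variables is a cheap first step.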
From: Yi-an H. <yia...@gm...> - 2006-03-10 01:07:39
|
I think there is a "nearly" perfect solution. Using iconv with gb18030 as the "to" encoding works exactly as we want. You don't need the -c flag. There will be no error, and all questionable characters are marked as ??. What is more, it is in fact "lossless": when you convert the file back to utf-8, it is identical to the original.

I say it is "nearly" perfect because you still cannot read those characters correctly in gb encodings. But one can keep copies in only one encoding and convert them "losslessly" to the other encoding when necessary.

I have not tested this under all environments. Your environment may not support gb18030. But maybe we can try. |
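The "lossless" claim for gb18030 can be checked with a simple round trip. This is a minimal sketch using Python's built-in `gb18030` and `gb2312` codecs rather than iconv itself, so it illustrates the property of the encodings, not the behaviour of any particular iconv build; `round_trips` is a hypothetical helper name.

```python
def round_trips(text, encoding):
    """True if encoding then decoding the text reproduces it exactly."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The encoding cannot represent at least one character.
        return False


if __name__ == "__main__":
    sample = "Jérôme Augé 中文"
    print("gb18030:", round_trips(sample, "gb18030"))
    print("gb2312: ", round_trips(sample, "gb2312"))
```

Because gb18030 can represent every Unicode character, a utf-8 master copy can be regenerated from the converted file, which is what makes keeping a single source realistic.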
From: Alecs K. <al...@pe...> - 2006-03-10 02:37:32
|
On Thu, Mar 09, 2006 at 11:40:22PM +0000, Wenzhi Liang wrote:
> It turns out that (on my sytem at least), vim ignores the $LANG env var and
> set the 'enc' option based on $LC_ALL. So if I have $LC_ALL=zh_CN.UTF-8 it
> is OK. So there are two ways to do this:

FYI, here on my box, Vim does honor the $LANG env var.

> But I think this is easier for the GUI user and not so much for the console
> user? What do you think wandys?

Yes, terminal euc-cn users would frown at it, but I'm okay with that.

> This sounds OK too and it would definitely please both crowds. But it doens't
> please me. :-). It just feels strange to have two releases for one thing. But
> then again, who am I asking to be pleased? Pondering...

I dunno about the windoze/DOS side, but for Linux/*BSDs we can do the conversion on the fly in our install script, and this is transparent to users. But heck, I doubt it's worth all the effort.

> So right now, I'd lean towards an all utf solution.

Ack.

One of my favorite quotes is:

    What the standard is does not matter (in fact all standards suck), as long as we _have a standard_ to follow.
    -- Anonymous

So I hope we can settle this euc-cn thing ASAP and get on with improving our docs.

It's good to see our ML getting noisy again, and welcome to new members as well as old friends to 'Go Wild'. I will do some review later on if time permits.

Thanks & Regards,

--
Alecs King |
From: Wenzhi L. <wen...@gm...> - 2006-03-10 10:29:43
|
On 10/03/06, Alecs King <al...@pe...> wrote:
> On Thu, Mar 09, 2006 at 11:40:22PM +0000, Wenzhi Liang wrote:
> > It turns out that (on my sytem at least), vim ignores the $LANG env var and
> > set the 'enc' option based on $LC_ALL. So if I have $LC_ALL=zh_CN.UTF-8 it
> > is OK. So there are two ways to do this:
>
> FYI, here on my box, Vim does honor the $LANG env var.

I tried on another box and gvim still ignores $LANG. Can you check your LC_ALL? We need to be careful about what we say here.

> > So right now, I'd lean towards an all utf solution.
>
> Ack.

OK. This is what we are going to do then.

lang2 |