Re: Display UTF-8 / non-ASCII characters in trn (2-line patch)
From: John M. <lis...@b7...> - 2015-11-06 02:48:55
* Adam H. Kerman <ah...@ch...> [151104 17:59]:
> This has nothing to do with trn but your terminal's locale settings
> and things like the LANG environmental variable. trn passes characters
> along to the display to be rendered by a display process. Generally, the
> display won't have any idea what MIME headers are present (assuming MIME
> correctly describes the character set used by the author which isn't
> always the case), so you may need to change this from article to article
> so the characters are translated as intended.

That would be a hassle to have to fiddle with the LANG variable from
one article to the next. With the patch I linked to everything comes
through fine. Not sure what I'd have to set $LANG to to have utf-8,
ISO-8859 etc. text show up properly... I have LANG=en_US.UTF-8 and
without that patch non-ASCII text is not rendered properly.

Reading from the post linked to in my previous email it seems the
problem here is that trn strips all characters down to 7-bit:

    ... when trn checks for control characters in an article it
    assumes everything might be 7-bit with parity, so it strips off
    the top bit before checking! As a result it turns every extended
    byte between 0x80 and 0x9F into a control character which it then
    escapes! Hence the various garbage characters in any utf-8 or
    ISO-8859 article text.

    https://groups.google.com/d/msg/comp.sys.raspberry-pi/7Z37Hdrm0DM/6aqD-reXFzAJ

> I don't always bother to change the locale when reading, but when
> composing in followup, I need to see the non-ASCII characters.
> Sometimes I mis-set the translation to make non-plain-text
> characters visible.

Composing would happen in EDITOR, yes? In my case that's Vim running
in urxvt, where I've never had problems with the display of non-ASCII
characters. Never had to fiddle with $LANG or any other environmental
variable, just LANG=en_US.UTF-8 and all my console programs -- Mutt,
Vim, ELinks... and now trn with that patch -- work fine.
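
To make the 7-bit stripping described above concrete, here is a rough sketch in Python. It is not trn's code (trn is written in C, and I have not copied anything from its source); the function names `is_control`, `escape_7bit` and `passthrough_8bit` are made up for illustration, and the caret-style escaping is an assumption based on the kind of output seen on screen.

```python
def is_control(byte: int) -> bool:
    """True for C0 control codes (0x00-0x1F) and DEL (0x7F)."""
    return byte < 0x20 or byte == 0x7F


def escape_7bit(data: bytes) -> str:
    """Hypothetical model of the reported behaviour: clear the top bit
    before the control-character test, then escape in caret notation."""
    out = []
    for b in data:
        low = b & 0x7F  # strip the top bit, as if it were a parity bit
        if is_control(low):
            out.append("^" + chr(low ^ 0x40))  # e.g. 0x19 -> "^Y"
        else:
            out.append(chr(low))
    return "".join(out)


def passthrough_8bit(data: bytes) -> bytes:
    """Hypothetical 8-bit-clean variant: only bytes below 0x80 are
    tested for control codes; bytes >= 0x80 pass through untouched."""
    out = bytearray()
    for b in data:
        if b < 0x80 and is_control(b):
            out += b"^" + bytes([b ^ 0x40])
        else:
            out.append(b)
    return bytes(out)


# U+2019 (right single quote) is E2 80 99 in UTF-8.
text = "it\u2019s".encode("utf-8")
print(escape_7bit(text))                       # itb^@^Ys
print(passthrough_8bit(text).decode("utf-8"))  # the original string, intact
```

In this model the three UTF-8 bytes of a curly apostrophe come out as `b^@^Y`, which looks a lot like the jumbles mentioned in this thread, while the 8-bit-clean variant hands the bytes to the terminal unchanged so that a UTF-8 locale can render them.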
> I try to avoid quoting nonbreaking spaces, for instance; I wish
> people would stop using such characters inappropriately. If at all
> possible, I turn the quotes into ASCII if the non-ASCII characters
> aren't needed, like substituting left open double quote with a good
> olde ASCII ambiguous double quote.

Agreed -- I use ASCII straight quotes when composing email & usenet
posts. It's just that I don't want to see those @^Y jumbles when
someone else uses curly quotes. But more importantly, when reading
non-English texts I'd like to be able to see all non-ASCII characters
properly rendered. The aforementioned patch is the only way I've found
to accomplish this with trn.

John

-- 
John Magolske
http://b79.net/contact