[mined-editor] Re: uterm script
From: Mike F. <mf...@su...> - 2005-09-28 10:10:28
Thomas Wolff <mi...@to...> wrote:
>> For a long time already, xterm works just fine in UTF-8 without
>> setting any special options.
> Not quite, as still today many users are not well-advised how to
> easily configure UTF-8.
On SuSE Linux, UTF-8 has been the default since SuSE Linux 9.1, i.e. for
more than a year now. RedHat switched to UTF-8 as the default about two
years earlier.
That is, on recent SuSE or RedHat/Fedora systems, nothing needs to be
configured; just calling "xterm" is enough.
>> Same with LESSCHARSET. LESSCHARSET shouldn't be set either, less also
>> detects this automatically from the environment.
> See above; for the benefit of supporting older systems, I think it's
> a good idea to maintain such compatibility settings for a couple of years
> still, especially as they do no harm.
In my experience they do harm. I often had reports from users who had
LESSCHARSET, LANG, or some LC_* variables changed by some overly
helpful script and ran into problems because of this.
For example, some script sets LESSCHARSET=utf-8. The user doesn't
notice, and at first there is no problem: everything still works
because the default was UTF-8 anyway. Now the user wants to look
at one of his old ISO-8859-15 files and for that purpose starts
a new xterm or uses luit:
LANG=de_DE@euro xterm
or
LANG=de_DE@euro luit
But somehow it doesn't work because LESSCHARSET=utf-8 is still in the
environment. The user is very confused because he didn't think about
LESSCHARSET at all and reports a bug.
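One way to sidestep the stale-variable problem described above is to start the new terminal with LESSCHARSET removed from its environment. The following is only a minimal sketch in POSIX shell; the function name run_without_lesscharset is made up for illustration and is not part of xterm, luit, less, or mined.

```shell
# Hypothetical helper (name invented for this example): run a command with
# LESSCHARSET removed from its environment, so that less can fall back to
# detecting the character set from the locale.
run_without_lesscharset() {
    (
        # The subshell keeps the caller's own environment untouched.
        unset LESSCHARSET
        "$@"
    )
}

# Usage (not executed here):
#   run_without_lesscharset env LANG=de_DE@euro xterm
```

The subshell means the variable is only dropped for the started command; whatever script exported LESSCHARSET in the first place still needs to be fixed.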
Weird problems can also occur when the settings of LANG and the LC_*
variables are inconsistent. For example, something like
export LANG=de_DE@euro
export LC_CTYPE=de_DE.UTF-8
(and all other LC_* variables unset) is not allowed. I often had
reports from users who ran into problems because of this. And
often they didn't set these illegal combinations manually on their
own; rather, the combinations were caused by such scripts trying to be
helpful by fiddling with the LC_* variables but unfortunately not
getting it right.
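A script that insists on touching these variables could at least check for the kind of mix shown above before exporting anything. The sketch below is an illustration only: the function names are invented, the usual precedence LC_ALL, then LC_CTYPE, then LANG is assumed, and a locale is treated as UTF-8 purely by looking at its name, which is a simplification.

```shell
# Print the locale name that governs LC_CTYPE.
# Arguments: LC_ALL value, LC_CTYPE value, LANG value (empty counts as unset).
effective_ctype() {
    if [ -n "$1" ]; then printf '%s\n' "$1"
    elif [ -n "$2" ]; then printf '%s\n' "$2"
    elif [ -n "$3" ]; then printf '%s\n' "$3"
    else printf '%s\n' "POSIX"   # assumed fallback when nothing is set
    fi
}

# Succeed if the locale name carries a UTF-8 codeset suffix
# (name-based guess only; it does not ask the C library).
is_utf8_name() {
    case $1 in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if LANG and LC_CTYPE are both set but disagree about UTF-8.
# Arguments: LANG value, LC_CTYPE value.
mixed_codeset() {
    [ -n "$1" ] && [ -n "$2" ] || return 1
    lang_utf8=no
    ctype_utf8=no
    if is_utf8_name "$1"; then lang_utf8=yes; fi
    if is_utf8_name "$2"; then ctype_utf8=yes; fi
    [ "$lang_utf8" != "$ctype_utf8" ]
}

# Usage (not executed here):
#   mixed_codeset "$LANG" "$LC_CTYPE" && echo "warning: LANG and LC_CTYPE disagree" >&2
```

The functions take the values as arguments instead of reading the environment, so they can be tried out without changing the locale of the running shell.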
On modern systems, it is more difficult to get a non-UTF-8
terminal. UTF-8 is the default anyway; special setup is needed only
when the user temporarily does *not* want to use UTF-8.
> Also, in auto-detection mode, xterm obviously involves "luit" for
> internal mediation of character encoding, and this even if it
> then runs in UTF-8 mode!
No, xterm never calls luit in UTF-8 locales. See the following
extract from the xterm man page:
man xterm> locale (class Locale)
man xterm> Specifies how to use luit, an encoding converter between UTF-8
man xterm> and locale encodings. The resource value (ignoring case) may
man xterm> be:
man xterm>
man xterm> true
man xterm> xterm will use the encoding specified by the users'
man xterm> LC_CTYPE locale (i.e., LC_ALL, LC_CTYPE, or LANG variables)
man xterm> as far as possible. This is realized by always enabling
man xterm> UTF-8 mode and invoking luit in non-UTF-8 locales.
man xterm>
man xterm> medium
man xterm> xterm will follow users' LC_CTYPE locale only for UTF-8,
man xterm> east Asian, and Thai locales, where the encodings were not
man xterm> supported by conventional 8bit mode with changing fonts.
man xterm> For other locales, xterm will use conventional 8bit mode.
man xterm>
man xterm> checkfont
man xterm> If mini-luit is compiled-in, xterm will check if a Unicode
man xterm> font has been specified. If so, it checks if the character
man xterm> encoding for the current locale is POSIX, Latin-1 or
man xterm> Latin-9, uses the appropriate mapping to support those with
man xterm> the Unicode font. For other encodings, xterm assumes that
man xterm> UTF-8 encoding is required.
man xterm>
man xterm> false
man xterm> xterm will use conventional 8bit mode or UTF-8 mode
man xterm> according to utf8 resource or -u8 option.
man xterm>
man xterm> Any other value, e.g., ``UTF-8'' or ``ISO8859-2'', is assumed
man xterm> to be an encoding name; luit will be invoked to support the
man xterm> encoding. The actual list of supported encodings depends on
man xterm> luit. The default is ``medium''.
man xterm>
man xterm> Regardless of your locale and encoding, you need an ISO-10646-1
man xterm> font to display the result. Your configuration may not include
man xterm> this font, or locale-support by xterm may not be needed. At
man xterm> startup, xterm uses a mechanism equivalent to the
man xterm> load-vt-fonts(utf8Fonts, Utf8Fonts) action to load font name
man xterm> subresources of the VT100 widget. That is, resource patterns such
man xterm> as "*vt100.utf8Fonts.font" will be loaded, and (if this
man xterm> resource is enabled), override the normal fonts. If no
man xterm> subresources are found, the normal fonts such as "*vt100.font",
man xterm> etc., are used. The resource files distributed with xterm use
man xterm> ISO-10646-1 fonts, but do not rely on them unless you are using
man xterm> the locale mechanism.
> This can unfortunately hang xterm (if combined with option -e) as
> one mined user reported.
Yes indeed, luit sometimes hangs, see
http://bugzilla.novell.com/show_bug.cgi?id=117193
> I found that it is necessary to add the X resource
> UXTerm*locale:false to the xterm invocation to avoid this problem.
No, even with "true", "medium", and "checkfont", xterm will not run
luit when in UTF-8 mode. It may run luit when *not* in UTF-8 mode,
depending on the setting of the "XTerm*locale" X resource.
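To summarize the rules from the man page extract and the statement above as I read them, here is a rough decision sketch. It is not how xterm is implemented: the function name luit_expected is invented, UTF-8 locales are recognized only by their name, and the "medium" and "checkfont" cases in non-UTF-8 locales are reported as "maybe" because they depend on the locale and on how xterm was built.

```shell
# Hypothetical summary of the "locale" resource rules quoted above.
# Arguments: value of the locale resource, name of the effective LC_CTYPE locale.
# Prints "yes", "no", or "maybe".
luit_expected() {
    # The resource value is compared ignoring case.
    mode=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    locale_name=$2
    case $mode in
        false)
            # No locale handling; utf8 resource or -u8 decides the mode.
            echo no
            return
            ;;
        true|medium|checkfont)
            ;;
        *)
            # Any other value is taken as an encoding name handled by luit.
            echo yes
            return
            ;;
    esac
    case $locale_name in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*)
            # UTF-8 locale: xterm runs in UTF-8 mode without luit.
            echo no
            ;;
        *)
            if [ "$mode" = true ]; then
                echo yes
            else
                # medium/checkfont: depends on locale and build, not modeled.
                echo maybe
            fi
            ;;
    esac
}

# Usage: luit_expected medium de_DE.UTF-8
```

For the default "medium" setting in a UTF-8 locale this prints "no", which is the case discussed in this thread.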
> Considering the arguments so far, I think it is clearly useful to have
> a "uterm" script just like the "uxterm" script that was later introduced
> into the xterm distribution. Having the latter, "uterm" didn't have
> additional value, however.
uxterm also caused more problems than it solved, in my opinion.
> * The best Unicode terminal font in my opinion is the 10x20 font,
> which is much more legible than the spindly 9x18 font, and the
> smaller fonts are not suitable at all for a number of scripts.
> Unfortunately, the Unicode X fonts distribution does not include
> matching 20x20 CJK fonts, and xterm cannot handle single-width
> and double-width fonts that do not exactly match in size (like rxvt
> can do!).
mlterm can also pad fonts which don't match exactly.
> For that reason, I am providing a script that creates 20x20 CJK
> fonts from the 18x18 X fonts by padding all the glyphs.
> The uterm script checks if that font is installed and in this case
> invokes xterm with it.
Maybe I will add that font to the SuSE xterm package. But that is
only a temporary workaround; this should really be fixed in xterm,
see
http://bugzilla.novell.com/show_bug.cgi?id=49305
> I hope some people will find it useful to have this script available
> in /usr/bin.
It might be useful on legacy systems, but I should probably just omit
it from the mined .rpm packages for SuSE Linux >= 9.1 because it is not
really helpful there.
--
Mike FABIAN <mf...@su...> http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.