Re: [Mlterm-dev-en] mlterm, tibetan, and bitmap fonts?


Hi. 

I tried to display Tibetan text using the following procedure.
Please let me know if you used a different method or
are using another font.

1. install "Tibetan Machine Uni" from www.thdl.org and register the font
with the fontconfig library.

2. create ~/.mlterm/aafont and add:
ISO10646_UCS4_1=Tibetan Machine Uni-iso10646-1; 

3. run mlterm with anti-aliasing enabled:
mlterm -A 

4. display a Tibetan web page with w3m in mlterm:
w3m "http://www.thdl.org/xml/show.php?xml=test/tibetnew/thdlhp.xml&lng=tib" 

On Mon, 2005-12-12 at 08:16 -0500, Rich Felker wrote:
> first: my main motivation for wanting a full utf8 environment is
> Tibetan language support. i want to be able to read and write tibetan
> in email (mutt/emacs), irc/aim (irssi), etc. 
> 
> tibetan makes heavy use of combining characters -- up to 5 characters
> in one cell in ordinary colloquial words. i know that mlterm has
> support for combining characters (unlike the other useless "unicode"
> terminals i've found), but my understanding is that it overlays
> several glyphs to make the combined glyph. this is certainly usable,
> but only at rather large font sizes -- large enough that i wouldn't
> have very many columns on the screen. 
>
> what i'd like is to use a bitmap font with all possible stacks
> precombined and edited for readability. i've already played around
> with this quite a bit and made a fully readable tibetan font at 8x24
> pixels; however, the information i've found online (*nix/unicode faq,
> http://www.cl.cam.ac.uk/~mgk25/unicode.html) seems to indicate that
> the x font system has no decent way of supporting precombined bitmap
> glyphs when there is no precombined unicode character number. 
> 
> so my first question is: is there a way to do this already? or a sane
> way to implement it without introducing hacks that are way too evil?
> i'm willing to do really ugly hacks to support it if necessary (like
> ignoring the x font system and loading the bitmap glyphs directly in
> mlterm as a pixmap) but i'd be a lot more interested in doing it right
> if there's a sane way to do it instead, so that it could get included
> in the official mlterm source.

Judging from the result of
ftview 32 .fonts/TibetanMachineUniAlpha.ttf
the "Tibetan Machine Uni" font seems to contain
many pre-combined glyphs. 

So that's good news.
If these glyphs cover all combinations of Tibetan consonants,
you do not have to create another bitmap collection.
The FreeType library should be able to generate
anti-aliased images at whatever size you want.

The bad news is that these glyphs do not have Unicode code points, so
mlterm can't use them at present: neither Xft nor the "X core font"
system works as desired in this case.

# Xft can build a combined glyph by overstriking the component glyphs.
However, the result tends to look ugly and does not seem to be correct
when more than two combining consonants are involved.

So it can be done in a sane manner.
If we make mlterm use the FreeType library directly,
we can access the glyph substitution table and the pre-combined glyphs.

Alternatively, we may be able to use m17n-lib
(http://www.m17n.org/m17n-lib/)
as a rendering engine. Though I have not yet investigated it deeply,
the implementation on the mlterm side could be simpler in this case.

Supporting Tibetan editing on the command line or in editors like vim can be
very difficult. Because there is no standard for handling combined
characters on a terminal, you have to define "the right way" first.
For example:
 - When you press the Delete/Backspace key in a combined character,
   should we delete the entire character or only part of it?
 - When you press the right/left key, should the cursor move by one
   consonant or by one character?
 - What should we do if part of a combined character is overwritten?
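
To make the first two questions concrete, here is a minimal Python sketch
(not mlterm code). It assumes a "combined character" is a base character
plus any following code points whose Unicode general category is Mn, Mc or
Me, which covers Tibetan subjoined consonants and vowel signs. The function
names are made up for illustration, and the sample word is only an example
stack.

```python
import unicodedata

def is_combining(ch):
    # Assumption: marks (Mn, Mc, Me) attach to the preceding base character.
    # Tibetan subjoined consonants and vowel signs are category Mn.
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")

def clusters(text):
    """Split text into strings of one base plus its following marks."""
    out = []
    for ch in text:
        if out and is_combining(ch):
            out[-1] += ch
        else:
            out.append(ch)
    return out

def backspace_whole(text):
    """Policy A: Backspace removes the entire last combined character."""
    return "".join(clusters(text)[:-1])

def backspace_part(text):
    """Policy B: Backspace removes only the last code point."""
    return text[:-1]

# BA, then SA + subjoined GA + subjoined RA + vowel sign U
word = "\u0f56\u0f66\u0f92\u0fb2\u0f74"

print(len(word), "code points,", len(clusters(word)), "combined characters")
print([hex(ord(c)) for c in backspace_whole(word)])
print([hex(ord(c)) for c in backspace_part(word)])
```

Under policy A the cursor would also move by whole entries of clusters();
under policy B it would move by code point and could stop inside a stack.
Which of the two is "the right way" is exactly the open question.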

So we should think about this more deeply after we finish implementing
a proper glyph rendering scheme.

> my second question is off-topic, but maybe someone here can answer: is
> screen's utf8 support usable? (including combining characters?) a
> quick rtfs seemed to indicate so but i don't have an environment for
> testing it at present. i'm quite addicted to screen, so if its utf8
> support is broken i need to do something about that first...

As far as I understand, screen can pass UTF-8 through but does not take
combined characters into account, so there should be some breakage.
I'm not using screen these days and am not sure of its current state, though.
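
As a rough illustration of the kind of breakage meant here (this is not a
description of what screen actually does), the Python sketch below compares
two ways of counting columns for a single Tibetan stack. It assumes that
marks of category Mn or Me occupy no cell of their own and ignores
double-width characters; cell_count is a hypothetical helper.

```python
import unicodedata

def cell_count(text):
    # Assumption: non-spacing (Mn) and enclosing (Me) marks are drawn in
    # the cell of the preceding base character and add no width.
    return sum(unicodedata.category(ch) not in ("Mn", "Me") for ch in text)

# SA + subjoined GA + subjoined RA + vowel sign U: one stack, one cell
stack = "\u0f66\u0f92\u0fb2\u0f74"

naive_columns = len(stack)        # one column per code point
actual_columns = cell_count(stack)

print("naive:", naive_columns, "cells used:", actual_columns)
```

A program that tracks the cursor with the naive count would place it
several columns to the right of where the terminal really is after
printing such a stack.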

regards,
 --
minami <mi...@mi...>