Re: [Mlterm-dev-en] mlterm, tibetan, and bitmap fonts?

On Wed, 2006-03-08 at 03:48 -0500, Rich Felker wrote:
> On Tue, Mar 07, 2006 at 02:45:57AM +0900, MINAMI Hirokazu wrote:
> > > I finally succeeded in getting mlterm to display Tibetan on my laptop,
> > > but it's horribly misrendered. Here's a screenshot:
> > > 
> > > http://brightrain.aerifal.cx/~dalias/tibetan_misrender.png
> > 
> > You have to enable "variable column width" and Xft support
> > to render some characters as zero-width.
> > Under that configuration, simple overstriking with a vertical
> > shift can be performed.
> > 
> > What I saw was:
> > http://mistfall.net/minami/tmp/tibetan_w3m.png
> > 
> > which does not look so bad to me.
> 
> Nope, it's horribly misrendered. All of the combining characters are
> displayed a whole character-cell to the left of where they belong. I
> don't see any characters that require a special form for combining,
> but I expect those would be wrong too.

Hmm, it shows how Xft renders Tibetan represented in UCS4.
So
 - All of the combining characters are displayed a whole
   character-cell to the left of where they belong.
may mean there's a bug in either Xft, FreeType, or the Tibetan font,
such that the glyph position is not calculated correctly.

Since some of the "special forms for combining" may not exist in Unicode,
it's not surprising if they cannot be displayed using UCS4 alone.

Anyway, if you believe the result is far from acceptable,
we have to consider using more advanced features of the font,
instead of just hacking Xft.

> > > I also got around to reading the mlterm source, and it seems that
> > > there's no effort made to position combining characters; they're just
> > > blindly displayed relative to the same origin as the base character
> > > with overstrike.
> > 
> > For combined characters in Arabic, mlterm performs combining by
> > converting a group of non-combined UCS4 codes to a combined form,
> > since they cannot be rendered by simply stacking the glyphs vertically.
> 
> AFAIK these are not combining characters but simply ligatures.
> 
> > The code and its conversion table are in ml_shape.c.
> > 
> > For Tibetan, however, the same approach cannot be used because
> > combined forms of Tibetan glyphs are not registered in Unicode.
> 
> Combining and shaping are two completely different issues as far as
> unicode/ucs semantics go.

I'm not sure why we should distinguish ligatures/combining/shaping
here. We just have to convert a special sequence of UCS4 codes into
the corresponding glyph in all cases.

Do you think combining characters need more complex processing?

However, Unicode's semantics are not usable for combining Tibetan.
Because Unicode does not contain pre-combined glyphs for Tibetan,
we can't write a rule to map an (un-combined) Tibetan UCS4 sequence
into an already-combined one.
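
To make the limitation concrete, here is a rough sketch of the kind of
table-driven conversion I mean. This is NOT the actual code in
ml_shape.c; the names shape_rule_t, lookup_combined and shape_ucs4 are
made up for illustration, and the table holds only two Arabic lam-alef
ligatures that do have code points in Unicode's presentation forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: a pair of UCS4 codes and the single code point
 * that Unicode assigns to their combined form. */
typedef struct {
    uint32_t seq[2];
    uint32_t combined;
} shape_rule_t;

/* Illustrative entries only, not the real mlterm table. */
static const shape_rule_t rules[] = {
    { { 0x0644, 0x0627 }, 0xFEFB },  /* LAM + ALEF, isolated ligature */
    { { 0x0644, 0x0622 }, 0xFEF5 },  /* LAM + ALEF WITH MADDA ABOVE */
};

/* Returns the combined code point for (first, second), or 0 when no
 * rule exists, e.g. because Unicode has no code point for the result. */
static uint32_t lookup_combined(uint32_t first, uint32_t second)
{
    size_t i;
    for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        if (rules[i].seq[0] == first && rules[i].seq[1] == second)
            return rules[i].combined;
    }
    return 0;
}

/* Copies `in` (len codes) to `out`, replacing each pair that has a rule
 * with its combined code. Returns the number of codes written.
 * `out` must have room for at least len codes. */
static size_t shape_ucs4(const uint32_t *in, size_t len, uint32_t *out)
{
    size_t i = 0, n = 0;
    assert(in != NULL && out != NULL);
    while (i < len) {
        uint32_t c = (i + 1 < len) ? lookup_combined(in[i], in[i + 1]) : 0;
        if (c != 0) {
            out[n++] = c;      /* pair collapsed into one code */
            i += 2;
        } else {
            out[n++] = in[i++]; /* no rule: pass through unchanged */
        }
    }
    return n;
}
```

Feeding it U+0644 U+0627 yields the single code U+FEFB, but a Tibetan
stack such as U+0F66 U+0F90 (SA + SUBJOINED KA) comes back unchanged:
there is no code point to put in the right-hand column of the table.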

# And the same is true for some combined forms of Arabic glyphs.
# At present, mlterm cannot use such glyphs unless they are
# accessible through a Unicode code point.