Re: [Mlterm-dev-en] mlterm, tibetan, and bitmap fonts?

On Wed, Mar 08, 2006 at 03:48:47AM -0500, Rich Felker wrote:
> On Tue, Mar 07, 2006 at 02:45:57AM +0900, MINAMI Hirokazu wrote:
> > The plan I had was:
> > 
> > By assuming specific font, the conversion table can be hard-coded.
> > i.e. when the glyph ordering of the font to be used is known,
> > we can write conversion rules like following:
> > 0x0f40, 0x0f72 => glyph ID xxxx
> > 0x0f40, 0x0f7f => glyph ID yyyy
> 
> Yes... Maybe the most appropriate solution is to make a special
> encoding for precombined Tibetan and include the mappings to that
> encoding in mlterm so it could use fonts in that encoding. Actually...
> there are already such encodings for nasty pre-Unicode hacks
> (basically they were fake latin1 fonts with Tibetan characters in
> place of the letters). For the sake of being able to use these legacy
> fonts too, they could be used as the basis for such an encoding.

After reviewing the situation, it seems the TCRC font (the most
popular one) uses overstriking hacks wherever possible to fit all
glyphs in the single-byte range. This is obviously not acceptable as a
glyph encoding because the ability to use overstrike is very
font-specific and generally incorrect. There's another somewhat
established encoding, from the "Tibetan Machine" font, which could be
used instead. My understanding is that it contains all possible
stacks and does not use overstriking.
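
For illustration, here is a minimal sketch of the table-driven idea
quoted above: sequences of Unicode code points are matched greedily
(longest first) against a hard-coded table and replaced by a single
precombined glyph. The glyph IDs (1001 and so on) and the function name
to_glyphs are placeholders invented for this example; they are not
taken from TCRC, Tibetan Machine, or mlterm.

```python
# Hypothetical table: code point sequence -> precombined glyph ID.
# The glyph IDs are made-up placeholders, not values from any real font.
PRECOMPOSED = {
    (0x0F40, 0x0F72): 1001,          # KA + vowel sign I
    (0x0F40, 0x0F7F): 1002,          # KA + sign rnam bcad
    (0x0F40, 0x0FB1, 0x0F74): 1003,  # KA + subjoined YA + vowel sign U
}
MAX_LEN = max(len(seq) for seq in PRECOMPOSED)


def to_glyphs(codepoints):
    """Greedy longest-match conversion.

    Returns a list of ("glyph", id) for matched sequences and
    ("char", codepoint) for anything that has no table entry.
    """
    out = []
    i = 0
    while i < len(codepoints):
        # Try the longest candidate first, down to length 2.
        for n in range(min(MAX_LEN, len(codepoints) - i), 1, -1):
            key = tuple(codepoints[i:i + n])
            if key in PRECOMPOSED:
                out.append(("glyph", PRECOMPOSED[key]))
                i += n
                break
        else:
            # No sequence starts here; pass the character through.
            out.append(("char", codepoints[i]))
            i += 1
    return out
```

The table only covers the stacks someone has listed in advance, which
is the limitation of any fixed glyph encoding.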

> > With Xft, drawing for non USASCII/ISO8859-1 characters are processed in 
> > xwindow/x_window.c:x_window_xft_draw_string32().
> > the function takes an array of UCS4 codes like 0x0f40, 0x0f72... .
> > 
> > So it may be possible to hack the function to watch input
> > and if the input sequence were Tibetan chars to be combined,
> > intercept them and call XftDrawGlyphs() instead.
> 
> Again I don't think special-casing Tibetan is appropriate. The same
> issues apply to all combining characters, including most South Asian
> languages, mathematical combining marks, etc. and they can all be
> handled in a unified way IMO.

Hmm, it seems like I said two rather conflicting things here. What I
meant to say is that, while supporting script-specific glyph encodings
separate from character encodings is a possible solution (and probably
results in maximal font quality), what we really need is a mechanism
for all fonts (including bitmap fonts) to carry with them anchor point
information for combining, so that the terminal emulator can perform
arbitrary combining/stacking rather than just a fixed set of glyphs.
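
To make the anchor point idea concrete, here is a rough sketch of how a
terminal emulator might place marks if each glyph carried such data.
Everything here is an assumption for illustration: the Glyph fields,
the "above"/"below" anchor classes, and the pixel values are invented
and do not describe any existing font format. Coordinates are in
pixels with y growing downward.

```python
from dataclasses import dataclass, field


@dataclass
class Glyph:
    name: str
    # Point on this glyph that is pinned to the anchor of the glyph it
    # combines with (hypothetical per-glyph metadata).
    attach: tuple = (0, 0)
    # Anchors this glyph offers to later marks, keyed by class.
    anchors: dict = field(default_factory=dict)


def place(base, marks):
    """Return [(name, x, y)] offsets relative to the base glyph origin.

    marks is a list of (anchor_class, Glyph) in logical order.
    """
    placed = [(base.name, 0, 0)]
    current = dict(base.anchors)  # live anchor per class, base coordinates
    for cls, mark in marks:
        ax, ay = current[cls]
        mx, my = mark.attach
        x, y = ax - mx, ay - my   # align the mark's attach point to the anchor
        placed.append((mark.name, x, y))
        if cls in mark.anchors:
            # The mark offers its own anchor, so the next mark of this
            # class stacks onto it rather than onto the base.
            nx, ny = mark.anchors[cls]
            current[cls] = (x + nx, y + ny)
    return placed


# Made-up metrics for a small bitmap cell.
ka = Glyph("ka", anchors={"above": (4, 0), "below": (4, 12)})
i_sign = Glyph("i", attach=(3, 5))
sub_ya = Glyph("ya", attach=(4, 0), anchors={"below": (4, 6)})
u_sign = Glyph("u", attach=(3, 0))
```

With data like this, place(ka, [("below", sub_ya), ("below", u_sign)])
stacks the second mark under the first without any table of
precombined glyphs.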

Rich