From: Rich F. <da...@ae...> - 2006-03-08 23:01:38
On Wed, Mar 08, 2006 at 03:48:47AM -0500, Rich Felker wrote:
> On Tue, Mar 07, 2006 at 02:45:57AM +0900, MINAMI Hirokazu wrote:
> > The plan I had was:
> >
> > By assuming specific font, the conversion table can be hard-coded.
> > i.e. when the glyph ordering of the font to be used is known,
> > we can write conversion rules like following:
> > 0x0f40, 0x0f72 => glyph ID xxxx
> > 0x0f40, 0x0f7f => glyph ID yyyy
>
> Yes... Maybe the most appropriate solution is to make a special
> encoding for precombined Tibetan and include the mappings to that
> encoding in mlterm so it could use fonts in that encoding. Actually...
> there are already such encodings for nasty pre-Unicode hacks
> (basically they were fake latin1 fonts with Tibetan characters in
> place of the letters). For the sake of being able to use these legacy
> fonts too, they could be used as the basis for such an encoding.

After reviewing the situation, it seems the TCRC font (the most
popular one) uses overstriking hacks wherever possible to fit all
glyphs in the single-byte range. This is obviously not acceptable as a
glyph encoding, because whether overstriking works is very
font-specific and the result is generally incorrect. There's another
somewhat established encoding, from the "Tibetan Machine" font, which
could be used instead. My understanding is that it contains all
possible stacks and does not use overstriking.

> > With Xft, drawing for non USASCII/ISO8859-1 characters are processed in
> > xwindow/x_window.c:x_window_xft_draw_string32().
> > the function takes an array of UCS4 codes like 0x0f40, 0x0f72... .
> >
> > So it may be possible to hack the function to watch input
> > and if the input sequence were Tibetan chars to be combined,
> > intercept them and call XftDrawGlyphs() instead.
>
> Again I don't think special-casing Tibetan is appropriate. The same
> issues apply to all combining characters, including most South Asian
> languages, mathematical combining marks, etc. and they can all be
> handled in a unified way IMO.

Hmm, it seems like I said two rather conflicting things here. What I
meant to say is that, while supporting script-specific glyph encodings
separate from character encodings is a possible solution (and probably
results in maximal font quality), what we really need is a mechanism
for all fonts (including bitmap fonts) to carry anchor point
information for combining, so that the terminal emulator can perform
arbitrary combining/stacking rather than being limited to a fixed set
of precombined glyphs.

Rich
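
For illustration only, here is a minimal sketch of what anchor-based stacking could look like. It is not taken from mlterm, Xft, or any existing font format: the `Glyph` record, the anchor class names ("above", "below", "attach"), the glyph names, and the pixel coordinates are all hypothetical. The idea is that each glyph carries named anchor points, and a combining glyph is positioned by aligning its "attach" anchor with the matching anchor on whatever it stacks onto.

```python
from dataclasses import dataclass, field


@dataclass
class Glyph:
    """Hypothetical per-glyph record: a name plus named anchor points.

    Anchors are (x, y) pixel offsets inside the glyph cell, y growing downward.
    """
    name: str
    anchors: dict = field(default_factory=dict)


def place_cluster(base, marks):
    """Return [(glyph name, x, y)] for a base glyph and its combining marks.

    Each entry in `marks` is a (glyph, anchor_class) pair. The mark's own
    "attach" anchor is aligned with the current anchor of that class.
    """
    placed = [(base.name, 0, 0)]
    # Current attachment point per anchor class, in cluster coordinates.
    current = dict(base.anchors)
    for mark, cls in marks:
        ax, ay = current[cls]
        mx, my = mark.anchors["attach"]
        ox, oy = ax - mx, ay - my
        placed.append((mark.name, ox, oy))
        # If the mark exposes its own anchor of this class, later marks
        # of the same class stack onto the mark instead of the base.
        if cls in mark.anchors:
            nx, ny = mark.anchors[cls]
            current[cls] = (ox + nx, oy + ny)
    return placed


# Made-up metrics for an 8x12 bitmap cell.
ka = Glyph("ka", {"above": (4, 0), "below": (4, 12)})          # U+0F40
sub_ya = Glyph("sub_ya", {"attach": (4, 0), "below": (4, 6)})  # U+0FB1
vowel_u = Glyph("vowel_u", {"attach": (4, 0)})                 # U+0F74
vowel_i = Glyph("vowel_i", {"attach": (4, 3)})                 # U+0F72

layout = place_cluster(ka, [(sub_ya, "below"), (vowel_u, "below"), (vowel_i, "above")])
for name, x, y in layout:
    print(f"{name:8s} at ({x}, {y})")
```

With these made-up metrics, the subjoined letter lands directly under the base, the lower vowel sign lands under the subjoined letter, and the upper vowel sign sits above the base. Nothing here is Tibetan-specific, so the same routine would serve any script whose fonts supply the anchors.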