From: Rich F. <da...@ae...> - 2006-03-08 08:46:04
On Tue, Mar 07, 2006 at 02:45:57AM +0900, MINAMI Hirokazu wrote:
> > I finally succeeded in getting mlterm to display Tibetan on my laptop,
> > but it's horribly misrendered. Here's a screenshot:
> >
> > http://brightrain.aerifal.cx/~dalias/tibetan_misrender.png
>
> You have to enable "variable column width" and Xft support
> to render some character as zero-width.
> Under the configuration, simple over-striking with vertical
> shift can be performed.
>
> What I saw was:
> http://mistfall.net/minami/tmp/tibetan_w3m.png
>
> which is not looks so bad to me.

Nope, it's horribly misrendered. All of the combining characters are
displayed a whole character-cell to the left of where they belong. I
don't see any characters that require a special form for combining,
but I expect those would be wrong too.

> > I also got around to reading the mlterm source, and it seems that
> > there's no effort made to position combining characters; they're just
> > blindly displayed relative to the same origin as the base character
> > with overstrike.
>
> For combined character in Arabic, mlterm perform combining by
> converting a group of non-combined UCS4 codes to a combined form.
> Since they cannot be rendered simply vertically stacking each glyphs.

AFAIK these are not combining characters but simply ligatures.

> The code and its conversion table is in ml_shape.c.
>
> For Tibetan, however, the same approach cannot be used because
> combined form of Tibetan glyphs are not registered in Unicode.

Combining and shaping are two completely different issues as far as
unicode/ucs semantics go.

> > Personally I'm of the opinion that, except for accents on latin
> > characters and other simple diacritics, automatic generation of
> > combined glyphs will never look great, so I think we need some sort of
> > solution for using precombined glyphs. If there were an artificial
> > non-unicode encoding, then mlterm could get to the glyphs that way
> > similarly to how it can use legacy-encoded Japanese, etc. fonts even
> > in unicode mode. And of course it would be a big bonus if precombined
> > bitmap glyphs could be used as well.
>
> The plan I had was:
>
> By assuming specific font, the conversion table can be hard-coded.
> i.e. when the glyph ordering of the font to be used is known,
> we can write conversion rules like following:
> 0x0f40, 0x0f72 => glyph ID xxxx
> 0x0f40, 0x0f7f => glyph ID yyyy

Yes... Maybe the most appropriate solution is to make a special
encoding for precombined Tibetan and include the mappings to that
encoding in mlterm so it could use fonts in that encoding. Actually...
there are already such encodings for nasty pre-Unicode hacks (basically
they were fake latin1 fonts with Tibetan characters in place of the
letters). For the sake of being able to use these legacy fonts too,
they could be used as the basis for such an encoding.

> With Xft, drawing for non USASCII/ISO8859-1 characters are processed in
> xwindow/x_window.c:x_window_xft_draw_string32().
> the function takes an array of UCS4 codes like 0x0f40, 0x0f72... .
>
> So it may be possible to hack the function to watch input
> and if the input sequence were Tibetan chars to be combined,
> intercept them and call XftDrawGlyphs() instead.

Again I don't think special-casing Tibetan is appropriate. The same
issues apply to all combining characters, including most South Asian
languages, mathematical combining marks, etc. and they can all be
handled in a unified way IMO.

> ... and we can replace the converter from UCS4 to glyph ID
> to be able to handle any font using libotf/freetype/ or something.

According to someone I've been working with on unrelated stuff,
freetype will automatically do all the combined glyph rendering for you
using the opentype info in the font file. Will Xft make use of that? Or
is some huge bloat like pango needed? BTW, this won't work for bitmap
fonts, and ideally terminals should be using bitmap fonts..

> Unfortunately, I don't have enough time to do this for a while.
> Patches are welcome, of course. :)

In the meantime I'm working on other components of my system overhaul,
but I'll look at this again when I get to it. Thanks for the replies.

Rich
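
For illustration, the conversion rules quoted above (a base character
plus a combining mark mapped to one precombined glyph) could be sketched
roughly as follows. This is only a sketch in Python, not mlterm code.
The glyph IDs 1001 and 1002 are made-up placeholders standing in for the
unspecified "xxxx" and "yyyy", and the names PRECOMBINED and
map_to_glyphs are hypothetical. The table is passed as a parameter, so
nothing in the lookup is specific to Tibetan.

```python
# Hypothetical table: (base, combining mark) -> precombined glyph ID.
# The glyph IDs are invented placeholders; real values depend on the
# glyph ordering of whichever font is assumed.
PRECOMBINED = {
    (0x0F40, 0x0F72): 1001,  # TIBETAN LETTER KA + VOWEL SIGN I
    (0x0F40, 0x0F7F): 1002,  # TIBETAN LETTER KA + SIGN RNAM BCAD
}


def map_to_glyphs(codepoints, table=PRECOMBINED):
    """Replace known (base, mark) pairs with ("glyph", id).

    Code points that do not start a known pair are passed through
    unchanged as ("char", codepoint).
    """
    out = []
    i = 0
    while i < len(codepoints):
        pair = tuple(codepoints[i:i + 2])
        if len(pair) == 2 and pair in table:
            # Known pair: emit one precombined glyph, consume both codes.
            out.append(("glyph", table[pair]))
            i += 2
        else:
            # No rule for this position: fall back to the plain character.
            out.append(("char", codepoints[i]))
            i += 1
    return out
```

A renderer built along these lines would draw the ("glyph", id) items by
glyph ID and everything else through its normal character path. Longer
sequences (a base with several marks) would need a longest-match lookup,
which this two-element sketch does not attempt.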