From: <mi...@mi...> - 2005-12-18 16:30:57
Hi.

I tried to display Tibetan text using the following procedure. Please let me know if you used a different method or are using another font.

1. Install "Tibetan Machine Uni" from www.thdl.org and register the font with the fontconfig library.
2. Create ~/.mlterm/aafont and add:
     ISO10646_UCS4_1=Tibetan Machine Uni-iso10646-1;
3. Run mlterm with anti-aliasing enabled:
     mlterm -A
4. Display a Tibetan web page with w3m in mlterm:
     w3m "http://www.thdl.org/xml/show.php?xml=test/tibetnew/thdlhp.xml&lng=tib"

On Mon, 2005-12-12 at 08:16 -0500, Rich Felker wrote:
> first: my main motivation for wanting a full utf8 environment is
> Tibetan language support. i want to be able to read and write tibetan
> in email (mutt/emacs), irc/aim (irssi), etc.
>
> tibetan makes heavy use of combining characters -- up to 5 characters
> in one cell in ordinary colloquial words. i know that mlterm has
> support for combining characters (unlike the other useless "unicode"
> terminals i've found), but my understanding is that it overlays
> several glyphs to make the combined glyph. this is certainly usable,
> but only at rather large font sizes -- large enough that i wouldn't
> have very many columns on the screen.
>
> what i'd like is to use a bitmap font with all possible stacks
> precombined and edited for readability. i've already played around
> with this quite a bit and made a fully readable tibetan font at 8x24
> pixels; however, the information i've found online (*nix/unicode faq,
> http://www.cl.cam.ac.uk/~mgk25/unicode.html) seems to indicate that
> the x font system has no decent way of supporting precombined bitmap
> glyphs when there is no precombined unicode character number.
>
> so my first question is: is there a way to do this already? or a sane
> way to implement it without introducing hacks that are way too evil?
> i'm willing to do really ugly hacks to support it if necessary (like
> ignoring the x font system and loading the bitmap glyphs directly in
> mlterm as a pixmap) but i'd be a lot more interested in doing it right
> if there's a sane way to do it instead, so that it could get included
> in the official mlterm source.

Judging from the output of

    ftview 32 .fonts/TibetanMachineUniAlpha.ttf

the "Tibetan Machine Uni" font seems to contain many pre-combined glyphs. That is the good news. If these glyphs cover all combinations of Tibetan consonants, you do not have to create another bitmap collection. The freetype library should be able to generate anti-aliased images at whatever size you want.

The bad news is that these glyphs do not have Unicode code points, and mlterm can't use them at present, since neither Xft nor the "X core font" system works as desired in this case.

# Xft can create glyphs by over-striking the corresponding glyphs. However, the result tends to look ugly and does not seem to be correct when more than two combining consonants are involved.

Even so, it can be done in a sane manner. If we can make mlterm use the freetype library directly, we can access the glyph substitution table and the pre-combined glyphs. Alternatively, we may be able to use m17n-lib (http://www.m17n.org/m17n-lib/) as a rendering engine. Though I have not yet investigated it deeply, the implementation on the mlterm side could be simpler in this case.

Supporting Tibetan editing on the command line or in editors like vim could be very difficult. Because there is no standard for handling combined characters on a terminal, you have to define "the right way" first. For example:

- When you press the Delete/Backspace key in a combined character, should we delete the entire character or only part of it?
- When you press the right/left key, should the cursor move by one consonant or by one character?
- What should we do if part of a combined character is overwritten?

So we should consider this more deeply after implementing a proper glyph rendering scheme.
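
To make the Backspace question concrete, here is a minimal Python sketch of the two possible behaviours. It is not mlterm code and not a proposal for the final rule. The function names (split_stacks, backspace_code_point, backspace_stack) are made up for illustration, and it assumes that a "combined character" is a base character plus the Unicode combining marks (general categories Mn/Mc/Me) that follow it, which is only an approximation of a Tibetan stack.

```python
import unicodedata


def split_stacks(text):
    """Group each base character with the combining marks that follow it.

    Assumption: anything whose general category starts with "M"
    (Mn, Mc, Me) belongs to the preceding base character. Tibetan
    subjoined consonants and vowel signs fall into these categories.
    """
    stacks = []
    for ch in text:
        if stacks and unicodedata.category(ch).startswith("M"):
            stacks[-1] += ch      # attach the mark to the current stack
        else:
            stacks.append(ch)     # start a new stack with a base character
    return stacks


def backspace_code_point(text):
    """Behaviour A: Backspace removes only the last code point."""
    return text[:-1]


def backspace_stack(text):
    """Behaviour B: Backspace removes the whole last stack."""
    return "".join(split_stacks(text)[:-1])


# SA + subjoined GA + subjoined RA + vowel sign U: one cell, four code points
stack = "\u0f66\u0f92\u0fb2\u0f74"
# BA, then the stack above, then BA, then SA
word = "\u0f56" + stack + "\u0f56\u0f66"

print(len(word), "code points in", len(split_stacks(word)), "cells")
print(len(backspace_code_point(stack)), "code points left (behaviour A)")
print(len(backspace_stack(stack)), "code points left (behaviour B)")
```

With behaviour A the user has to press Backspace four times to clear one cell, and the intermediate states are partial stacks that the terminal must still render. With behaviour B a single key press clears the cell, but a typo in the last vowel sign forces retyping the whole stack. The same grouping could also drive cursor movement by cell.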
> my second question is off-topic, but maybe someone here can answer: is
> screen's utf8 support usable? (including combining characters?) a
> quick rtfs seemed to indicate so but i don't have an environment for
> testing it at present. i'm quite addicted to screen, so if its utf8
> support is broken i need to do something about that first...

As I understand it, screen can pass UTF-8 through but does not take combined characters into account, so there should be some breakage. I'm not using screen these days and am not sure of its current state, though.

regards,
--
minami <mi...@mi...>