From: Rich F. <da...@ae...> - 2005-12-19 02:33:54

On Mon, Dec 19, 2005 at 01:30:47AM +0900, mi...@mi... wrote:
> Hi.
>
> I've tried to see Tibetan text by following procedure.
> Please let me know you have used some different way or
> you are using another font.
>
> 1. install "Tibetan Machine Uni" from www.thdl.org and register the font to
> fontconfig library.
>
> 2. create ~/.mlterm/aafont and add:
> ISO10646_UCS4_1=Tibetan Machine Uni-iso10646-1;
>
> 3. run mlterm with anti-alias enabled:
> mlterm -A
>
> 4. display a Tibetan web page by w3m on mlterm:
> w3m "http://www.thdl.org/xml/show.php?xml=test/tibetnew/thdlhp.xml&lng=tib"

I will try this soon. I don't yet have an environment on my own system
where I can test these sorts of things, but I do have a laptop I can
test on which might have the prerequisite software.

BTW thanks for mentioning that w3m supports Unicode properly. I use
elinks and love it but the support for non-Latin encodings is total
crap. :(

> Judging from the result of
> ftview 32 .fonts/TibetanMachineUniAlpha.ttf
> , the "Tibetan Machine Uni" font seems to contain
> many pre-combined glyphs.
>
> So it's a good news.
> If these glyph covers all combination of Tibetan consonants,
> you do not have to create another bitmap collection.
> The freetype library should be able to generate
> anti-alias aware images at whatever size you want.

The problem is that I don't see any way to get readable glyphs in an
8x16 character cell from a scalable font. The hinting would have to be
mindblowingly good for that to work at all. I ended up compromising
with 8x24 with my bitmap fonts even though 8x16 could be made (barely)
readable, but I don't have faith that FreeType with the
TibetanMachineUni font will give me something usable even with 8 extra
pixels.

> The bad news is, these glyphs do not have unicode code point and
> mlterm can't use them at present since neither Xft nor "X core font"
> works as desired in this case.
>
> # For Xft, it can create glyphs by over-striking corresponding glyphs.
> However, the result looks tend to be ugly and do not seems to be correct
> when more than two combining consonants are involved.

If it can use the alternate-form glyphs when stacking, then it will be
mostly correct, but still ugly enough that it's undesirable. And of
course even more hopeless at small sizes.

> So, it can be done in sane manner.
> If we can make mlterm to use freetype library directly,
> we can access glyph substitution table and pre-combined glyph.

Is there any way to use a glyph substitution table from an external
file and still use the fonts through X? That is to say: does X make the
precombined glyphs available in any way at all, or are they completely
inaccessible? If there's a sane way to make them accessible and just
use a separate subst table it would work for bitmap fonts too, which
would be a big plus.

> Alternatively, we may able to use m17n-lib
> (http://www.m17n.org/m17n-lib/)
> as a rendering engine. Though I'm not yet investigated deeply,
> implementation of mlterm side can simpler in this case.

Yes, I've been looking at m17n-lib a lot since I wrote. It seems to
have the only reasonable Tibetan input methods on *nix (although I was
just planning on adding key bindings to screen to switch the keyboard
to Tibetan mode and back). I haven't looked at its rendering stuff yet.
I see that mlterm supports m17n-lib already; what does it use it for?

> Supporting to edit Tibetan on command line / editors like vim can be
> very difficult. Because there's no standard for handling of combined
> characters on a terminal, you have to define "the right way" at first.
> ex.)

Some of these questions belong at the terminal level, some at the pty
level, and some at the application level.

> - When you press Delete/Backspace key in a combined character,
> Should we delete entire character or part of it?
> - When you press right/left key, should a cursor move one consonant or
> one character?

Hard to say. Either way would be usable.
This is definitely left to the application, however (or, when using
cooked line-based input at the terminal, to the kernel tty driver). An
interesting analogy is with wide CJK characters and the wide
'characters' the tty driver generates when you enter control codes. It
shows them as 2 character cells on the terminal, but when you delete
them with backspace they're removed as one unit.

> - What should we do if part of combined character was overwritten?

I don't think this is possible. The terminal control codes to move to a
position are based on character cells, not characters, right? Otherwise
it would be impossible for programs to use the terminal to set up
tabular data without keeping track of all the character widths in the
line. You can only position the cursor before or after a complete
combining sequence, not in the middle of it. (Of course an advanced
editor could still have a sense of "in the middle of it" not
representable in the terminal.) On the other hand, what happens when
you overwrite just half of a CJK doublewidth character?

> So we should consider a bit deeper after finishing to implement proper
> glyph rendering scheme.

I agree, but I think these are mostly usability issues that could be
improved incrementally once the essentials are in place.

> >my second question is off-topic, but maybe someone here can answer: is
> >screen's utf8 support usable? (including combining characters?) a
> >quick rtfs seemed to indicate so but i don't have an environment for
> >testing it at present. i'm quite addicted to screen, so if its utf8
> >support is broken i need to do something about that first...
>
> In my understanding, screen can pass utf-8 but do not care about
> combined characters. So there should be some breakage.

Well, like I said, the code seems to be aware of character width,
including both doublewidth and zerowidth... Not sure how it works
though, and that's just from RTFS'ing briefly.
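
To make the cell-versus-character distinction above concrete, here is a
rough Python sketch of the bookkeeping involved: zero-width combining
marks (which include the Tibetan subjoined consonants and vowel signs)
take no cell, doublewidth CJK characters take two, and backspace removes
a whole combining sequence as one unit. This is only an illustration
under simplifying assumptions; it is not how mlterm, screen, or the
kernel tty driver actually implement it, and the function names
(cell_width, split_clusters, text_width, backspace) are made up for the
example.

```python
import unicodedata


def cell_width(ch):
    """Approximate number of terminal cells occupied by one code point."""
    # Nonspacing and enclosing marks stack on the previous character.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # East Asian Wide / Fullwidth characters take two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def split_clusters(text):
    """Group each base character with the zero-width marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and cell_width(ch) == 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def text_width(text):
    """Total number of cells a string would occupy on one line."""
    return sum(cell_width(ch) for ch in text)


def backspace(text):
    """Delete the last whole cluster, not just the last code point."""
    return "".join(split_clusters(text)[:-1])


# U+0F66 SA + subjoined GA + subjoined RA + vowel sign U: one stacked syllable part.
stack = "\u0f66\u0f92\u0fb2\u0f74"
line = "\u0f56" + stack          # U+0F56 BA followed by the stack
print(len(line), text_width(line), len(split_clusters(line)))
```

With this model the cursor can only ever sit before or after a complete
stack, and a program laying out tabular data only needs text_width, not
the number of code points. A real implementation would want proper
grapheme cluster rules rather than this category check.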

> I'm not using screen these days and not sure its current state, though.

:(

Rich