Re: [Indic-computing-devel] How are indic fonts rendered?
From: Guntupalli K. <kar...@fr...> - 2002-02-14 08:31:26
On Mon, 11 Feb 2002 23:05:32 -0800 (PST)
jk...@Fr... (Joseph Koshy) wrote:

> I'm trying to refine my understanding of the basic algorithms
> involved in Indic glyph rendering, for future inclusion into the
> Handbook.
>

A well-researched model, used in Win2k/XP (a similar one is used in
Indix), is documented at
http://www.microsoft.com/typography/otspec/indicot/shaping.htm

The complete document is at
http://www.microsoft.com/typography/otspec/indicot/default.htm

The gist of the above has already been given in another reply to this
post, so I will cover only the font issues.

> [Note: You could possibly think of a character encoding where text
> is encoded in "visual" order. Some transliteration schemes for
> indian languages use such "visual" order encodings. ]
>

This is like what the Shusha font scheme uses.

> (B) is a property of the script: most (all?) indic scripts have
> special glyph shapes for double-consonants, consonants+vowel
> combinations, etc.
>
> So, our rendering process has to map:
>
> `M' code points -> `N' language glyph shapes
>
> and in doing so we have to do glyph re-ordering "(A)" and composite
> glyph selection "(B)".
>
> [Q: Are there any other issues to be taken care of when rendering
> indic scripts? ]
>
> Some indian language fonts are designed to contain "partial glyphs";
> these fonts require a sequence of glyphs to be specified to render a
> full language glyph on screen (for example, Baraha (Kannada)). For
> such fonts, each of the `N' language glyph shapes selected above
> will need to be mapped further into `O' font-specific glyph indices.
>

This was done because of the restriction in 8-bit fonts, where no more
than approximately 220 encoded glyphs are available for an Indian
language. A font can hold more glyphs, but they cannot be assigned
codes if it is to remain an 8-bit font. All this mapping information
has to go in your code, in a separate file, or, with an OpenType
font, in the OT tables.
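To make the second mapping stage concrete, here is a minimal sketch of turning `N' language glyph shapes into `O' font-specific codes for an 8-bit "partial glyph" font. Everything in it is an assumption for illustration: the glyph names, the table PARTIAL_GLYPHS and the byte values are invented and do not come from Baraha or any other real font.

```python
# Hypothetical mapping: one language glyph shape -> one or more
# font-specific partial-glyph codes. The values are invented.
PARTIAL_GLYPHS = {
    "KA": [0x6B],                    # drawn with a single glyph
    "KSSA_LIGATURE": [0xC4, 0xD6],   # drawn with two partial glyphs
}


def to_font_codes(glyph_names, table):
    """Expand each language glyph into its font-specific codes."""
    codes = []
    for name in glyph_names:
        codes.extend(table[name])
    # bytes() raises ValueError for any value above 0xFF, which
    # enforces the 8-bit restriction on every code.
    return bytes(codes)
```

The same table could just as well be loaded from a separate file; with an OpenType font this information lives in the font's own tables and no such hand-written mapping is needed.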
> My questions are:
>
> - do we do reordering of glyphs (A) before looking for composite
> glyphs (B), or is it best done the other way round?
>

Reordering is done at the character code level.

> - do (A) and (B) have to be done multiple times?
>

(A) once, (B) multiple times: at each step you look for a specific
combination of glyphs and do the substitution.

> - is there ONE algorithm that can handle correct glyph rendering
> for every indic script, or are the glyph selection/re-ordering
> algorithms language specific?
>

Glyph selection and reordering are script/language specific. In
Devanagari, for example, reordering is needed only for the vowel sign
'I' and the 'RA' forms. Scripts like Bengali and Tamil also have
surrounding vowels (the vowel sign has two parts, which go on either
side of the base consonant). Also, Hindi and Marathi use the same
script, but there are variations in glyph shapes for some characters:
e.g. the SHA (U+0936) glyph in Marathi is different from the one used
for Hindi, and likewise for some digits like '8' and '9'.

There is as such no ONE algorithm, but since the Microsoft way
(Uniscribe + OpenType Layout Services library + Indic fonts) has been
researched and documented well, it has become a kind of de facto
standard for Indic rendering. Pango also follows the Uniscribe model.
The FreeType project is working on a 'Freetype services layout'
library which will do the OpenType stuff in FreeType.

Regards,
Karunakar
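The "(A) once, (B) multiple times" order can be sketched roughly as follows for one Devanagari case. This is a simplified illustration, not the Uniscribe or Pango algorithm: it reorders only the vowel sign I (U+093F), ignores the RA forms, and the SUBSTITUTIONS table with its glyph names is hypothetical.

```python
VOWEL_SIGN_I = "\u093F"
VIRAMA = "\u094D"

# Hypothetical substitution table: glyph sequence -> composite glyph.
# Every pattern has at least two entries, so each substitution
# shortens the glyph list and the loop below always terminates.
SUBSTITUTIONS = {
    ("\u0915", VIRAMA, "\u0937"): "KSSA_LIGATURE",  # KA + virama + SSA
    ("\u0938", VIRAMA): "SA_HALF",                  # SA + virama
}


def is_consonant(ch):
    # Devanagari consonants KA..HA
    return "\u0915" <= ch <= "\u0939"


def reorder(text):
    """(A), done once: move vowel sign I before its consonant cluster."""
    out = []
    for ch in text:
        if ch == VOWEL_SIGN_I and out and is_consonant(out[-1]):
            i = len(out) - 1
            # Step back over preceding "consonant + virama" pairs.
            while i >= 2 and out[i - 1] == VIRAMA and is_consonant(out[i - 2]):
                i -= 2
            out.insert(i, ch)
        else:
            out.append(ch)
    return out


def substitute(glyphs, table):
    """(B), repeated: apply substitutions until nothing changes."""
    changed = True
    while changed:
        changed = False
        for pattern, replacement in table.items():
            n = len(pattern)
            i = 0
            while i + n <= len(glyphs):
                if tuple(glyphs[i:i + n]) == pattern:
                    glyphs[i:i + n] = [replacement]
                    changed = True
                i += 1
    return glyphs


def shape(text):
    return substitute(reorder(text), SUBSTITUTIONS)
```

For instance, shape("\u0915\u094D\u0937\u093F") first moves the vowel sign to the front of the cluster and then collapses KA + virama + SSA into the single hypothetical ligature glyph. A real shaper would drive step (B) from the font's OpenType tables rather than from a hard-coded dictionary.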