Re: [Indic-computing-devel] How are indic fonts rendered?

Hi,

--- Joseph Koshy <jk...@Fr...> wrote:

>    (B) these scripts use a number of glyph shapes
> representing
>        combinations of characters, so there isn't a 1-1
> mapping of
>        character encoding code points to glyphs.
> So, our rendering process has to map:
> 
>    `M' code points -> `N' language glyph shapes

Yes. All of the following mappings are possible. Sample
character codes are given in brackets:

Charcode(s)    Glyphcode(s)
one        ->  one    (e.g., U+0915)
one        ->  many   (e.g., U+0BCA)
many       ->  one    (e.g., Kssa conjunct in Devanagari)
many       ->  many   (e.g., Other consonant conjuncts)
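
To make the four cases concrete, here is a minimal sketch of a
longest-match lookup from character codes to glyphs. The table and
all glyph names in it ("ka", "kssa", "sa_half", ...) are
hypothetical; real glyph codes depend on the font being used.

```python
# Hypothetical glyph table: keys are code point sequences, values are
# lists of made-up glyph names. A real font defines its own glyph codes.
GLYPHS = {
    ("\u0915",): ["ka"],                              # one  -> one
    ("\u0BCA",): ["tamil_e_sign", "tamil_aa_sign"],   # one  -> many
    ("\u0915", "\u094D", "\u0937"): ["kssa"],         # many -> one
    ("\u0938", "\u094D", "\u0924"): ["sa_half", "ta"] # many -> many
}

LONGEST_KEY = max(len(key) for key in GLYPHS)


def to_glyphs(text):
    """Map a character string to glyph names, longest sequence first."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate sequence first, then shorter ones.
        for n in range(min(LONGEST_KEY, len(text) - i), 0, -1):
            key = tuple(text[i:i + n])
            if key in GLYPHS:
                out.extend(GLYPHS[key])
                i += n
                break
        else:
            # No entry in the table: emit a placeholder glyph.
            out.append(".notdef")
            i += 1
    return out
```

So `M' code points go in and `N' glyph names come out, where M and
N need not be equal.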

> [Q: Are there any other issues to be taken care of when
> rendering indic scripts? ]

"Syllable breaking" is a major issue that we have to take
care of. A rule-based algorithm is used to determine the
boundaries of syllables in a given character string.
The sequence is as follows:

(1) Take the input string
(2) Break it into various scripts

For each script run,
(3) Break the string depending upon various properties (if
applicable), e.g., colour of characters.
(4) Now break them into syllables

For each syllable,
(5) Reorder characters within the syllable
(6) Get glyph codes
(7) Apply Substitution to get new glyph codes
(8) Apply Positioning (if applicable)
(9) Render each glyph
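
As a rough illustration of steps (1) and (2), the sketch below
splits a string into script runs by Unicode block range. The
function names and the short range table are assumptions made for
this example only; a real implementation would cover every script
and would also have to decide what to do with characters shared
between scripts (spaces, digits, punctuation).

```python
from itertools import groupby

# A few Unicode block ranges, listed only for illustration.
SCRIPT_RANGES = [
    (0x0900, 0x097F, "Devanagari"),
    (0x0980, 0x09FF, "Bengali"),
    (0x0B80, 0x0BFF, "Tamil"),
]


def script_of(ch):
    """Return the script name for a character, or "Other"."""
    cp = ord(ch)
    for lo, hi, name in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return "Other"


def script_runs(text):
    """Group consecutive characters of the same script into runs."""
    return [(script, "".join(chars))
            for script, chars in groupby(text, key=script_of)]
```

Each run returned by script_runs() would then go through steps (3)
onwards separately.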

> 
> My questions are:
> 
>   - do we do reordering of glyphs (A) before looking for
> composite
>     glyphs (B), or is it best done the other way round?

Yes. Reordering is necessary before looking for the glyphs.
We reorder the characters so that a character sequence
which is converted into an attached glyph is placed
above (or below) the base character. This base character
can only be determined from the properties of the character
codes. We cannot determine it from glyph codes unless we
maintain a history. So it is easier to reorder the characters
before converting them into glyphs.
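
A very simplified sketch of character-level reordering for one
Devanagari case: the vowel sign I (U+093F) is stored after the
consonant but drawn before it, so it is moved to the front of the
syllable. The function name and the one-entry set are assumptions
for illustration; real reordering rules cover many more cases
(reph, for example) and differ from script to script.

```python
# Vowel signs that are drawn before the consonant cluster they follow.
# Only DEVANAGARI VOWEL SIGN I is listed here, as an example.
PRE_BASE_MATRAS = {"\u093F"}


def reorder_syllable(syllable):
    """Move pre-base vowel signs to the start of a single syllable."""
    pre = [c for c in syllable if c in PRE_BASE_MATRAS]
    rest = [c for c in syllable if c not in PRE_BASE_MATRAS]
    return "".join(pre + rest)
```

The decision is made purely from character codes, before any glyph
lookup has happened.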

> 
>   - do (A) and (B) have to be done multiple times?

Yes. It has to be done for each syllable.

> 
>   - is there ONE algorithm that can handle correct glyph
> rendering for every indic script, or are the glyph
> selection/re-ordering algorithms language specific?

Syllable breaking logic is common to all Indic scripts. We
can classify every character in every Indic script, and the
syllable-breaking state machine uses these classes to
determine syllable boundaries.
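
To give an idea of how character classes drive the syllable
breaking, here is a small sketch. The class letters, the function
names and the two joining rules are assumptions for this example,
and the class table covers Devanagari only; another script would
supply its own table while the loop stays the same. Actual rule
sets are considerably larger.

```python
def char_class(ch):
    """Classify a Devanagari character (simplified class table)."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "C"   # consonant
    if cp == 0x094D:
        return "H"   # halant (virama)
    if 0x093E <= cp <= 0x094C:
        return "M"   # dependent vowel sign (matra)
    if 0x0905 <= cp <= 0x0914:
        return "V"   # independent vowel
    if 0x0901 <= cp <= 0x0903:
        return "X"   # candrabindu, anusvara, visarga
    return "O"       # anything else


def syllables(text):
    """Split text into syllables using the class of each character."""
    out = []
    prev = None
    for ch in text:
        cls = char_class(ch)
        # A halant, matra or modifier attaches to the current syllable;
        # a consonant attaches only when it directly follows a halant.
        joins = bool(out) and (
            (cls in ("H", "M", "X") and prev != "O")
            or (cls == "C" and prev == "H")
        )
        if joins:
            out[-1] += ch
        else:
            out.append(ch)
        prev = cls
    return out
```

For example, the six characters U+0928 U+092E U+0938 U+094D U+0924
U+0947 come out as three syllables, the last one holding the
conjunct together with its vowel sign.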

Reordering of characters is however script specific.

Regards,
Keyur
