Re: [Indic-computing-devel] Re: NCST IndiX examined

Dear Sastry,

> I sincerely appreciate your efforts in trying to support Indian 
> Languages on Linux. 

Thank you for appreciating the team's work.

> After having shared the keen insights you have 
> gained by looking at both IndLinux(IITM) and Indix(NCST), what are 
> your final suggestions/recommendations?

I'm still looking at alternatives.  Some notes:

The X protocol supports a very simple model of text rendering,
namely:

  - an X font is a collection of glyphs, indexed by one of two
    schemes: linear and 2-D. 

  - text drawing calls specify a starting X,Y coordinate on a drawable
    entity and a list of glyph indices. The X server `draws' glyphs by
    placing the bits of the selected glyphs `next' to each other.
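
To make this model concrete, here is a toy sketch in Python (not
Xlib code).  The class and function names are made up for
illustration.  The slot computation follows the row-major layout that
Xlib documents for 2-D fonts; a linear font is the case where both
byte1 bounds are 0.  Only pen advance is modelled, not the bitmaps.

```python
from dataclasses import dataclass


@dataclass
class ToyFont:
    """Hypothetical stand-in for a server-side font: glyph slots only."""
    min_byte1: int
    max_byte1: int
    min_byte2: int
    max_byte2: int
    advances: list  # one advance width per glyph slot, row-major

    def slot(self, byte1, byte2):
        # Row-major position of a 2-D glyph index in the slot array.
        cols = self.max_byte2 - self.min_byte2 + 1
        return (byte1 - self.min_byte1) * cols + (byte2 - self.min_byte2)


def place_glyphs(font, x, y, indices):
    """Return (x, y, byte1, byte2) for each glyph, laid side by side."""
    placed = []
    for byte1, byte2 in indices:
        placed.append((x, y, byte1, byte2))
        # The 'server' only moves the pen by the glyph's advance width.
        x += font.advances[font.slot(byte1, byte2)]
    return placed


font = ToyFont(0x21, 0x22, 0x21, 0x23, advances=[5, 6, 7, 8, 9, 10])
print(place_glyphs(font, 10, 20, [(0x21, 0x21), (0x22, 0x23)]))
```

Note that nothing here looks at neighbouring glyphs; any reordering
or conjunct selection has to happen before the index list is built.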

Indic scripts generally have a large number of glyph shapes,
representing combinations and conjuncts of the component `characters'.
The mapping between `character codes' and glyphs is complex, and is
also language and character encoding specific.

The NCST and IITM folks have tried to implement this mapping in the X
server and X library respectively, but have sacrificed X protocol
compatibility in the process.

The following broad approaches to rendering Indic text seem possible
(without breaking anything, that is):

A) Render glyphs entirely on the X client side, sending the
   final bitmap across.

   [pros]
   + will work on any X server

   [cons]
   - each client needs to 'know' the gory details of Indian
     language text processing
   - clients need to 'know' about specific font technologies 
     (Type1/TTF/OpenType/...) and font encodings
   - additional network traffic compared to sending just glyph
     indices
   - no opportunity for the server to cache glyphs and fonts
   - doesn't allow the X server to use hardware knowledge effectively
     (e.g. sub-pixel positioning of glyphs on LCD displays)
   - changes the programming model at the client

B) Have the client do all the `character code' processing needed
   (re-ordering of glyphs, selection of glyphs for composite 
   characters, etc) and have it send over a list of (font
   specific) glyph indices.

   [pros]
   + will work on any X server
   + will allow the X server to use hardware features for text
     rendering
   + will allow for caching of font glyphs in the X server
   + low network load
   + allows clients to be independent of the font technology
     used by the X server
   + client programming model does not change drastically

   [cons]
   - clients need to 'know' the gory details of Indian
     language text processing (how to reorder glyphs, how
     to select composite glyphs, etc.).
   - clients need to 'know' the font encodings used by the
     fonts served by the X server

   GTK+Pango appears to partially(?) follow this model.

C) An X server extension specially for Indic glyph drawing

   In this approach we add new drawing requests that allow text in the
   form of 'character codes' to be sent to the X server and have the X
   server process these characters appropriately, according to the
   language and the character encoding.

   [pros]
   + low network load
   + clients can be unaware of underlying font technology
   + the X server can use hardware features for text rendering
   + {?} clients can perhaps be written to deal with 
         'characters', and not glyphs.
   + {?} clients can perhaps be language and encoding independent.

   [cons]
   - text metrics need a round-trip request/response from the server;
     this could result in dramatic slowdowns.
   - programming model is incompatible with the regular X model;
     clients require a rewrite of their text processing portions.

Of these alternatives, I prefer (B).  X servers in general are very
good at rendering text glyphs, this being one of the most important
areas of performance optimization.  Option (A), i.e. client-side
rendering, cannot take advantage of hardware speed-ups, and also
burdens the client with a number of dependencies.  The situation is
not bad enough to require something like Option (C).
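
As a rough illustration of the network load difference between (A)
and (B), the sketch below compares payload sizes for one string.  All
the numbers are assumptions picked for the example (16x16 glyph
cells, 1 bit per pixel, 2-byte glyph indices), and request headers
and scanline padding are ignored.

```python
def bitmap_payload_bytes(num_glyphs, cell_w, cell_h, bits_per_pixel=1):
    """Bytes needed to ship the rendered string as one bitmap (A)."""
    # One scanline covers every glyph cell; round up to whole bytes.
    row_bytes = (num_glyphs * cell_w * bits_per_pixel + 7) // 8
    return row_bytes * cell_h


def index_payload_bytes(num_glyphs, bytes_per_index=2):
    """Bytes needed to ship only the glyph indices (B)."""
    return num_glyphs * bytes_per_index


glyphs = 20  # assumed length of the shaped string, in glyphs
print(bitmap_payload_bytes(glyphs, 16, 16))  # 640
print(index_payload_bytes(glyphs))           # 40
```

The gap widens with larger cells or deeper pixels (e.g. anti-aliased
text), while the index payload stays the same.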

As mentioned earlier, GTK+Pango appears to follow model (B).  However,
it is not generic enough for my taste; even if GTK+Pango could be
sped up (it currently draws strings one glyph at a time), the
GTK+Pango Indic algorithms appear to be specific to Unicode.  I would
personally prefer a table-driven approach to reordering and character
composition (like that of the Graphite system from SIL).
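
To show what I mean by table-driven, here is a minimal sketch.  It is
not Graphite and not Pango code.  Input codes are first mapped to
symbolic names, and the rules are plain data over those names, so a
different character encoding only needs a different first table.  The
Unicode code points are the real Devanagari ones; the 8-bit encoding,
the glyph names and the rule format are all invented for the example.

```python
# Symbolic names keep the rules independent of the character encoding.
UNICODE_TO_SYMBOL = {
    0x0915: "KA",      # DEVANAGARI LETTER KA
    0x0937: "SSA",     # DEVANAGARI LETTER SSA
    0x093F: "SIGN_I",  # DEVANAGARI VOWEL SIGN I
    0x094D: "VIRAMA",  # DEVANAGARI SIGN VIRAMA
}

# A made-up 8-bit encoding, here only to show that a second table is
# all that a different encoding needs.
TOY_8BIT_TO_SYMBOL = {0x01: "KA", 0x02: "SSA", 0x03: "SIGN_I", 0x04: "VIRAMA"}

# (input pattern, output glyph names).  Output names are hypothetical.
RULES = [
    (("KA", "VIRAMA", "SSA"), ("KSSA_CONJUNCT",)),  # conjunct formation
    (("KA", "SIGN_I"), ("SIGN_I", "KA")),  # sign is drawn before consonant
]


def decode(codes, table):
    """Map encoded character codes to symbolic names."""
    return [table[c] for c in codes]


def apply_rules(symbols, rules=RULES):
    """Single left-to-right pass, trying longer patterns first."""
    ordered = sorted(rules, key=lambda r: len(r[0]), reverse=True)
    out, i = [], 0
    while i < len(symbols):
        for pattern, replacement in ordered:
            if tuple(symbols[i:i + len(pattern)]) == pattern:
                out.extend(replacement)
                i += len(pattern)
                break
        else:
            out.append(symbols[i])  # no rule matched; pass through
            i += 1
    return out


print(apply_rules(decode([0x0915, 0x094D, 0x0937], UNICODE_TO_SYMBOL)))
print(apply_rules(decode([0x01, 0x03], TOY_8BIT_TO_SYMBOL)))
```

A real table would use character classes ("any consonant") rather
than literal sequences, and a last step would map the glyph names to
the indices of whichever font the X server is serving.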

In summary, adding language-specific reordering and aggregation rules
to the client while leaving actual glyph rendering to the X server
seems a promising alternative today.

> When can we expect a complete release that would support atleast
> Hindi completely without breaking the compatibility with the X
> Window System protocol?

How much design/coding are you willing to do? :).

Regards,
Koshy
<jk...@fr...>