Re: [Indic-computing-devel] Script specific features

On Wed, 27 Feb 2002, Keyur Shroff wrote:

> Tomorrow someone from Kerala (don't remember his name) called me up at
> NCST and asked about this "Chillaksharam" problem.

You mean Yesterday? :)

> separate code points could have been assigned to all Akhand in Indic
> scripts.

What are the advantages of having a separate code point? I am against
encoding conjuncts in Unicode. Just because Latin has these encoded does
not mean that we need to have them. Actually it breaks the Unicode rule
that only characters are encoded. Akhand is a rendering problem, and the
solution should also be in the rendering engine.
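
To illustrate what I mean, here is a small sketch in Python (my choice of
language for the example, using only the standard unicodedata module). It
takes the Devanagari conjunct "ksha" as one example of an Akhand and lists
the code points it is stored as. The names KSSA and describe are made up
for this illustration.

```python
import unicodedata

# The conjunct "ksha", written with escapes so the stored sequence is
# unambiguous: KA + VIRAMA + SSA.
KSSA = "\u0915\u094d\u0937"

def describe(text):
    """Return (code point, Unicode name) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code_point, name in describe(KSSA):
    print(code_point, name)
```

The stored text is three ordinary characters. Drawing them as a single
conjunct glyph is left to the font and the rendering engine.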

> I am sure that in the next coming proposal our Government has proposed
> to include all Akhand for a separate code point in Unicode.

Wouldn't that mess up the sorting rules? I guess that unless we can find
some unencoded "characters" we should not bother with expanding the
character set. Vedic characters are one example that comes to mind of
something that still requires encoding.

> I'll try to gather some information on sorting order. Many databases,
> including Oracle, now support the UTF-8 format of Unicode.

Supporting UTF-8 is not enough. We need some mechanism so that
language-specific sorting algorithms can be applied to UTF-8 data.
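
A rough sketch of the gap, in Python with only the standard library. Sorting
UTF-8 data byte by byte gives plain code point order, so a word starting
with the precomposed letter QA (U+0958) lands after every word starting
with KA or KHA. The toy_key function below is a hypothetical stand-in for a
real language-specific collation: it assumes, purely for illustration, that
the nukta should be ignored at the first level of comparison. It is not an
actual Hindi sorting standard.

```python
import unicodedata

NUKTA = "\u093c"  # DEVANAGARI SIGN NUKTA

words = [
    "\u0916\u0917",        # खग
    "\u0958\u0932\u092e",  # क़लम, spelled with precomposed QA (U+0958)
    "\u0915\u092e\u0932",  # कमल
]

def toy_key(word):
    """Hypothetical two-level key: compare without nukta, then with it."""
    decomposed = unicodedata.normalize("NFD", word)  # U+0958 -> U+0915 U+093C
    primary = decomposed.replace(NUKTA, "")
    return (primary, decomposed)

# Plain string comparison uses code point order.
by_code_point = sorted(words)
# Comparing the UTF-8 bytes gives the same order as comparing code points.
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
# The toy rule places the QA word next to the KA words instead.
by_toy_rule = sorted(words, key=toy_key)

print(by_code_point)
print(by_toy_rule)
```

The point is only that the ordering rule has to come from somewhere outside
the encoding. A database that stores UTF-8 still needs a per-language
collation layered on top.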

raj