Re: [Indic-computing-devel] Script specific features

On Wed, Feb 27, 2002 at 09:49:23AM +0530, Rajkumar S wrote:
> 
> Supporting UTF-8 is not enough. We need some mechanism so that language
> specific sorting algorithms can be applied to UTF-8 data.
> 

On Linux, glibc ships a hi_IN locale that specifies a sorting order.
It was created by someone in Japan, working for IBM (isn't it ironic for
a country of 1 billion people, boasting hundreds of thousands of
programmers?).

The language experts on this list should review the sorting order
specified there.

In the absence of a locale specifying a sorting order, I think it falls
back to numeric comparison of the Unicode code points.
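
To make the difference concrete, here is a rough Python sketch of sorting
with a locale's LC_COLLATE rules versus plain code point order. The function
name sort_words is made up for illustration, and "hi_IN.UTF-8" is only an
assumed locale name; whether it is installed depends on the system. If the
locale cannot be set, the sketch falls back to code point order on its own.

```python
import locale


def sort_words(words, locale_name="hi_IN.UTF-8"):
    """Sort words by the named locale's LC_COLLATE rules.

    Returns (sorted_list, used_locale). If the locale is not installed,
    the list is sorted by Unicode code point and used_locale is False.
    """
    # Remember the current collation setting so it can be restored.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # No such locale here: sorted() on str compares code points.
        return sorted(words), False
    try:
        # strxfrm maps each string to a key that compares in locale order.
        return sorted(words, key=locale.strxfrm), True
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    result, used_locale = sort_words(["क", "अ", "ख"])
    print(result, "(locale order)" if used_locale else "(code point order)")
```

Running this on a list of words with and without the locale installed is a
quick way for reviewers to see where the locale's order departs from the
code point order.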

FreeBSD didn't support UTF-8 locales last I looked, but there is an ISCII-based
locale that I contributed. Again, language experts please review.

The sorting order is encapsulated in the LC_COLLATE section of the locale
file for Linux, and in the colldef source for FreeBSD:

http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/localedata/locales/hi_IN?rev=1.2&content-type=text/x-cvsweb-markup&cvsroot=glibc

http://www.freebsd.org/cgi/cvsweb.cgi/src/share/colldef/hi_IN.ISCII-DEV.src?rev=1.1&content-type=text/x-cvsweb-markup

	-Arun