Re: [sinhala-technical] collation: order of composed and decomposed dependent vowels
From: Ruvan W. <ar...@uc...> - 2008-11-30 18:14:23
great work harshula!

Harshula wrote:
> On Sun, 2008-10-12 at 17:10 +0530, Gihan Dias wrote:
>
>> Harshula wrote:
>>
>>> Hi Gihan & Ruvan,
>>>
>>> How should the composed and decomposed forms of dependent vowels be
>>> sorted relative to each other? They can be considered equivalent, which
>>> seems to be the standard practice of Indic scripts.
>>>
>> Harshula,
>>
>> As far as SLS1134 is concerned, the decomposed forms are not considered
>> to be valid, and therefore, should be either marked as incorrect, or
>> converted to the canonically composed form.
>>
>> Therefore, they would not be considered in collation.
>>
>> My recommendation is:
>>
>> 1. Never generate such decomposed forms.
>>
>> 2. Mark any such forms as errors.
>>
>> 3. If you are having an application such as a spelling checker which
>> corrects errors in text, then convert any such sequences to the composed
>> form.
>>
>> I do *not* recommend any application treating decomposed forms as valid
>> representations of Sinhala text.
>>
>
> Gihan,
>
> Thanks for the info, what I have done is to *not* add decomposed forms
> in the collation sequence.
>
> We now have Sinhala collation in MySQL 6
> (http://dev.mysql.com/downloads/mysql/6.0.html), still alpha s/w, and
> will have it in GNU libc 2.8 when it is released.
>
> cya,
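
For anyone who wants to try Gihan's points 2 and 3 outside the collation tables, here is a rough Python sketch that flags decomposed dependent vowels and rewrites them to the composed form. It is only an illustration: the mapping is my assumption, taken from the canonical decompositions Unicode lists for U+0DDA, U+0DDC, U+0DDD and U+0DDE, and SLS1134 may define the set differently. The names `DECOMPOSED_TO_COMPOSED`, `find_decomposed` and `compose` are made up for this example and are not part of the MySQL or glibc work.

```python
import re

# Assumed mapping (from Unicode canonical decompositions, not from SLS1134):
# decomposed sequence -> single composed dependent vowel sign.
DECOMPOSED_TO_COMPOSED = {
    "\u0DD9\u0DCA": "\u0DDA",        # kombuva + al-lakuna -> diga kombuva
    "\u0DD9\u0DCF": "\u0DDC",        # kombuva + aela-pilla -> kombuva haa aela-pilla
    "\u0DD9\u0DCF\u0DCA": "\u0DDD",  # fully decomposed kombuva haa diga aela-pilla
    "\u0DDC\u0DCA": "\u0DDD",        # partly decomposed kombuva haa diga aela-pilla
    "\u0DD9\u0DDF": "\u0DDE",        # kombuva + gayanukitta -> kombuva haa gayanukitta
}

# Try longer sequences first so the three-code-point form wins over its prefix.
_KEYS = sorted(DECOMPOSED_TO_COMPOSED, key=len, reverse=True)
_PATTERN = re.compile("|".join(re.escape(k) for k in _KEYS))


def find_decomposed(text):
    """Return (offset, sequence) for every decomposed vowel sign in text."""
    return [(m.start(), m.group()) for m in _PATTERN.finditer(text)]


def compose(text):
    """Replace every decomposed vowel sign in text with its composed form."""
    return _PATTERN.sub(lambda m: DECOMPOSED_TO_COMPOSED[m.group()], text)
```

A checker could call `find_decomposed` to report offsets as errors, and a corrector could call `compose` before the text is stored or sorted, so the collation sequence never sees the decomposed forms.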