Re: [sinhala-technical] collation: order of composed and decomposed dependent vowels
From: Harshula <har...@gm...> - 2008-11-29 12:37:14
On Sun, 2008-10-12 at 17:10 +0530, Gihan Dias wrote:
> Harshula wrote:
> > Hi Gihan & Ruvan,
> >
> > How should the composed and decomposed forms of dependent vowels be
> > sorted relative to each other? They can be considered equivalent, which
> > seems to be the standard practice of Indic scripts.
> >
>
> Harshula,
>
> As far as SLS1134 is concerned, the decomposed forms are not considered
> to be valid, and therefore, should be either marked as incorrect, or
> converted to the canonically composed form.
>
> Therefore, they would not be considered in collation.
>
> My recommendation is:
>
> 1. Never generate such decomposed forms.
>
> 2. Mark any such forms as errors.
>
> 3. If you are having an application such as a spelling checker which
> corrects errors in text, then convert any such sequences to the composed
> form.
>
> I do *not* recommend any application treating decomposed forms as valid
> representations of Sinhala text.

Gihan,

Thanks for the info. What I have done is to *not* add decomposed forms
to the collation sequence. We now have Sinhala collation in MySQL 6
(http://dev.mysql.com/downloads/mysql/6.0.html), still alpha software,
and will have it in GNU libc 2.8 when it is released.

cya,
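
For anyone who wants to apply recommendations 2 and 3 before handing text to a collation routine, here is a rough Python sketch. The mapping table is an assumption: it lists only the canonical decompositions that the Unicode Character Database gives for the two-part Sinhala vowel signs, and it has not been checked against the text of SLS1134, which may cover other sequences. The function names `find_decomposed` and `compose` are made up for this illustration.

```python
import re

# ASSUMPTION: these pairs come from the Unicode canonical decompositions of
# the two-part Sinhala vowel signs; SLS1134 itself may list more sequences.
DECOMPOSED_TO_COMPOSED = {
    "\u0DD9\u0DCF\u0DCA": "\u0DDD",  # kombuva + aela-pilla + al-lakuna
    "\u0DDC\u0DCA": "\u0DDD",        # kombuva haa aela-pilla + al-lakuna
    "\u0DD9\u0DCA": "\u0DDA",        # kombuva + al-lakuna
    "\u0DD9\u0DCF": "\u0DDC",        # kombuva + aela-pilla
    "\u0DD9\u0DDF": "\u0DDE",        # kombuva + gayanukitta
}

# Try longer sequences first so the three-code-point form is not split.
_PATTERN = re.compile(
    "|".join(
        re.escape(seq)
        for seq in sorted(DECOMPOSED_TO_COMPOSED, key=len, reverse=True)
    )
)


def find_decomposed(text):
    """Return (offset, sequence) for every decomposed form, to flag as errors."""
    return [(m.start(), m.group()) for m in _PATTERN.finditer(text)]


def compose(text):
    """Replace every decomposed form with its single composed code point."""
    return _PATTERN.sub(lambda m: DECOMPOSED_TO_COMPOSED[m.group()], text)
```

A spelling checker might call `find_decomposed` to report the offending offsets and `compose` to apply the correction. Text that is already composed passes through `compose` unchanged.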