Thanks for the first link. I will use it in the spec file and build a new hunspell-si package.


On 23/08/12 11:42, Sandaruwan Gunathilake wrote:
Here you go : http://www.sandaru1.com/si-LK.tar.gz

I also uploaded it to github if anyone is interested in maintaining it : https://github.com/sandaru1/si-LK

On Thu, Aug 23, 2012 at 11:33 AM, Parag Nemade <pnemade@redhat.com> wrote:
Hi Sandaruwan,
   I do have this file at http://paragn.fedorapeople.org/si-LK.tar.gz but I need an upstream source download URL. If you can host it somewhere, that would be helpful.

On 23/08/12 11:24, Harshula wrote:
Can the processing steps be automated in a shell script or makefile?
That way Parag can download the UCSC word list and build the final output.

On Thu, 2012-08-23 at 11:09 +0530, Sandaruwan Gunathilake wrote:
The original word list is still available on the UCSC
page : http://www.ucsc.cmb.ac.lk/ltrl/?page=downloads

I don't have the processed file at the moment - I'll dig up my backups
and check whether I still have it. It's still in the Firefox add-on
though : https://addons.mozilla.org/en-us/firefox/addon/sinhala-spellchecker/

On Thu, Aug 23, 2012 at 10:45 AM, Harshula <harshula@gmail.com> wrote:
         Hi Sandaruwan,
         Parag (CC'd) is wondering where the upstream source tarball
         for the word list went?
         On Mon, 2010-07-05 at 00:59 +0530, Sandaruwan Gunathilake wrote:
         > Hi,
         > On Sun, Jul 4, 2010 at 11:57 PM, Harshula <harshula@gmail.com> wrote:
         >         Hi Sandaruwan,
         >         On Sun, 2010-07-04 at 22:01 +0530, Sandaruwan wrote:
         >         > What about the Sinhala word list on the UCSC language lab page?
         >         >
         >         > http://www.ucsc.cmb.ac.lk/ltrl/?page=downloads
         >         >
         >         > I switched the word list to that in spellchecker 0.2.
         >         The LTRL word list states it has 70142 distinct Sinhala words.
         >         si-LK.dic appears to have 26707 words. Did you take a subset of the
         >         words from the LTRL word list?
         > No, everything is there. I just compressed the word list with the
         > "affixcompress" utility and added a few extra rules at the top of the
         > .aff file to support "ණ/න/ල/ළ", etc.
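
For anyone comparing the two counts above: affix compression stores a stem once and attaches flags that point to rules in the .aff file, so a .dic file normally has fewer entries than the word list it was built from. Below is a minimal Python sketch for inspecting a .dic file, assuming the usual Hunspell layout (first line is the entry count, then one "stem/FLAGS" entry per line). The function names and the sample entries are hypothetical and are not taken from si-LK.dic; escaped slashes and morphological fields are not handled.

```python
def split_entry(entry):
    """Split one .dic line into (stem, flags); flags is '' when absent."""
    stem, _, flags = entry.strip().partition("/")
    return stem, flags


def summarize_dic(lines):
    """Return (declared_count, actual_entries, entries_with_flags)."""
    # Ignore blank lines; the first remaining line is the declared count.
    rows = [ln.strip() for ln in lines if ln.strip()]
    declared = int(rows[0])
    entries = [split_entry(r) for r in rows[1:]]
    flagged = sum(1 for _, flags in entries if flags)
    return declared, len(entries), flagged


# Made-up sample data, for illustration only.
sample = ["3", "ගස/A", "මල/AB", "අද"]
print(summarize_dic(sample))
```

To run it on a real file, something like summarize_dic(open("si-LK.dic", encoding="utf-8")) should work, assuming the file is UTF-8 encoded. Entries with flags expand to several word forms, which would account for the gap between the two numbers.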
