The original word list is still available on the UCSC page: http://www.ucsc.cmb.ac.lk/ltrl/?page=downloads
I don't have the processed file at the moment; I'll dig through my backups and check whether I still have it. It's still in the Firefox add-on, though: https://addons.mozilla.org/en-us/firefox/addon/sinhala-spellchecker/
Parag (CC'd) is wondering where the upstream source tarball for the word
On Mon, 2010-07-05 at 00:59 +0530, Sandaruwan Gunathilake wrote:
> On Sun, Jul 4, 2010 at 11:57 PM, Harshula <firstname.lastname@example.org> wrote:
> Hi Sandaruwan,
> On Sun, 2010-07-04 at 22:01 +0530, Sandaruwan Gunathilake
> > What about the sinhala words list on UCSC language lab page?
> > http://www.ucsc.cmb.ac.lk/ltrl/?page=downloads
> > I switched the word list to that in spellchecker version
> The LTRL word list states it has 70142 distinct Sinhala words.
> appears to have 26707 words. Did you take a subset of the words from the
> LTRL word list?
> No, everything is there. I just compressed the word list with the
> "affixcompress" utility and added a few extra rules at the top of the .aff
> file to support "ණ/න/ල/ළ", etc.
> Best Regards,
> Sandaruwan Gunathilake
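
For anyone following the thread who is puzzled by the 70142 vs. 26707 difference: affix compression stores a stem once, plus flags that point to shared suffix rules, so the number of dictionary entries can be much smaller than the number of words they expand to. Below is a rough toy sketch of that idea in Python. It does not parse the real .aff/.dic format and does not use the actual Sinhala data; the rule table, the flag "A", and the English sample entries are all made up for illustration.

```python
# Toy illustration of affix compression. This is NOT a parser for the real
# hunspell .aff/.dic format; the rules and entries below are hypothetical.

# Hypothetical suffix rules: flag -> list of (strip, add) pairs.
SUFFIX_RULES = {
    "A": [("", "s"), ("", "ed"), ("", "ing")],
}

# Hypothetical compressed entries in "stem/FLAGS" form (made-up sample data).
DIC_ENTRIES = ["walk/A", "talk/A", "tree"]


def expand(entry, rules):
    """Return the set of surface words that one compressed entry stands for."""
    stem, _, flags = entry.partition("/")
    words = {stem}
    for flag in flags:
        for strip, add in rules.get(flag, []):
            # A rule applies only if the stem ends with the text to strip.
            if stem.endswith(strip):
                base = stem[: len(stem) - len(strip)] if strip else stem
                words.add(base + add)
    return words


def expand_all(entries, rules):
    """Expand every entry and merge the results into one set of words."""
    words = set()
    for entry in entries:
        words |= expand(entry, rules)
    return words


all_words = expand_all(DIC_ENTRIES, SUFFIX_RULES)
print(len(DIC_ENTRIES), "entries expand to", len(all_words), "words")
```

In this toy case three entries cover nine words. The same effect, at a much larger scale, would be consistent with a compressed dictionary that has far fewer lines than the original LTRL list while still covering all of its words.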