Yes, it can be automated. You can use affixcompress, and then add a few extra rules to the .aff file, such as "ණ <-> න".
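
As a rough sketch of what that automation could look like (written in Python here rather than as a shell script or makefile): the helper names, the intermediate file name si-LK.txt, and the choice of Hunspell REP entries for the "ණ <-> න" style rules are assumptions for illustration, not necessarily what was used for the add-on. It also assumes affixcompress takes a sorted word list and writes <input>.aff and <input>.dic beside it.

```python
import subprocess
from pathlib import Path

# Assumed confusable letter pairs, taken from the letters named in this thread.
CONFUSABLE_PAIRS = [("ණ", "න"), ("ල", "ළ")]


def build_rep_block(pairs):
    """Return a Hunspell REP table covering each pair in both directions."""
    entries = []
    for a, b in pairs:
        entries.append(f"REP {a} {b}")
        entries.append(f"REP {b} {a}")
    # The first REP line carries the number of entries that follow.
    return "\n".join([f"REP {len(entries)}"] + entries) + "\n"


def prepend_rules(aff_text, rules):
    """Place the extra rules at the top of the generated .aff content."""
    return rules + aff_text


def sorted_unique_words(lines):
    """Strip whitespace, drop blank lines and duplicates, and sort the words.

    Python sorts str by code point, which matches byte order for UTF-8 text.
    """
    return sorted({line.strip() for line in lines if line.strip()})


def build(word_list, workdir):
    """Hypothetical driver: sort the list, run affixcompress, patch the .aff."""
    words = sorted_unique_words(
        Path(word_list).read_text(encoding="utf-8").splitlines()
    )
    sorted_list = Path(workdir) / "si-LK.txt"  # assumed file name
    sorted_list.write_text("\n".join(words) + "\n", encoding="utf-8")

    # Assumption: affixcompress is on PATH and accepts the sorted list
    # as its only required argument.
    subprocess.run(["affixcompress", str(sorted_list)], check=True)

    # Assumption: the generated affix file is named <input>.aff.
    aff_path = sorted_list.with_name(sorted_list.name + ".aff")
    aff_text = aff_path.read_text(encoding="utf-8")
    aff_path.write_text(
        prepend_rules(aff_text, build_rep_block(CONFUSABLE_PAIRS)),
        encoding="utf-8",
    )
```

With the UCSC word list downloaded by hand, something like build("ucsc-wordlist.txt", ".") (the input file name is made up) would then produce the .dic and the patched .aff in the current directory.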

On Thu, Aug 23, 2012 at 11:24 AM, Harshula <harshula@gmail.com> wrote:
Can the processing steps be automated in a shell script or makefile?
That way Parag can download the UCSC word list and build the final output
file.

On Thu, 2012-08-23 at 11:09 +0530, Sandaruwan Gunathilake wrote:
> The original word list is still available on the UCSC
> page: http://www.ucsc.cmb.ac.lk/ltrl/?page=downloads
>
> I don't have the processed file at the moment. I'll dig up my backups
> and check whether I still have it. It's still in the Firefox add-on,
> though: https://addons.mozilla.org/en-us/firefox/addon/sinhala-spellchecker/
>
> On Thu, Aug 23, 2012 at 10:45 AM, Harshula <harshula@gmail.com> wrote:
>         Hi Sandaruwan,
>
>         Parag (CC'd) is wondering where the upstream source tarball
>         for the word list went?
>
>         cya,
>         #
>
>         On Mon, 2010-07-05 at 00:59 +0530, Sandaruwan Gunathilake wrote:
>
>         > Hi,
>         >
>         > On Sun, Jul 4, 2010 at 11:57 PM, Harshula <harshula@gmail.com> wrote:
>         >         Hi Sandaruwan,
>         >
>         >         On Sun, 2010-07-04 at 22:01 +0530, Sandaruwan Gunathilake wrote:
>         >         > What about the Sinhala word list on the UCSC language
>         >         > lab page?
>         >         >
>         >         > http://www.ucsc.cmb.ac.lk/ltrl/?page=downloads
>         >         >
>         >         > I switched the word list to that one in spellchecker
>         >         > version 0.2.
>         >
>         >
>         >         The LTRL word list states it has 70142 distinct Sinhala
>         >         words. si-LK.dic appears to have 26707 words. Did you take
>         >         a subset of the words from the LTRL word list?
>         >
>         > No, everything is there. I just compressed the word list
>         > with the "affixcompress" utility and added a few extra rules
>         > at the top of the .aff file to support "ණ/න/ල/ළ", etc.
>         >
>         > --
>         > Best Regards,
>         > Sandaruwan Gunathilake
>
> --
> Best Regards,
> Sandaruwan Gunathilake
>

--
Best Regards,
Sandaruwan Gunathilake