From: Dwayne B. <dw...@tr...> - 2004-01-23 08:42:28
Ysbeer wrote:
> I have an additional list of words (all UTF-8 encoded) which can be added
> to the word list. I have managed to compile this list by trawling and
> spidering various Afrikaans sources. There is in excess of 161,000 words
> in this list.

Excellent :) This makes me very happy. Did you use any tools to do the
trawling? I'd love you to add them to CVS if they exist.

> This list can obviously be shortened by removing the
> plural and diminutive forms of words.

Nope, we need all valid words and combinations. MySpell and Ispell use
affix compression, which shortens the wordlist by encoding derived words,
e.g. helper and helpers become help/S, where rule S tells the spellchecker
how to expand the word.

> Would anyone be interested in taking up this list? I can check the basic
> list into CVS if that would help anyone.

If you could add the new words to the wordlist, that would be appreciated.
I usually do it like this:

  export LC_ALL=af_ZA   (to ensure I have the Afrikaans sort order)
  make a copy of wordlist and remove the header
  sort newwords > newwords.sort
  diff -u wordlist.noheader newwords.sort > x.diff
  vim x.diff   (check for new words)

diff -ui is also useful, as it produces a list of totally unique words,
ignoring changes in case.

Please update the CREDITS and ChangeLog files as well.

I think just add all the new words. But if you could email the new words to
the list, I'd love someone to check them. Although 30000 new words could be
a problem for anyone to check.

--
regards
Dwayne Bailey
Translate.org.za - translating Opensource software into all South African languages
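
The merge steps described above can be roughly sketched as a small shell function. This is only an illustration under stated assumptions, not the project's actual tooling: the function name new_words, the SORT_LOCALE and HEADER_LINES variables, and the use of comm in place of reading a diff by hand in vim are all hypothetical choices made for the example. The header length of the wordlist is not given in the message, so it is left as a parameter.

```shell
#!/bin/sh
# Sketch: print the words in a new list that are not yet in the wordlist.
# Assumptions (not from the original message):
#   - the wordlist header, if any, is its first HEADER_LINES lines
#   - comm replaces the manual "diff -u ... ; vim x.diff" review step
#   - SORT_LOCALE defaults to af_ZA; it must be installed to get the
#     Afrikaans sort order, otherwise sort typically falls back to C

new_words() {
    wordlist=$1
    newwords=$2
    locale_name=${SORT_LOCALE:-af_ZA}
    skip=${HEADER_LINES:-0}
    tmpdir=$(mktemp -d) || return 1

    # Copy of the wordlist without its header, sorted and de-duplicated.
    tail -n +"$((skip + 1))" "$wordlist" |
        LC_ALL="$locale_name" sort -u > "$tmpdir/wordlist.noheader"

    # The candidate words, sorted the same way.
    LC_ALL="$locale_name" sort -u "$newwords" > "$tmpdir/newwords.sort"

    # comm -13 keeps only lines unique to the second file: the new words.
    # It must run under the same locale that was used for sorting.
    LC_ALL="$locale_name" comm -13 \
        "$tmpdir/wordlist.noheader" "$tmpdir/newwords.sort"

    rm -rf "$tmpdir"
}
```

Something like `HEADER_LINES=3 new_words wordlist newwords > candidates.txt` would then give a file of candidate words to review before adding them. Unlike diff -ui, this comparison is case-sensitive.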