From: Francis T. <ft...@pr...> - 2010-06-14 15:03:51
On Mon, 14 Jun 2010 at 16:45 +0200, mikel otxandorena wrote:
> Hi, I am a student who is doing a small project. I have to do
> something similar to stemming of porter for the basque language and
> maybe you can help me telling me if you have some open code in your
> project!!

Kaixo Mikel,

I'm not entirely sure how the Porter stemmer works, but I suspect that it
mostly just removes common suffixes from English words (e.g. -ing,
-ised, -ed, etc.).

There are morphological analysers for Basque that will do the same
trick. The IXA group has one, 'Xuxen' I think it is called, but it isn't
open source. There is a partial conversion of it in the Apertium
project:

https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-eu-es/apertium-eu-es.eu.dix

But your mileage may vary.

The other thing you could look at is the hunspell Basque spellchecker
(see e.g. hunspell-eu-es in Debian/Ubuntu), which has a rather long
affix file, eu-ES.aff.

Hope this helps,

Fran
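
P.S. To make the "just removing common suffixes" idea concrete, here is a
minimal sketch in Python. It is not the actual Porter algorithm, which
applies ordered rule sets with conditions on the remaining stem. The
function name, the suffix list (taken from the English examples above)
and the minimum stem length are all assumptions for illustration. For
Basque you would need a suffix list derived from a real resource, such
as the affix file mentioned above.

```python
# Naive longest-match suffix stripping. This is NOT the Porter algorithm;
# it only illustrates the general idea of removing common suffixes.

# Illustrative English suffixes from the examples in this message.
SUFFIXES = ("ised", "ing", "ed")

# Assumed threshold so that short words such as "red" are left alone.
MIN_STEM_LEN = 3


def strip_suffix(word, suffixes=SUFFIXES, min_stem_len=MIN_STEM_LEN):
    """Remove the longest matching suffix if the remaining stem is long enough."""
    word = word.lower()
    # Try longer suffixes first so "ised" wins over "ed".
    for suffix in sorted(suffixes, key=len, reverse=True):
        stem_len = len(word) - len(suffix)
        if word.endswith(suffix) and stem_len >= min_stem_len:
            return word[:stem_len]
    return word


if __name__ == "__main__":
    for w in ("walking", "organised", "walked", "red"):
        print(w, "->", strip_suffix(w))
```

Anything this crude will over-strip and under-strip in plenty of cases,
which is why a proper morphological analyser is likely the better route
for a language with as much suffixing as Basque.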