From: Jimmy O'R. <jo...@gm...> - 2011-05-14 00:02:05
On 14 May 2011 00:47, Paulo Schreiner <pa...@jo...> wrote:
> On Fri, 2011-05-13 at 23:45 +0100, Jimmy O'Regan wrote:
>> On 13 May 2011 22:55, Paulo Schreiner <pa...@jo...> wrote:
>> > Anyone here has some experience with the apertium tagger?
>> >
>> > I have created (to my best knowledge) all required resources, but got
>> > stuck with the following error:
>> >
>> > apertium-tagger -d -s 0 pt.expand pt.tagged.txt pt.tsx pt.prob pt.tagged
>> > pt.tagged.morf
>> > Calculating ambiguity classes...
>> >
>> > 30 states and 31 ambiguity classes
>> > Kupiec's initialization of transition and emission probabilities...
>> > Initializing transition and emission probabilities from a hand-tagged
>> > corpus...
>> > {adv} Word: depois -- {prp,adv} Word: depois
>> > Error: A new ambiguity class was found. I cannot continue.
>> > Word 'depois' not found in the dictionary.
>> > New ambiguity class: {prp,adv}
>> > Take a look at the dictionary, then retrain.
>>
>> 'depois' needs to be added to the dictionary (as both preposition and
>> adverb), to match the corpus. In all likelihood, the word is present
>> (otherwise it couldn't have encountered an ambiguity), so you'll
>> probably need to look at the commands in the Makefile that are used to
>> filter the output of lt-expand - it's discarding too much.
>>
>
> Like this? I sorted the expanded file, seems they are there.
>
> depois:depois<adv>
> Depois:depois<adv>
> depois:depois<prp>
> Depois:depois<prp>
>
> Any other idea?

No need for another idea, because I'm right :P

That's the wrong format. It should match the output of the analyser
(i.e., you should have entries like:

^depois/depois<pr>/depois<adv>$

instead of what you have).

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you
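
To illustrate the format difference discussed above, here is a minimal Python sketch that groups `surface:lemma<tags>` lines (the shape shown in the sorted expanded file) by surface form and prints them in the `^surface/analysis/analysis$` shape that the analyser emits. The function name `expand_to_analyser_format` is hypothetical and is not part of Apertium; the sketch assumes every input line uses a plain `:` separator and does not handle any other separator. It only shows what the target format looks like. Running the surface forms through the analyser itself remains the dependable way to get output that really matches.

```python
def expand_to_analyser_format(lines):
    """Group 'surface:lemma<tags>' lines by surface form (illustrative only).

    Returns entries shaped like '^surface/analysis1/analysis2$'.
    Assumption: each line uses a plain ':' between surface form and analysis.
    """
    grouped = {}  # surface form -> list of analyses, in first-seen order
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first ':' only; the analysis keeps its tags intact.
        surface, _, analysis = line.partition(":")
        analyses = grouped.setdefault(surface, [])
        if analysis not in analyses:  # skip exact duplicates
            analyses.append(analysis)
    return ["^{}/{}$".format(surface, "/".join(analyses))
            for surface, analyses in grouped.items()]


if __name__ == "__main__":
    # The four lines quoted from the sorted expanded file.
    expanded = [
        "depois:depois<adv>",
        "Depois:depois<adv>",
        "depois:depois<prp>",
        "Depois:depois<prp>",
    ]
    for entry in expand_to_analyser_format(expanded):
        print(entry)
```

The order of analyses inside each entry simply follows the input order here; the real analyser may order them differently, which is another reason to treat this as a picture of the format and not a replacement for the analyser.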