Thank you very much. I did not know that POS tags could contain spaces.
That explains the ".*" patterns in the rule file. The french.dict has the same
format as english.dict, except that the POS tags in french.dict contain
spaces. I initially thought the format of french.dict was different (please
disregard the last question in my last email).
Again, thank you.
Regards,
Nathaniel
________________________________
From: Dominique Pellé <dominique.pelle@...>
To: NATHANIEL OCO <nathanoco@...>
Cc: languagetool-devel@...
Sent: Tue, May 17, 2011 2:10:46 PM
Subject: Re: [Languagetool] Multi-dimensional tagset
Nathaniel Oco wrote:
> Hi, good day.
>
> Thank you for replying to my last email. Below are entries from the French
> Tagging Dictionary:
> zythums zythum N m p
> zythum zythum N m s
> zythons zython N m p
> zython zython N m s
>
> Aside from tagging each word as noun, verb, etc., other details are also
> available (e.g. gender, plurality, etc.).
>
> How does one create a multi-dimensional tagset (I am not
> sure of the formal term)?
POS tags such as "N m s" are just strings, which happen
to contain spaces. Nothing "multi-dimensional" here.
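To illustrate, here is a minimal Python sketch (not LanguageTool code). It
assumes the first two whitespace-separated fields of a dictionary line are the
inflected form and the lemma, and that everything after them is the POS tag,
spaces included. The names parse_entry and masculine_nouns are made up for this
example.

```python
import re

# Sample lines copied from the French tagging dictionary entries quoted above.
LINES = [
    "zythums zythum N m p",
    "zythum zythum N m s",
]


def parse_entry(line):
    """Split one line into (inflected form, lemma, POS tag).

    Assumption: only the first two fields are separate columns; the
    remainder of the line is a single tag string that may contain spaces.
    """
    form, lemma, tag = line.split(None, 2)
    return form, lemma, tag.strip()


entries = [parse_entry(line) for line in LINES]

# Because a tag is a plain string, a regular expression such as "N m .*"
# can match a whole family of tags (here: masculine nouns, any number).
masculine_nouns = [
    form for form, _lemma, tag in entries if re.fullmatch(r"N m .*", tag)
]
```

The same idea explains the ".*" in rule patterns: the pattern is matched
against the whole tag string, so "N m .*" covers both "N m s" and "N m p".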
> What is the format for the tagset?
Each language is free to choose the format of its POS tags.
French POS tags are described in: src/resource/fr/tagset.LT.txt
> Is the process the same in creating a tagging dictionary?
I did not understand the question. Same as what?
> Thank you very much.
>
> Best regards,
> Nathaniel Oco
Regards
-- Dominique