From: Wim P. <W.P...@dc...> - 2007-03-02 13:13:59
Hi Sandhya,

You might try running this Awk program over your data file INFILE, which
should have the following structure, one entry per line:

Word,WordSynon1,WordSynon2...

Syntax:

gawk -f synonym.awk INFILE > synonym.jape

It will create a JAPE file that annotates each synonym with an annotation
named after its "Word" (the first field of the line).

Hope this helps.

Best wishes,
Wim

NLP Group
Department of Computer Science
University of Sheffield, U.K.
Tel.: #44-114-2221902
Email: W.P...@dc...

----------------------------------------------------------
# synonym.awk: turn lines of the form Word,WordSynon1,WordSynon2...
# into JAPE rules that annotate each synonym as its Word.
BEGIN {
    FS = OFS = SUBSEP = ","
    print "Phase: SynonymAnnotation"
    print "Input: Lookup Token"
    print "Options: control = appelt"
}
{
    Word = $1
    # One rule per synonym; count2 keeps rule names unique across lines.
    for (x = 2; x <= NF; x++) {
        WordSynon = $x
        count2++
        print "Rule: synon" count2
        print "("
        print "{Token.string == \"" WordSynon "\"}"
        print "):syn"
        print "-->"
        print ":syn." Word " = {kind = \"" WordSynon "\"}"
    }
}

> -----Original Message-----
> From: gat...@li... [mailto:gate-users-bo...@li...]
> On Behalf Of Diana Maynard
> Sent: 27 February 2007 10:04
> To: Revuri, Sandhya (MSAS Sys Dev IBD)
> Cc: gat...@li...
> Subject: Re: [gate-users] Annie OrthoMatcher
>
> Hi Sandhya
> I would say that using the orthomatcher to do this is probably not the
> best idea (though it could be possible with some modification).
> The orthomatcher works by looking for pairs of matches in the text and
> creating a list of matches for each annotation (that has one or more
> matches). There is an element of the orthomatcher that looks for Unknown
> annotations, and if a match is found, it renames the annotation to the
> annotation type of the match. But it requires both matching elements
> (text strings) to be in the text.
>
> Alternatively, trying to do it with JAPE rules could be possible but
> would be quite awkward.
>
> To be honest, your best bet would be to write your own Processing
> Resource to do this job, in my opinion. Someone else may have a better
> idea, however.
>
> Regards
> Diana
>
> Revuri, Sandhya (MSAS Sys Dev IBD) wrote:
> > Hello
> >
> > I'm new to GATE.
> >
> > Problem Statement:
> > ---------------------------------
> > I have a set of words which forms my vocabulary. Each word has a
> > set of synonyms.
> >
> > For a given document, it has to look up the vocabulary in such a way
> > that if it finds the word, that's fine; otherwise it should look for
> > the synonyms and tag with the word instead of the synonym.
> >
> > For example, the vocabulary is
> >
> > Word -> WordSynon1, WordSynon2.
> >
> > If a file text1 doesn't have "Word" but has "WordSynon2", it should
> > be tagged/annotated as "Word".
> >
> > Is this kind of lookup possible in GATE using OrthoMatcher? If so, how?
> >
> > Thanks in advance
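
For readers without gawk, the rule generation in the Awk program can be sketched in Python. This is only an illustrative port, not part of GATE: the function name `jape_rules` and the sample vocabulary line are hypothetical, and fields are split on plain commas with no CSV quoting, as with Awk's `FS=","`.

```python
def jape_rules(lines):
    """Build JAPE source from lines of the form Word,WordSynon1,WordSynon2...

    Hypothetical helper that mirrors the Awk program: one rule per synonym,
    each annotating the matched Token with an annotation named after the word.
    """
    out = [
        "Phase: SynonymAnnotation",
        "Input: Lookup Token",
        "Options: control = appelt",
    ]
    count = 0  # running rule number, unique across all input lines
    for line in lines:
        fields = line.rstrip("\n").split(",")
        word, synonyms = fields[0], fields[1:]
        for synonym in synonyms:
            count += 1
            out += [
                f"Rule: synon{count}",
                "(",
                f'{{Token.string == "{synonym}"}}',
                "):syn",
                "-->",
                f':syn.{word} = {{kind = "{synonym}"}}',
            ]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical vocabulary entry: "car" with two synonyms.
    print(jape_rules(["car,automobile,motorcar\n"]), end="")
```

Like the Awk version, each rule matches a single Token string, so under this sketch a multi-word synonym would need a different pattern.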