Dear Pradhan,

Thanks a lot. While waiting, I also tried and found a method:

This is the part I wrote:

AnnotationSet defaultAnnots = doc.getAnnotations();
//AnnotationSet annSet = ...;
String type = "Person";
//Get all "type" annotations
AnnotationSet persSet = defaultAnnots.get(type);

I believe it is equivalent to this in your part:
AnnotationSet personSet = (AnnotationSet) bindings.get("person");

But when you write:
if (personSet.size() > 3)
what does this mean?

For example, suppose the personSet annotations contain:

A, B, C, B, B, D, E, B, O, X, Y, Z, C

Can the above "if" condition return the value "B", which appears more than 3 times?
If yes, it's a really smart method, because I have to write longer code to get it:

List persList = new ArrayList(persSet);
Collections.sort(persList, new gate.util.OffsetComparator());

Iterator persIter = persList.iterator();
while (persIter.hasNext()) {
    Annotation temp = (Annotation) persIter.next();
//                String nameOfAnnotation=temp.getType();
//     annotationSets.add(doc.getAnnotations((String);
    String content = stringFor(doc, temp);
}

Only from here can I get the content as A, B, C ... separately, and then I start counting how many times each one appears.
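
The counting step can be sketched in plain Java. This is only a sketch under assumptions: it expects the covered strings to have been collected into a list already, and the class name FrequentStrings and method moreThan are made up for illustration, not part of GATE. Since size() on a set only reports how many annotations it holds in total, per-string counts need something like a map:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of GATE.
class FrequentStrings {

    // Returns the distinct strings that occur more than minCount times,
    // in order of first appearance.
    static List<String> moreThan(List<String> values, int minCount) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum); // increment the count for v
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > minCount) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data from the message above; in GATE these would be the
        // strings covered by each Person annotation.
        List<String> persons = Arrays.asList(
            "A", "B", "C", "B", "B", "D", "E", "B", "O", "X", "Y", "Z", "C");
        System.out.println(moreThan(persons, 3)); // prints [B]
    }
}
```

In the loop above, each content string would be added to such a list before calling moreThan.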

That's for discussion purposes only. :)

Thank you all !


From: Shekhar Pradhan <>
To: Vuong Dao Nghe <>;
Sent: Wed, November 24, 2010 11:34:02 PM
Subject: Re: [gate-users] How to extract the words that annotated

Try this (it assumes that on the LHS of the JAPE rule you use the label "person" for occurrences of Person):
AnnotationSet personSet = (AnnotationSet) bindings.get("person");
if (personSet.size() > 3) { // more than 3 Person annotations
  String personStr;
  for (Annotation personAnn : personSet) {
    try {
      // text between the annotation's start and end offsets
      personStr = doc.getContent().getContent(
          personAnn.getStartNode().getOffset(),
          personAnn.getEndNode().getOffset()).toString();
      // Now you can do what you want with personStr
    } catch (InvalidOffsetException e) {
      throw new GateRuntimeException(e);
    }
  }
}
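
To illustrate what an offset-based lookup like this does, here is a GATE-free sketch. Everything in it is an assumption for illustration: the class SpanText and its nested Span are hypothetical stand-ins for an annotation's start and end offsets, and String.substring stands in for fetching the document content between two offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in classes, not part of GATE.
class SpanText {

    // Stand-in for an annotation: just a start and an end offset.
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Returns the text covered by each span, in the order given.
    static List<String> textsFor(String content, List<Span> spans) {
        List<String> out = new ArrayList<>();
        for (Span s : spans) {
            // Reject offsets that fall outside the content.
            if (s.start < 0 || s.end > content.length() || s.start > s.end) {
                throw new IllegalArgumentException(
                    "invalid offsets: " + s.start + ", " + s.end);
            }
            out.add(content.substring(s.start, s.end));
        }
        return out;
    }

    public static void main(String[] args) {
        String content = "Alice met Bob.";
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(0, 5));   // covers "Alice"
        spans.add(new Span(10, 13)); // covers "Bob"
        System.out.println(textsFor(content, spans)); // prints [Alice, Bob]
    }
}
```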

Hope this helps.

Shekhar Pradhan

-----Vuong Dao Nghe <> wrote: -----

From: Vuong Dao Nghe <>
Date: 11/22/2010 03:21AM
Subject: [gate-users] How to extract the words that annotated

Hi all,

I have been looking for the answer to this for a long time but couldn't find the correct way.

When you run a GATE pipeline, then open the document, choose Annotation Sets and tick an annotation type, for example Person, you will see all words annotated as Person highlighted.
How can I print out all these highlighted words? I can already write Java code to load the pipeline with the document and all the necessary PRs, but I don't know what code prints out the words that belong to, for example, the Person annotation.

Can I print only the annotated words that appear more than 3 times?


GATE-users mailing list