I don't think I could use the flexible gazetteer to process only a subsection of the document. It still looks like it applies the lookups to the entire document's contents; it just allows additional preprocessing that (temporarily) substitutes tokens before the lookup step, based on an annotation type and feature. It may be possible to make it replace all the surrounding text if I added annotations around those sections with a feature value of "", which would then have the effect of deleting all but the section I'm interested in during the lookup step. It's another option to consider. For now I will stick to processing whole documents while testing the rest of the process.
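
To make the masking idea concrete, here is a minimal sketch in plain Python rather than GATE's API. Everything in it (`mask_outside`, `lookup`, the sample text and term list) is hypothetical and only illustrates the principle. One deliberate difference from the "" substitution described above: the surrounding text is padded with spaces instead of being deleted, on the assumption that the resulting offsets should still line up with the original document.

```python
def mask_outside(text, keep_spans, filler=" "):
    """Return a copy of text with everything outside keep_spans blanked out.

    Padding with a filler (instead of deleting) keeps character offsets
    identical to those of the original text.
    """
    masked = [filler] * len(text)
    for start, end in keep_spans:
        masked[start:end] = text[start:end]
    return "".join(masked)


def lookup(text, terms):
    """Naive stand-in for a gazetteer: (start, end, term) for every occurrence."""
    hits = []
    for term in terms:
        pos = text.find(term)
        while pos != -1:
            hits.append((pos, pos + len(term), term))
            pos = text.find(term, pos + 1)
    return sorted(hits)


# Hypothetical document: only the BODY part is of interest.
doc = "Paris intro. BODY: London and Paris. Footer: London."
section = (doc.index("BODY"), doc.index(" Footer"))

masked = mask_outside(doc, [section])
hits = lookup(masked, ["London", "Paris"])

# Offsets refer to the original document because the mask preserves length.
for start, end, term in hits:
    print(start, end, doc[start:end])
```

Note that the lookup still scans the full length of the masked text, so this only removes the unwanted Lookups; it does not by itself reduce the amount of text being scanned.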

Thanks again

Tony

On Tue, Jun 30, 2009 at 8:24 PM, Diana Maynard <d.maynard@dcs.shef.ac.uk> wrote:
Hi Tony
Oh I see what you mean, I misunderstood. Are you sure that the gazetteer lookup will slow things down significantly? It's usually pretty fast.
You could try using a flexible gazetteer in place of a regular one - I'm not sure if this would actually increase or decrease your processing time though. The advantage is that you can run the flexible gazetteer on a particular set of annotation types, so you could use it after you've run the Annotation Set Transfer. However I'm not sure if it would be appropriate in your situation, and also whether the increase in processing time would outweigh the benefits of not running it over the whole document. Might be worth investigating.
Your final alternative would be to create a new PR that does exactly what you want....
Hope that helps
Diana



Tony Scerri wrote:
Yes, that's what I planned, but like I said, I will be picking up lots of Lookups throughout the document which I simply don't need. If I could restrict the lookup to the portion of the document covered by an annotation, that would cut down the processing time. In some cases the other parts of the document will be bigger than the section I'm interested in, possibly up to 3 times as big. For now I will use this approach, but I may need to look at alternatives to process larger sets of documents efficiently.
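
As a rough illustration of what restricting the lookup to one portion would mean, here is a small sketch in plain Python, not GATE code. The function `lookup_in_span`, the sample text and the term list are all made up for the example. It runs the matching only over the slice covered by the section and shifts the results back to document offsets, so the text outside the section is never scanned.

```python
import re


def lookup_in_span(text, start, end, terms):
    """Match terms only inside text[start:end]; return document-level offsets."""
    # Longest terms first, so a longer term wins over a shorter one
    # when both could match at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    return [
        (start + m.start(), start + m.end(), m.group())
        for m in pattern.finditer(text[start:end])
    ]


# Hypothetical document with a header and footer that should be ignored.
section_text = "Section: Paris, New York, York."
doc = "London header. " + section_text + " London footer."
start = doc.index("Section")
end = start + len(section_text)

hits = lookup_in_span(doc, start, end, ["York", "New York", "Paris", "London"])
for s, e, term in hits:
    print(s, e, term)
```

With a section that is a quarter of the document, the matcher sees roughly a quarter of the text; whether that saving matters in practice depends on how fast the real lookup already is.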

Tony