From: Ian R. <i.r...@dc...> - 2006-11-30 18:35:32
Brian Lucas wrote:
> Hi Ian, thanks for the advice. I mistakenly assumed all of the .jar's were
> loaded in CLASSPATH but after explicitly naming them, StandAloneAnnie works
> perfect.
>
> Two questions:
>
> 1. Similar to the ANNIE demo, which class is responsible for annotating that
> a match has other matches for the same entity? E.g.
> (gate:matches="5796;5728;6030;6032;6035;6036;6041" gate:gateid="6030")

The coreferencer (http://gate.ac.uk/sale/tao/#sec:annie:pronom-coref for details). Note that this module is not a default part of ANNIE (i.e. not one of ANNIEConstants.PR_NAMES) so you'll have to add it to the end of your application pipeline if you need it.

> 2. I'm going to be running GATE/ANNIE in Tomcat and feeding strings via
> POST/GET. What is the most memory/speed efficient way to load it once and
> run it 100,000 times? Perhaps just create a servlet that loads itself 'on
> start' in Tomcat, runs Gate.init once, and waits for a request?

Something like this - in init() you call Gate.init() and create your application Controller. In doGet/doPost you read the text from the request, create a GATE Document, pass it to the controller, do something with the results and release the document (with Factory.deleteResource).

You just have to watch out for multi-threading issues - you can't use the same Controller or PRs in different threads at the same time, so if you have a single application you will only be able to serve one request at a time. When I do this kind of thing I create a number of identical copies of the controller in my init() and put them into a pool, so each request gets one copy of the application from the pool and returns it when the request is finished. You need one copy of the application for every request you want to process concurrently, i.e. with a pool of 3 apps you can serve three clients at once.
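
The pooling idea above could be sketched roughly as follows. This is only an illustration in plain Java, not GATE code: the class name AppPool, the withApp method and the type parameter A are hypothetical names made up for this example, and A stands in for whatever object holds one copy of your application (e.g. a Controller). It assumes the pool is filled once, in the servlet's init() after Gate.init(), and that each request borrows one copy and always hands it back.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical fixed-size pool: one pooled application copy per request
 * that should be served concurrently. A is a stand-in type, not a GATE class.
 */
class AppPool<A> {
    private final BlockingQueue<A> idle;

    /** Builds 'size' identical copies up front, e.g. from a servlet's init(). */
    AppPool(int size, Supplier<A> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /**
     * Borrows one copy, runs the work on it, and always returns it to the pool.
     * If every copy is in use, the calling thread waits until one is free.
     */
    <R> R withApp(Function<A, R> work) {
        A app;
        try {
            app = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a free application", e);
        }
        try {
            // In a servlet this is where the per-request steps would go:
            // create the document, run the application, read the results,
            // then release the document.
            return work.apply(app);
        } finally {
            idle.add(app);
        }
    }

    /** Number of copies currently not in use. */
    int idleCount() {
        return idle.size();
    }
}
```

With a pool of 3, a fourth simultaneous request simply blocks in withApp until one of the first three finishes. The try/finally matters: if a copy is not returned after a failed request, the pool shrinks by one for the lifetime of the servlet. Work that throws checked exceptions would need a different functional interface than Function.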

Also, if you're going to use GATE in Tomcat I suggest you use the latest nightly build rather than the 3.1 release, as a number of bugs relating to multithreaded behaviour have recently been fixed.

Ian

-- 
Ian Roberts  | Department of Computer Science
i.r...@dc... | University of Sheffield, UK