In the sample\pipe directory, SimpePipe demonstrates how to combine all the preprocessors, e.g. the sentence detector (SD), tokenizer, POS tagger, etc.
- What is the training data size for each one, and what is its accuracy?
- Were these preprocessors trained on the WSJ data set?
- Would you recommend these trained preprocessors as ready to use for general IE purposes? (I mean, are they trained "enough"?)
I wouldn't use the Pipeline stuff from Grok --- it isn't being developed anymore. The components themselves are still being developed, but they now live in the OpenNLP project space --- see http://opennlp.sf.net.
Jason
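To illustrate what running the components directly (rather than through Grok's Pipeline) might look like, here is a minimal sketch that chains an OpenNLP sentence detector, tokenizer, and POS tagger. It assumes a later OpenNLP API (the `SentenceDetectorME` / `TokenizerME` / `POSTaggerME` classes) and pre-trained model files whose paths are supplied on the command line; the class name and model file names are placeholders, not part of the original thread.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;

public class SimplePipelineSketch {
    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            // Model files are distributed separately and must be supplied by the user.
            System.out.println("usage: SimplePipelineSketch <sent.bin> <token.bin> <pos.bin>");
            return;
        }

        // Load each pre-trained model from its file.
        SentenceDetectorME sentenceDetector;
        try (InputStream in = new FileInputStream(args[0])) {
            sentenceDetector = new SentenceDetectorME(new SentenceModel(in));
        }
        TokenizerME tokenizer;
        try (InputStream in = new FileInputStream(args[1])) {
            tokenizer = new TokenizerME(new TokenizerModel(in));
        }
        POSTaggerME tagger;
        try (InputStream in = new FileInputStream(args[2])) {
            tagger = new POSTaggerME(new POSModel(in));
        }

        // Chain the components by hand: sentences -> tokens -> POS tags.
        String text = "The components still live in OpenNLP. They work without Grok.";
        for (String sentence : sentenceDetector.sentDetect(text)) {
            String[] tokens = tokenizer.tokenize(sentence);
            String[] tags = tagger.tag(tokens);
            for (int i = 0; i < tokens.length; i++) {
                System.out.println(tokens[i] + "/" + tags[i]);
            }
        }
    }
}
```

The point of the sketch is only that the former pipeline stages compose as plain method calls, so dropping Grok's Pipeline wrapper does not lose any functionality.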