Hi Tom, Gann,
Thanks for all the help with the tagger, which works fine now :)
As I mentioned before, I am trying to break down incoming sentences into Subject, Predicate, Object form. What is the best way to go about this using OpenNLP?
Any pointers to literature/previous work will be useful. I am a newbie to NLP.
Cheers,
Nitin
Hi,
I can give you a couple of pointers, but in general there is no standard answer to this question; it kinda depends on what you're using it for. Finding out your requirements and suggesting a plan based on them is a lengthy process that goes by the name of consulting. Here are a couple of things off the top of my head:
o There are SBJ (subject) function tags in the Penn Treebank, which are stripped off when training the parser. If they give you enough info for your task, you could re-train the parser with them kept, or build a model that adds them to your parses.
o PropBank addresses this in a more complete fashion. There are many approaches to producing PropBank-style annotation automatically, with and without parsing. Search http://acl.ldc.upenn.edu/ for PropBank and see what you come up with. Generating PropBank annotation w/o parses was the CoNLL-2004 shared task, so those proceedings will have a lot of current approaches to doing this.
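To make the SBJ suggestion concrete, here is a minimal sketch of what you could do once your parses carry function tags. It does not call OpenNLP at all: it assumes you already have a Penn Treebank style bracketed parse as a string, with the -SBJ tag kept. The sample sentence, the function names, and the "first NP inside the VP is the object" rule are all assumptions made for illustration, and the rule is a rough heuristic that will miss passives, coordination, and clausal objects.

```python
import re

def parse_brackets(text):
    """Parse a Penn Treebank style bracketed string into (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return read()

def words(node):
    """Collect the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node[1]:
        out.extend(words(child))
    return out

def find_first(node, matches):
    """Depth-first search for the first subtree whose label satisfies matches()."""
    if isinstance(node, str):
        return None
    if matches(node[0]):
        return node
    for child in node[1]:
        hit = find_first(child, matches)
        if hit is not None:
            return hit
    return None

def subject_predicate_object(tree):
    """Heuristic: NP-SBJ is the subject, the first VB* under the first VP is the
    predicate, and the first non-subject NP directly under that VP is the object."""
    subj = find_first(tree, lambda l: l.startswith("NP") and "-SBJ" in l)
    vp = find_first(tree, lambda l: l == "VP")
    verb = obj = None
    if vp is not None:
        for child in vp[1]:
            if isinstance(child, str):
                continue
            label = child[0]
            if verb is None and label.startswith("VB"):
                verb = child
            elif obj is None and label.split("-")[0] == "NP" and "-SBJ" not in label:
                obj = child

    def text_of(node):
        return " ".join(words(node)) if node is not None else None

    return text_of(subj), text_of(verb), text_of(obj)

# Hand-written sample parse (an assumption, not real parser output).
SAMPLE = "(S (NP-SBJ (DT The) (NN dog)) (VP (VBD chased) (NP (DT the) (NN cat))) (. .))"
print(subject_predicate_object(parse_brackets(SAMPLE)))
```

If your parser output has no function tags, the subject lookup returns None; in that case a fallback such as "the NP that is a left sibling of the VP" is a common approximation, but that is again a heuristic rather than anything the treebank guarantees.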
Hope this helps...Tom