Hi,
I've downloaded and experimented with OpenNLP, and in particular with Grok, for a day or so and have some questions.
1. Grok seems to be missing many features of a full English NLP system. It does not seem to handle verb tenses, irregular verbs, auxiliary verbs, agreement, movement, subcat, WH queries, etc. I assume that you have plans to add these and have, at least in outline, ideas about how they might be implemented. Could you please give some indication about future plans?
2. The feature system does not seem to support sets, which are probably needed for agreement and verb form. Have I missed it, are they still to be implemented, or do you have a different method in mind?
3. How do you plan to add a lexicon? For testing, a small lexicon is all that is required, and this can be built manually. A full lexicon is a *lot* of work and I assume you have no plans to produce one. Is there a suitable free lexicon we can use?
4. At a higher level, are there any plans to use an ontology? I understand that Cycorp (www.cyc.com) is about to open-source its Cyc ontology, and this might be suitable.
5. Finally, are there any tasks which I can help with? A couple of ideas are:
A high-quality sentence break detector (e.g. for text such as Mr. M.R.King Jn. d.o.b. 20.5.65).
Phrase detection and tagging ("John kicked the bucket").
Mike Atkinson
I've responded to this post on the Grok open discussion forum:
http://sourceforge.net/forum/forum.php?thread_id=649437&forum_id=12320
Jason