Hello all.
Could you please briefly describe the algorithms for named entity recognition, part-of-speech tagging and sentence splitting that you use in OpenNLP?
And how do you train the model for each of these components?
What do the training corpus and the model look like?
If you have any literature about this, please post the title and a URL.
Best regards,
Ruslan.
Hi,
The sentence detector and the POS tagger are trained almost identically to the approach described in the following thesis:
ftp://ftp.cis.upenn.edu/pub/ircs/tr/98-15/98-15.ps.gz
The named-entity tagger uses a similar approach to the pos-tagger and its features are similar to those used in:
www.ldc.upenn.edu/acl/A/A97/A97-1029.pdf
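To give a rough idea of what "features" means here, below is a minimal sketch of the kind of contextual features a maximum-entropy style POS tagger typically extracts for each word. This is only an illustration, not OpenNLP's actual code: the function name `pos_context_features`, the feature string formats and the `<BOS>`/`<EOS>` markers are all made up for this example, and the real feature set is the one described in the references above.

```python
def pos_context_features(words, prev_tags, i):
    """Return illustrative contextual features for the word at position i.

    words:     list of tokens in the sentence
    prev_tags: tags already assigned to words[0:i]
    (Hypothetical helper for illustration; not part of OpenNLP.)
    """
    word = words[i]
    feats = ["w=" + word]

    # Prefixes and suffixes (up to 4 characters) help with unseen words.
    for n in range(1, min(4, len(word)) + 1):
        feats.append("pre=" + word[:n])
        feats.append("suf=" + word[-n:])

    # Simple surface cues: digits, upper-case letters, hyphens.
    if any(c.isdigit() for c in word):
        feats.append("has_digit")
    if any(c.isupper() for c in word):
        feats.append("has_upper")
    if "-" in word:
        feats.append("has_hyphen")

    # Neighbouring words, with boundary markers at the sentence edges.
    feats.append("w-1=" + (words[i - 1] if i >= 1 else "<BOS>"))
    feats.append("w+1=" + (words[i + 1] if i + 1 < len(words) else "<EOS>"))

    # The previous tag and the previous two tags as a pair.
    t1 = prev_tags[i - 1] if i >= 1 else "<BOS>"
    t2 = prev_tags[i - 2] if i >= 2 else "<BOS>"
    feats.append("t-1=" + t1)
    feats.append("t-2,t-1=" + t2 + "," + t1)
    return feats


# Features for "dog" after "The" has already been tagged DT.
print(pos_context_features(["The", "dog", "barks"], ["DT"], 1))
```

During training, each (feature list, correct tag) pair from the annotated corpus becomes one training event for the classifier. At tagging time the same function is called with the tags predicted so far.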
Hope this helps... Tom