We currently create a stream of tagged tokens using two public parsers (including Brown) and our own chemical tagger, OSCAR3 (https://sourceforge.net/projects/oscar3-chem/). This stream is then parsed into a syntax tree using, I believe, the NLTK RegexpChunkParser. We would like to move to Java and wonder whether OpenNLP can easily be configured to do the same job.

We could supply the chunking rules for a regexp chunk parser in whatever syntax OpenNLP provides, together with a stream of tagged tokens (TAG token TAG token …).
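
To make the task concrete, here is a rough sketch of the kind of regexp chunking we mean. It is plain Python using only the standard library, not our actual code and not the NLTK or OpenNLP API. The function names, the NP rule, the example sentence and the use of CM as the tag for a chemical name are all assumptions made up for illustration. It also assumes that tokens contain no whitespace and that tags contain no angle brackets.

```python
import re


def parse_tagged_stream(stream):
    """Split a 'TAG token TAG token ...' string into (tag, token) pairs.

    Assumes whitespace-separated fields and tokens without internal spaces.
    """
    parts = stream.split()
    if len(parts) % 2 != 0:
        raise ValueError("expected alternating TAG token fields")
    return list(zip(parts[0::2], parts[1::2]))


def chunk(pairs, label, tag_pattern):
    """Group runs of tokens whose tag sequence matches tag_pattern.

    The tags are written out as one string such as '<DT><JJ><CM>' and an
    ordinary regular expression is run over that string. Each match is
    mapped back to token positions and wrapped as (label, [pairs...]).
    """
    # Character offset in the tag string -> index of the token starting there.
    starts = {}
    pos = 0
    for i, (tag, _) in enumerate(pairs):
        starts[pos] = i
        pos += len(tag) + 2  # two extra characters for '<' and '>'
    starts[pos] = len(pairs)

    tag_string = "".join("<%s>" % tag for tag, _ in pairs)

    tree = []
    done = 0  # number of tokens already copied into the tree
    for match in re.finditer(tag_pattern, tag_string):
        if match.start() == match.end():
            continue  # ignore empty matches
        first, last = starts[match.start()], starts[match.end()]
        tree.extend(pairs[done:first])            # tokens outside any chunk
        tree.append((label, pairs[first:last]))   # one chunk
        done = last
    tree.extend(pairs[done:])
    return tree


# Hypothetical rule: optional determiner, any adjectives, then nouns or
# chemical names (CM is an assumed tag for a chemical entity).
NP_RULE = r"(?:<DT>)?(?:<JJ>)*(?:<NN>|<CM>)+"

example = "DT the JJ crude CM benzaldehyde VBD was VBN dissolved IN in CM ethanol"
tree = chunk(parse_tagged_stream(example), "NP", NP_RULE)
```

Here `tree` holds two NP chunks ("the crude benzaldehyde" and "ethanol") with the verb and preposition tokens left unchunked between them. What we are looking for in OpenNLP is the equivalent: rules over the tag sequence that produce a shallow tree from tokens that are already tagged.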

Is there a simple way of doing this in OpenNLP?

Peter Murray-Rust
Dept of Chemistry
University of Cambridge UK