Hi,
I am wondering whether I can specify patterns that the sentence detector must treat as a sentence break. For example, in my corpus, every sentence ends with a period followed by two spaces.
Thanks very much,
Yuan
With what you have specified, it should be able to detect those types of sentences, since they end with periods.
I ran it myself on these sentences and it seemed to work OK: "The dog crossed the road. The cat is fluffy. The house is big."
However, if you want to use sentence endings other than the default ones '. ! ? " )' you would have to train a new model. I'm not sure how to go about this, but you probably want to look at the method
train(File inFile, int iterations, int cut, EndOfSentenceScanner scanner)
in class SentenceDetectorME.
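For the specific case in the question, where every sentence ends with a period followed by two spaces, a plain regular-expression split might be enough without training anything. The sketch below is not OpenNLP and does not use the sentence detector at all; it relies only on java.util.regex. The class name TwoSpaceSplitter and its split method are made up for illustration, and it assumes that a period followed by two or more spaces never occurs inside a sentence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenNLP.
class TwoSpaceSplitter {
    // Match a run of two or more spaces that directly follows a period.
    // The lookbehind keeps the period attached to the preceding sentence.
    private static final Pattern BREAK = Pattern.compile("(?<=\\.) {2,}");

    // Split text into sentences at every "period + two or more spaces" boundary.
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        for (String piece : BREAK.split(text)) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                sentences.add(trimmed);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // "Mr. Smith" has only one space after the period, so it is not a break.
        String text = "Mr. Smith crossed the road.  The cat is fluffy.  The house is big.";
        for (String sentence : split(text)) {
            System.out.println(sentence);
        }
    }
}
```

Because a single space after a period is never treated as a break, abbreviations such as "Mr." stay inside their sentence, which is the main thing this buys over the default endings.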
hope it helps
mark