I'd like to add a template method to the sentDetect/sentPosDetect methods in SentenceDetectorME. The purpose of the method is to allow a subclass to implement behavior that will eliminate false positives reported by the maxent model. In the SentenceDetectorME class, the method would be a no-op, so the current behavior will continue to work as it does today.
The method I propose to add is this:
/**
 * Allows subclasses to keep an overzealous (read: poorly trained) model from flagging obvious non-breaks as breaks, based on some boolean determination of a break's acceptability.
 *
 * @param s the string in which the break occurred
 * @param fromIndex the start of the segment currently being evaluated
 * @param candidateIndex the index of the candidate sentence ending
 * @return true if the break is acceptable
 */
protected boolean isAcceptableBreak(String s, int fromIndex, int candidateIndex) {
    return true;
}
It would be invoked in sentDetect/sentPosDetect like this:
if (model.getBestOutcome(probs).equals("T") &&
    isAcceptableBreak(s, index, cint)) {
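To illustrate how a subclass might use the hook, here is a rough, self-contained sketch. NaiveDetector is a hypothetical stand-in for SentenceDetectorME that treats every period as a candidate break instead of consulting the maxent model, and AbbreviationAwareDetector with its abbreviation list is likewise made up for the example. Only the isAcceptableBreak signature comes from the proposal above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for SentenceDetectorME: every '.' is a candidate
// break, so the hook can be demonstrated without a trained maxent model.
class NaiveDetector {
    public List<String> sentDetect(String s) {
        List<String> sentences = new ArrayList<>();
        int index = 0; // start of the segment currently being evaluated
        for (int cint = 0; cint < s.length(); cint++) {
            if (s.charAt(cint) == '.' && isAcceptableBreak(s, index, cint)) {
                sentences.add(s.substring(index, cint + 1).trim());
                index = cint + 1;
            }
        }
        // Keep any trailing text that did not end in an accepted break.
        String rest = s.substring(index).trim();
        if (!rest.isEmpty()) {
            sentences.add(rest);
        }
        return sentences;
    }

    // No-op hook, as proposed: every candidate break is accepted by default.
    protected boolean isAcceptableBreak(String s, int fromIndex, int candidateIndex) {
        return true;
    }
}

// Hypothetical subclass that vetoes breaks directly after a known abbreviation.
class AbbreviationAwareDetector extends NaiveDetector {
    private static final Set<String> ABBREVIATIONS = Set.of("Dr", "Mr", "Mrs");

    @Override
    protected boolean isAcceptableBreak(String s, int fromIndex, int candidateIndex) {
        // Extract the token immediately before the candidate break,
        // without reaching back past the start of the current segment.
        int start = s.lastIndexOf(' ', candidateIndex - 1) + 1;
        String token = s.substring(Math.max(start, fromIndex), candidateIndex);
        return !ABBREVIATIONS.contains(token);
    }
}
```

With the input "Dr. Smith arrived. He left.", the base class in this sketch splits after "Dr." while the subclass rejects that candidate and keeps "Dr. Smith arrived." together. Because the default implementation returns true, existing callers would see no change.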
Any objections?
Nice solution -- go for it!