In SDContextGenerator, the values of "prefixStart" and "suffixEnd" are found by looking for the furthest whitespace from the candidate sentence ender in either direction, using PerlHelp.previousSpaceIndex and PerlHelp.nextSpaceIndex, respectively.
Those values are then corrected with a scan that looks for `.' between (previous|next)SpaceIndex and the index of the candidate sentence ender.
Why is `.' treated specially in this case? What about `!', `?', and so forth?
More generally, what's the rationale behind this special case?
Good question. Gann coded that, so maybe he knows? Looking quickly at it, I don't see why the other characters shouldn't be scanned for as well.
Actually, I'm almost certain I did this to get better results with acronyms containing "." (A.S.P.C.A., U.S.A., etc.). I'm peeved with myself for not commenting that (sorry). I'll think about it and try to recall the exact problem and why this solved it.
Anyway, since acronyms aren't written with "?" or "!", this does not require checking for those characters.
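To make the acronym case concrete, here is a minimal Java sketch of the behaviour discussed above. It is not the SDContextGenerator source: the class name, the method names, and the stopAtPeriod flag are made up for illustration, the two space-index helpers are simplified stand-ins for the PerlHelp ones, and the choice to leave the stopping period out of the prefix is an assumption.

```java
public class ContextWindowSketch {

    // Simplified stand-in: index of the nearest whitespace before pos, or 0 if none.
    static int previousSpaceIndex(CharSequence s, int pos) {
        for (int i = pos - 1; i >= 0; i--) {
            if (Character.isWhitespace(s.charAt(i))) {
                return i;
            }
        }
        return 0;
    }

    // Simplified stand-in: index of the nearest whitespace after pos, or s.length() if none.
    static int nextSpaceIndex(CharSequence s, int pos) {
        for (int i = pos + 1; i < s.length(); i++) {
            if (Character.isWhitespace(s.charAt(i))) {
                return i;
            }
        }
        return s.length();
    }

    // Text between the start of the window and the candidate at pos.
    // With stopAtPeriod, the window is cut at the closest '.' before pos.
    static String prefix(CharSequence s, int pos, boolean stopAtPeriod) {
        int start = previousSpaceIndex(s, pos);
        if (stopAtPeriod) {
            for (int c = pos - 1; c > start; c--) {
                if (s.charAt(c) == '.') {
                    start = c + 1; // assumption: the period itself is excluded
                    break;
                }
            }
        }
        return s.subSequence(start, pos).toString().trim();
    }

    // Text between the candidate at pos and the end of the window.
    // With stopAtPeriod, the window is cut at the closest '.' after pos.
    static String suffix(CharSequence s, int pos, boolean stopAtPeriod) {
        int end = nextSpaceIndex(s, pos);
        if (stopAtPeriod) {
            for (int c = pos + 1; c < end; c++) {
                if (s.charAt(c) == '.') {
                    end = c;
                    break;
                }
            }
        }
        return s.subSequence(pos + 1, end).toString().trim();
    }

    public static void main(String[] args) {
        String text = "in the U.S.A. Then we left.";
        int first = text.indexOf('.');         // period after "U"
        int last = text.indexOf(". Then");     // period after "A"
        System.out.println(prefix(text, last, false) + " | " + prefix(text, last, true));
        System.out.println(suffix(text, first, false) + " | " + suffix(text, first, true));
    }
}
```

In this sketch, every period inside an acronym ends up with a single letter on each side once the scan is enabled, instead of a different chunk of the acronym for each candidate. That is one plausible reading of why the scan helped with acronyms, not a confirmed account of the original problem.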