question about SDContextGenerator

2001-10-22
2001-10-23
  • Eric Friedman

    Eric Friedman - 2001-10-22

    In SDContextGenerator, the values of "prefixStart" and "suffixEnd" are computed by looking for the furthest whitespace from the candidate sentence ender in either direction, using PerlHelp.previousSpaceIndex and PerlHelp.nextSpaceIndex, respectively.

    Those values are then corrected with a scan that looks for `.' between (previous|next)SpaceIndex and the index of the candidate sentence ender.

    Why is `.' treated specially in this case?  What about `!' `?' and so forth?

    More generally, what's the rationale behind this special case?
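
For readers without the source at hand, the behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual SDContextGenerator code: the class name and both helper methods are hypothetical stand-ins for PerlHelp.previousSpaceIndex and PerlHelp.nextSpaceIndex, whose real signatures may differ. As a simplification, the helpers stop at the first whitespace they meet, and the direction in which the `.' scan runs is assumed.

```java
public class ContextBoundsSketch {

    // Hypothetical stand-in for PerlHelp.previousSpaceIndex: index of the
    // closest whitespace before pos, or 0 if there is none.
    static int previousSpaceIndex(String s, int pos) {
        for (int i = pos - 1; i > 0; i--) {
            if (Character.isWhitespace(s.charAt(i))) {
                return i;
            }
        }
        return 0;
    }

    // Hypothetical stand-in for PerlHelp.nextSpaceIndex: index of the
    // closest whitespace after pos, or s.length() if there is none.
    static int nextSpaceIndex(String s, int pos) {
        for (int i = pos + 1; i < s.length(); i++) {
            if (Character.isWhitespace(s.charAt(i))) {
                return i;
            }
        }
        return s.length();
    }

    // Start at the whitespace boundary, then pull the start forward to the
    // closest '.' found between that boundary and the candidate (assumed).
    static int prefixStart(String s, int pos) {
        int start = previousSpaceIndex(s, pos);
        for (int i = pos - 1; i > start; i--) {
            if (s.charAt(i) == '.') {
                return i;
            }
        }
        return start;
    }

    // Mirror image: pull the end back to the closest '.' found between the
    // candidate and the following whitespace boundary (assumed).
    static int suffixEnd(String s, int pos) {
        int end = nextSpaceIndex(s, pos);
        for (int i = pos + 1; i < end; i++) {
            if (s.charAt(i) == '.') {
                return i;
            }
        }
        return end;
    }
}
```

Under these assumptions, for the candidate `.' after "S" in "in the U.S.A. today", the bounds shrink from the surrounding spaces to the neighbouring periods, while a `!' inside the same window would leave them unchanged.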

     
    • Jason Baldridge

      Jason Baldridge - 2001-10-23

      Good question. Gann coded that, so maybe he knows?  Looking quickly at it, I don't see why the other characters shouldn't be scanned for as well.

       
    • Gann Bierner

      Gann Bierner - 2001-10-23

      Actually, I'm almost certain I did this to get better results with acronyms containing "." (A.S.P.C.A., U.S.A., etc.).  I'm peeved with myself for not commenting that code (sorry).  I'll think about it and try to recall the exact problem and why this solved it.

      Anyway, obviously this does not require checking for "?" or "!".

       
