#235 Galago: TagTokenizer misses some single-character tokens

v5.x
closed
galago (57)
1
2014-06-19
2014-04-28
Weize Kong
No

org.lemurproject.galago.core.parse.TagTokenizer tokenizes "17.1" into {"17"} instead of {"17", "1"}. The problem is in the method tokenAcronymProcessing, at line 544:

if (token.length() - s > 1) { // should change to token.length() - s > 0
    String subtoken = token.substring(s);
    addToken(subtoken, start + s, end);
}

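The condition token.length() - s > 1 misses single-character tokens: when only one character remains after position s (the "1" in "17.1"), token.length() - s equals 1, so the block is skipped and that character is never added.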
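
The off-by-one can be reproduced in isolation with a minimal sketch. This is not Galago's actual tokenAcronymProcessing code: the class name TrailingSubtokenSketch, the split helper, and the minRemainder parameter are hypothetical, and the handling of the pieces before the last period is an assumption made only so the example runs. Only the final check mirrors the condition quoted in this ticket.

```java
import java.util.ArrayList;
import java.util.List;

public class TrailingSubtokenSketch {

    // Hypothetical simplification: split a token on '.' and collect the pieces.
    // minRemainder is the threshold compared against (token.length() - s):
    // 1 reproduces the reported condition, 0 is the proposed fix.
    static List<String> split(String token, int minRemainder) {
        List<String> out = new ArrayList<>();
        int s = 0; // start offset of the current subtoken
        for (int i = 0; i < token.length(); i++) {
            if (token.charAt(i) == '.') {
                if (i > s) { // assumption: keep any non-empty inner piece
                    out.add(token.substring(s, i));
                }
                s = i + 1;
            }
        }
        // Trailing piece after the last period: the check from the ticket.
        if (token.length() - s > minRemainder) {
            out.add(token.substring(s));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(split("17.1", 1)); // prints [17]
        System.out.println(split("17.1", 0)); // prints [17, 1]
    }
}
```

With the threshold at 0, an empty remainder (for example the token "17.") is still skipped, so the proposed change should not introduce empty tokens in this sketch.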

Discussion

  • Weize Kong - 2014-04-28
    • Group: v1.x --> v5.x

  • David Fisher - 2014-04-30
    • assigned_to: Weize Kong

  • David Fisher - 2014-06-10
    • status: open --> accepted

  • David Fisher - 2014-06-10

    To ship in the 6/2014 release.

  • David Fisher - 2014-06-19
    • Status: accepted --> closed