The WordFeatures class has a dictionary (WordsInTrain),
which uses a tokenGenerator to tokenize the input data
when building the dictionary.
When firing features, however, the WordFeatures class
does not apply this tokenization. If the user supplies
a custom tokenizer instead of Model.TokenGenerator,
WordFeatures will not receive the correctly tokenized
string, so lookups against the dictionary can fail.
Example of a possible scenario:
Assume there is a tokenGenerator that converts each
special character in the input string to a special token.
dict = new WordsInTrain(new
While firing features, the tokens will not get
transformed and thus feature generation will not be
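
The mismatch described above can be sketched in a few lines of self-contained Java. This is only an illustration under stated assumptions, not the project's actual code: the class TokenizerMismatchSketch and the methods customTokenize, plainTokenize, buildDictionary and fireWordFeatures are all hypothetical stand-ins, and the "<SPECIAL>" token and "W_" feature prefix are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from the real library.
public class TokenizerMismatchSketch {

    // Assumed custom tokenizer: splits on whitespace and maps any token made
    // only of non-alphanumeric characters to the special token "<SPECIAL>".
    static List<String> customTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue;
            }
            tokens.add(raw.matches("[^\\p{Alnum}]+") ? "<SPECIAL>" : raw);
        }
        return tokens;
    }

    // Assumed plain tokenizer: whitespace split only, no transformation.
    static List<String> plainTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // Stand-in for building the training dictionary with a given tokenizer.
    static Set<String> buildDictionary(List<String> sentences,
                                       Function<String, List<String>> tokenizer) {
        Set<String> dict = new HashSet<>();
        for (String sentence : sentences) {
            dict.addAll(tokenizer.apply(sentence));
        }
        return dict;
    }

    // Stand-in for firing word features: one feature per token found in the
    // dictionary. The tokens are used as given, with no re-tokenization.
    static List<String> fireWordFeatures(Set<String> dict, List<String> tokens) {
        List<String> features = new ArrayList<>();
        for (String token : tokens) {
            if (dict.contains(token)) {
                features.add("W_" + token);
            }
        }
        return features;
    }

    public static void main(String[] args) {
        List<String> train = new ArrayList<>();
        train.add("price is $ 5");
        Set<String> dict = buildDictionary(train, TokenizerMismatchSketch::customTokenize);

        String input = "cost is $ 9";
        // Untransformed tokens: "$" is not in the dictionary, so its feature is lost.
        System.out.println(fireWordFeatures(dict, plainTokenize(input)));
        // Tokens produced by the same tokenizer used in training: "<SPECIAL>" matches.
        System.out.println(fireWordFeatures(dict, customTokenize(input)));
    }
}
```

In this sketch the dictionary contains "<SPECIAL>" but never "$", so features for special characters fire only when the same tokenizer is applied at feature time as at dictionary-building time.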