Thread: [infomap-nlp-users] missing words

I built an Infomap model from a very large corpus of newspaper articles
(100+ million words). I can query words with associate, but some words that
occur in the corpus and are NOT stopwords are missing from the model: I get
the response "no word vector for X." Is there a frequency threshold? For
example, "falklands" is not in the model even though it occurs more than
500 times in the corpus.
If there is such a threshold, can I turn it off?
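
For anyone who wants to check a word's frequency independently of the model, here is a minimal Python sketch. It assumes a plain-text corpus and a simple lowercase alphabetic tokenizer, which may not match Infomap's own tokenization; the function names are hypothetical. It reports the rank as well as the raw count, because if the vocabulary were capped at the N most frequent words (an assumption, not something confirmed here), the rank would matter more than the count.

```python
import re
from collections import Counter

def word_counts(lines):
    """Count lowercase alphabetic tokens (assumed tokenization, not Infomap's)."""
    counts = Counter()
    for line in lines:
        counts.update(re.findall(r"[a-z]+", line.lower()))
    return counts

def frequency_rank(counts, word):
    """Return the 1-based frequency rank of word, or None if it never occurs.

    Words with the same count share the same rank.
    """
    if word not in counts:
        return None
    target = counts[word]
    return 1 + sum(1 for c in counts.values() if c > target)

# Tiny stand-in for the corpus; replace with lines read from the real files.
sample = [
    "The Falklands war began in 1982.",
    "The war in the Falklands ended.",
]
counts = word_counts(sample)
print(counts["falklands"], frequency_rank(counts, "falklands"))
```

On the real corpus, pass an open file object (or a generator over several files) in place of the sample list, so the text is read line by line rather than loaded into memory at once.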
Thanks,
Gabriel Murray