From: Beate D. <do...@IM...> - 2005-02-07 11:43:00
Hi Mich,

> - Using the -n tag, one can extend the number of neighbours of
> associate's output to 200. Is this a hard limit or could this
> be altered to the theoretical maximum that is near the vocabulary size?

There is no limit to the number of neighbors. You'll simply have to
change MAX_NEIGHBOR and PRINT_NEIGHBOR in associate.h to a bigger
number and use associate -n <big_number>.

> Another, unrelated question that should be easily answered then.
> If I recall correctly, some words in infomap are not taken into account,
> for example, the word 'the' does not hold semantic content and is
> therefore not taken into consideration in computing correspondences. Am
> I right there? Since I am trying to get this program to work using Dutch
> documents, I was wondering which info-map file holds the words that are
> ignored in the computation, so it would be more easy to adapt it to the
> current language.

The words to be disregarded are listed in "stop.list" in the admin
directory. If you want to use a different set of stopwords, you can
simply set STOPLIST_FILE in "default-params.in" to point to a different
stoplist. The file must contain exactly one stopword per line.

Best wishes,
Beate
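
P.S. For the one-stopword-per-line format, a minimal sketch of a helper that turns a loosely formatted word list into such a file might look like the following. The function names, the separators accepted (whitespace, commas, semicolons), and the UTF-8 output encoding are all assumptions made for illustration; they are not part of infomap, and the encoding may need to match whatever your corpus uses.

```python
import re


def normalize_stoplist(raw_text):
    """Split raw_text into stopwords, dropping blanks and duplicates.

    Hypothetical helper, not part of infomap. The order of first
    occurrence is preserved.
    """
    seen = set()
    words = []
    # Assumption: words are separated by whitespace, commas, or semicolons.
    for token in re.split(r"[\s,;]+", raw_text):
        if token and token not in seen:
            seen.add(token)
            words.append(token)
    return words


def write_stoplist(words, path, encoding="utf-8"):
    """Write the words to path, exactly one stopword per line."""
    # Assumption: UTF-8 output; change encoding to match your corpus.
    with open(path, "w", encoding=encoding) as handle:
        for word in words:
            handle.write(word + "\n")
```

Calling write_stoplist(normalize_stoplist(text), "dutch-stop.list") would produce a file that STOPLIST_FILE could then point to; the file name here is only an example.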