From: Alan J. S. <sa...@gm...> - 2005-10-02 22:14:48
Hi.

I'm using infomap 0.8.6 to build a matrix, but nothing I do seems to work. I've formatted the King James Bible using the doc and text tags thus:

<doc>
<text>
The Old Testament of...
[...missing text...]
End of the Project Gutenberg Edition of the King James Bible
</text>
</doc>

and tried the infomap-build command thus:

:infomap-build -s /Users/alan/Projects/infomap/corpus/kjv10.txt test1

The output is below. However, when I run the associate command thus:

:associate -t -c test1 God

I get:

:No word/document vector for "God".

The same happens for "GOD" and "god". I've tried this on OS X, Ubuntu and Mandrake Linux, all with the same results. I've checked the KJ Bible to ensure that it contains no errant tags (it does not) and followed the examples in the user manual to the letter, and still nothing.

I've also tried changing the SVD_ITER value to 400 thus:

:infomap-build -D SVD_ITER=400 -s /Users/alan/Projects/infomap/corpus/kjv10.txt test1

which took the new value into account, but again the associate command doesn't work.

Can anyone shed any light on what is going wrong, or on what I should be doing but don't yet know about?

Rgds,
Alan J. Salmoni.

This is the output of the infomap-build command (with SVD_ITER changed to 400):

Sourcing param file "/usr/local/share/infomap-nlp/default-params"
Sourcing extra param file "/tmp/infomap-build.kyInQJ"
Contents are:
SVD_ITER=400
Removing extra param file
WORKING_DATA_DIR = "/Users/alan/Projects/infomap/corpus/models/test1"
CORPUS_DIR = "/Users/alan/Projects/infomap/corpus"
CORPUS_FILE = "/Users/alan/Projects/infomap/corpus/kjv10.txt"
FNAMES_FILE = ""
ROWS = "20000"
COLUMNS = "1000"
SINGVALS = "100"
SVD_ITER = "400"
PRE_CONTEXT_SIZE = "15"
POST_CONTEXT_SIZE = "15"
WRITE_MATLAB_FORMAT = "0"
VALID_CHARS_FILE = "/usr/local/share/infomap-nlp/valid_chars.en"
STOPLIST_FILE = "/usr/local/share/infomap-nlp/stop.list"
COL_LABELS_FROM_FILE = "0"
COL_LABEL_FILE = ""
echo "Making datadir"
Making datadir
mkdir -p /Users/alan/Projects/infomap/corpus/models/test1
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/wordlist
Prerequisites:
    /Users/alan/Projects/infomap/corpus/kjv10.txt
Sat Oct 1 15:49:47 BST 2005
..................................................
prepare_corpus \
    -cdir "/Users/alan/Projects/infomap/corpus" \
    -mdir "/Users/alan/Projects/infomap/corpus/models/test1" \
    -cfile "/Users/alan/Projects/infomap/corpus/kjv10.txt" \
    -fnfile "" \
    -chfile "/usr/local/share/infomap-nlp/valid_chars.en" \
    -slfile "/usr/local/share/infomap-nlp/stop.list" \
    -rptfile ""
Locale set to en_US.
Opening File for "r": "/usr/local/share/infomap-nlp/valid_chars.en"
Opening File for "r": ""
my_fopen: No such file or directory
Opening File for "r": "/usr/local/share/infomap-nlp/stop.list"
Opening File for "w": "/Users/alan/Projects/infomap/corpus/models/test1/wordlist"
Opening File for "r": "/Users/alan/Projects/infomap/corpus/kjv10.txt"
Opening File for "w": "/Users/alan/Projects/infomap/corpus/models/test1/numDocs"
Typecount = 0
Preparing to sort ...
Sorting ...
Done.
Opening File for "w": "/Users/alan/Projects/infomap/corpus/models/test1/dic"
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/wordlist
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/coll
Prerequisites:
    /Users/alan/Projects/infomap/corpus/models/test1/wordlist
    /Users/alan/Projects/infomap/corpus/models/test1/dic
    /Users/alan/Projects/infomap/corpus/models/test1/numDocs
Sat Oct 1 15:49:48 BST 2005
..................................................
count_wordvec \
    -mdir /Users/alan/Projects/infomap/corpus/models/test1 \
    -matlab 0 \
    -precontext 15 \
    -postcontext 15 \
    -rows 20000 \
    -columns 1000 \
    -col_labels_from_file 0 \
    -col_label_file ""
model data dir is "/Users/alan/Projects/infomap/corpus/models/test1".
count_wordvec.c: looking for 0 rows which had better match 0
Reading the dictionary...
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/dic"
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/numDocs"
Initializing row indices...Done.
Initializing column indices...Done.
Allocating matrix memory...done.
Initializing matrix...done.
model data dir is "/Users/alan/Projects/infomap/corpus/models/test1".
count_wordvec.c: about to call process_wordlist
Entering process_wordlist.
About to call initialize_wordlist.
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/wordlist"
Returned from initialize_wordlist.
Writing the co-occurrence matrix.
Entering write_matrix_svd; rows = 0 and columns = 1000.
Opening File for "w": "/Users/alan/Projects/infomap/corpus/models/test1/coll"
Opening File for "w": "/Users/alan/Projects/infomap/corpus/models/test1/indx"
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/coll
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/left
Prerequisites:
    /Users/alan/Projects/infomap/corpus/models/test1/coll
    /Users/alan/Projects/infomap/corpus/models/test1/indx
Sat Oct 1 15:49:48 BST 2005
..................................................
cd /Users/alan/Projects/infomap/corpus/models/test1 && rm -f svd_diag left \
    rght sing
cd /Users/alan/Projects/infomap/corpus/models/test1 && svdinterface \
    -singvals 100 \
    -iter 400
This is svdinterface.
Writing to: left
Writing to: rght
Writing to: sing
Writing to: svd_diag
Reading: indx
Reading: indx
Reading: coll
FEWER THAN EXPECTED SINGULAR VALUES
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/left
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin
Prerequisites:
    /Users/alan/Projects/infomap/corpus/models/test1/left
    /Users/alan/Projects/infomap/corpus/models/test1/dic
Sat Oct 1 15:49:49 BST 2005
..................................................
encode_wordvec \
    -m /Users/alan/Projects/infomap/corpus/models/test1
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/left"
Opening File for "w": "/Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin"
Reading the dictionary...
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/dic"
Initializing row indices...Done.
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/artvec.bin
Prerequisites:
    /Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin
    /Users/alan/Projects/infomap/corpus/models/test1/wordlist
    /Users/alan/Projects/infomap/corpus/models/test1/dic
    /Users/alan/Projects/infomap/corpus/models/test1/numDocs
Sat Oct 1 15:49:49 BST 2005
..................................................
count_artvec -m /Users/alan/Projects/infomap/corpus/models/test1
Opening File for "w": "/Users/alan/Projects/infomap/corpus/models/test1/artvec.bin"
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/numDocs"
Reading the dictionary...
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/dic"
Initializing row indices...Done.
Allocating matrix memory...done.
Initializing matrix...done.
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin"
count_artvec.c: about to read 0 rows from wordvector file.
Entering process_wordlist.
About to call initialize_wordlist.
Opening File for "r": "/Users/alan/Projects/infomap/corpus/models/test1/wordlist"
Returned from initialize_wordlist.
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/artvec.bin
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/model_params.txt
Prerequisites:
    /Users/alan/Projects/infomap/corpus/models/test1/model_params.bin
    /Users/alan/Projects/infomap/corpus/models/test1/model_info.bin
    /Users/alan/Projects/infomap/corpus/models/test1/corpus_format.bin
Sat Oct 1 15:49:49 BST 2005
..................................................
write_text_params -mdir /Users/alan/Projects/infomap/corpus/models/test1
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/model_params.txt
==================================================
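
A build log like the one above can be screened mechanically for signs that no words were counted. What follows is only a minimal sketch in Python, assuming the log has been saved as plain text with any quoted-printable `=3D` escapes already decoded to `=`. The function name `find_empty_model_signs` and the pattern list are hypothetical: the patterns are copied from lines in this log and are not an official list of infomap diagnostics.

```python
import re

# Patterns taken from lines in the log above (an assumption, not an
# official list of infomap-nlp diagnostics).
EMPTY_MODEL_PATTERNS = [
    re.compile(r"Typecount\s*=\s*0\b"),
    re.compile(r"rows\s*=\s*0\b"),
    re.compile(r"about to read 0 rows"),
    re.compile(r"FEWER THAN EXPECTED SINGULAR VALUES"),
]


def find_empty_model_signs(log_text):
    """Return the log lines that match any of the patterns above."""
    hits = []
    for line in log_text.splitlines():
        if any(p.search(line) for p in EMPTY_MODEL_PATTERNS):
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    # A few lines excerpted from the log in this message.
    sample = (
        "Typecount = 0\n"
        "Preparing to sort ...\n"
        "Entering write_matrix_svd; rows = 0 and columns = 1000.\n"
        "FEWER THAN EXPECTED SINGULAR VALUES\n"
    )
    for hit in find_empty_model_signs(sample):
        print(hit)
```

If the function returns any lines, the wordlist step probably found no usable tokens, which would be consistent with associate reporting that no word vector exists.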