From: Alan J. S. <sa...@gm...> - 2005-10-02 22:14:48
Hi.
I'm using infomap 0.8.6 to build a matrix, but nothing I do seems to work.
I've formatted the King James Bible using the doc and text tags thus:
<doc>
<text>
The Old Testament of...
[...missing text...]
End of the Project Gutenberg Edition of the King James Bible
</text>
</doc>
and tried the infomap-build command thus:
:infomap-build -s /Users/alan/Projects/infomap/corpus/kjv10.txt test1
The output is below. However, when I run the associate command thus:
:associate -t -c test1 God
I get:
:No word/document vector for "God".
The same happens for "GOD" and "god". I've tried this on OS X, Ubuntu and
Mandrake Linux, all with the same results. I've checked the KJ Bible to
ensure that it contains no errant tags (it does not) and followed the
examples in the user manual to the letter, and still nothing. I've also
tried changing the SVD_ITER value to 400 thus:
:infomap-build -D SVD_ITER=400 -s /Users/alan/Projects/infomap/corpus/kjv10.txt test1
which took the new value into account, but again the associate command
doesn't work. Can anyone shed any light on what is going wrong or on
what I should be doing but don't yet know about?
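The tag check described above can be sketched roughly as follows. This is only an illustrative sketch and not part of infomap: the helper names `count_tags` and `tags_balanced` are made up here, and it assumes the corpus is small enough to read into memory. It counts each doc/text tag by its exact spelling (including case), so the result can be compared against the spelling used in the user manual examples.

```python
import re
from collections import Counter

# Matches <doc>, </doc>, <text>, </text> in any letter case.
TAG_RE = re.compile(r"</?(?:doc|text)>", re.IGNORECASE)


def count_tags(corpus_text):
    """Count every doc/text tag, keyed by its exact spelling in the corpus."""
    return Counter(m.group(0) for m in TAG_RE.finditer(corpus_text))


def tags_balanced(counts):
    """True if every tag spelling has an equal number of matching partners."""
    for tag, n in counts.items():
        if tag.startswith("</"):
            partner = "<" + tag[2:]   # closing tag -> its opening spelling
        else:
            partner = "</" + tag[1:]  # opening tag -> its closing spelling
        if counts.get(partner, 0) != n:
            return False
    return True


# Tiny stand-in for the real corpus file (hypothetical sample text).
sample = "<doc>\n<text>\nIn the beginning\n</text>\n</doc>\n"
counts = count_tags(sample)
print(dict(counts))
print(tags_balanced(counts))
```

To run it on the real corpus, read the file first, for example `count_tags(open("kjv10.txt").read())`.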
Rgds,
Alan J. Salmoni.
This is the output of the infomap-build command (with SVD_ITER changed to 400):
Sourcing param file "/usr/local/share/infomap-nlp/default-params"
Sourcing extra param file "/tmp/infomap-build.kyInQJ"
Contents are:
SVD_ITER=400
Removing extra param file
WORKING_DATA_DIR = "/Users/alan/Projects/infomap/corpus/models/test1"
CORPUS_DIR = "/Users/alan/Projects/infomap/corpus"
CORPUS_FILE = "/Users/alan/Projects/infomap/corpus/kjv10.txt"
FNAMES_FILE = ""
ROWS = "20000"
COLUMNS = "1000"
SINGVALS = "100"
SVD_ITER = "400"
PRE_CONTEXT_SIZE = "15"
POST_CONTEXT_SIZE = "15"
WRITE_MATLAB_FORMAT = "0"
VALID_CHARS_FILE = "/usr/local/share/infomap-nlp/valid_chars.en"
STOPLIST_FILE = "/usr/local/share/infomap-nlp/stop.list"
COL_LABELS_FROM_FILE = "0"
COL_LABEL_FILE = ""
echo "Making datadir"
Making datadir
mkdir -p /Users/alan/Projects/infomap/corpus/models/test1
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/wordlist
Prerequisites: /Users/alan/Projects/infomap/corpus/kjv10.txt
Sat Oct 1 15:49:47 BST 2005
..................................................
prepare_corpus \
-cdir "/Users/alan/Projects/infomap/corpus" \
-mdir "/Users/alan/Projects/infomap/corpus/models/test1" \
-cfile "/Users/alan/Projects/infomap/corpus/kjv10.txt" \
-fnfile "" \
-chfile "/usr/local/share/infomap-nlp/valid_chars.en" \
-slfile "/usr/local/share/infomap-nlp/stop.list" \
-rptfile ""
Locale set to en_US.
Opening File for "r":
"/usr/local/share/infomap-nlp/valid_chars.en"
Opening File for "r":
""
my_fopen: No such file or directory
Opening File for "r":
"/usr/local/share/infomap-nlp/stop.list"
Opening File for "w":
"/Users/alan/Projects/infomap/corpus/models/test1/wordlist"
Opening File for "r":
"/Users/alan/Projects/infomap/corpus/kjv10.txt"
Opening File for "w":
"/Users/alan/Projects/infomap/corpus/models/test1/numDocs"
Typecount = 0
Preparing to sort ... Sorting ... Done.
Opening File for "w":
"/Users/alan/Projects/infomap/corpus/models/test1/dic"
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/wordlist
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/coll
Prerequisites: /Users/alan/Projects/infomap/corpus/models/test1/wordlist
/Users/alan/Projects/infomap/corpus/models/test1/dic
/Users/alan/Projects/infomap/corpus/models/test1/numDocs
Sat Oct 1 15:49:48 BST 2005
..................................................
count_wordvec \
-mdir /Users/alan/Projects/infomap/corpus/models/test1 \
-matlab 0 \
-precontext 15 \
-postcontext 15 \
-rows 20000 \
-columns 1000 \
-col_labels_from_file 0 \
-col_label_file ""
model data dir is "/Users/alan/Projects/infomap/corpus/models/test1".
count_wordvec.c: looking for 0 rows
which had better match 0
Reading the dictionary... Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/dic"
Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/numDocs"
Initializing row indices...Done.
Initializing column indices...Done.
Allocating matrix memory...done.
Initializing matrix...done.
model data dir is "/Users/alan/Projects/infomap/corpus/models/test1".
count_wordvec.c: about to call process_wordlist
Entering process_wordlist.
About to call initialize_wordlist.
Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/wordlist"
Returned from initialize_wordlist.
Writing the co-occurrence matrix.
Entering write_matrix_svd; rows = 0 and columns = 1000.
Opening File for "w":
"/Users/alan/Projects/infomap/corpus/models/test1/coll"
Opening File for "w":
"/Users/alan/Projects/infomap/corpus/models/test1/indx"
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/coll
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/left
Prerequisites: /Users/alan/Projects/infomap/corpus/models/test1/coll
/Users/alan/Projects/infomap/corpus/models/test1/indx
Sat Oct 1 15:49:48 BST 2005
..................................................
cd /Users/alan/Projects/infomap/corpus/models/test1 && rm -f svd_diag left \
rght sing
cd /Users/alan/Projects/infomap/corpus/models/test1 && svdinterface \
-singvals 100 \
-iter 400
This is svdinterface.
Writing to: left
Writing to: rght
Writing to: sing
Writing to: svd_diag
Reading: indx
Reading: indx
Reading: coll
FEWER THAN EXPECTED SINGULAR VALUES
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/left
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin
Prerequisites: /Users/alan/Projects/infomap/corpus/models/test1/left
/Users/alan/Projects/infomap/corpus/models/test1/dic
Sat Oct 1 15:49:49 BST 2005
..................................................
encode_wordvec \
-m /Users/alan/Projects/infomap/corpus/models/test1
Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/left"
Opening File for "w":
"/Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin"
Reading the dictionary...
Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/dic"
Initializing row indices...Done.
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin
==================================================
==================================================
Building target: /Users/alan/Projects/infomap/corpus/models/test1/artvec.bin
Prerequisites: /Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin
/Users/alan/Projects/infomap/corpus/models/test1/wordlist
/Users/alan/Projects/infomap/corpus/models/test1/dic
/Users/alan/Projects/infomap/corpus/models/test1/numDocs
Sat Oct 1 15:49:49 BST 2005
..................................................
count_artvec -m /Users/alan/Projects/infomap/corpus/models/test1
Opening File for "w":
"/Users/alan/Projects/infomap/corpus/models/test1/artvec.bin"
Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/numDocs"
Reading the dictionary... Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/dic"
Initializing row indices...Done.
Allocating matrix memory...done.
Initializing matrix...done.
Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/wordvec.bin"
count_artvec.c: about to read 0 rows from wordvector file.
Entering process_wordlist.
About to call initialize_wordlist.
Opening File for "r":
"/Users/alan/Projects/infomap/corpus/models/test1/wordlist"
Returned from initialize_wordlist.
..................................................
Finishing target: /Users/alan/Projects/infomap/corpus/models/test1/artvec.bin
==================================================
==================================================
Building target:
/Users/alan/Projects/infomap/corpus/models/test1/model_params.txt
Prerequisites: /Users/alan/Projects/infomap/corpus/models/test1/model_params.bin
/Users/alan/Projects/infomap/corpus/models/test1/model_info.bin
/Users/alan/Projects/infomap/corpus/models/test1/corpus_format.bin
Sat Oct 1 15:49:49 BST 2005
..................................................
write_text_params -mdir /Users/alan/Projects/infomap/corpus/models/test1
..................................................
Finishing target:
/Users/alan/Projects/infomap/corpus/models/test1/model_params.txt
==================================================