From: <wc...@co...> - 2005-01-28 06:50:39
|
Hi. I've followed the compilation instructions in the Infomap User Manual, but have run into a problem with the svd software. Here are the error messages relating to malloc:

Making all in svd
make[2]: Entering directory `/home/wcarus/infomap-nlp-0.8.5/svd'
Making all in svdinterface
make[3]: Entering directory `/home/wcarus/infomap-nlp-0.8.5/svd/svdinterface'
if gcc -DHAVE_CONFIG_H -I. -I. -I../.. -I../../lib -I../../admin -g -O2 -MT svdinterface.o -MD -MP -MF ".deps/svdinterface.Tpo" -c -o svdinterface.o svdinterface.c; \
then mv -f ".deps/svdinterface.Tpo" ".deps/svdinterface.Po"; else rm -f ".deps/svdinterface.Tpo"; exit 1; fi
if gcc -DHAVE_CONFIG_H -I. -I. -I../.. -I../../lib -I../../admin -g -O2 -MT las2.o -MD -MP -MF ".deps/las2.Tpo" -c -o las2.o las2.c; \
then mv -f ".deps/las2.Tpo" ".deps/las2.Po"; else rm -f ".deps/las2.Tpo"; exit 1; fi
if gcc -DHAVE_CONFIG_H -I. -I. -I../.. -I../../lib -I../../admin -g -O2 -MT myutils.o -MD -MP -MF ".deps/myutils.Tpo" -c -o myutils.o myutils.c; \
then mv -f ".deps/myutils.Tpo" ".deps/myutils.Po"; else rm -f ".deps/myutils.Tpo"; exit 1; fi
myutils.c: In function `mymalloc':
myutils.c:167: error: conflicting types for 'malloc'
make[3]: *** [myutils.o] Error 1
make[3]: Leaving directory `/home/wcarus/infomap-nlp-0.8.5/svd/svdinterface'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/wcarus/infomap-nlp-0.8.5/svd'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/wcarus/infomap-nlp-0.8.5'
make: *** [all] Error 2

Any suggestions for correcting this error? Just in case this is important, here's some more information about the version of gcc I'm using:

Reading specs from /usr/lib/gcc/i686-pc-linux-gnu/3.4.3/specs
Configured with: ../gcc-3.4.3/configure --prefix=/usr --enable-shared --enable-languages=c,c++,objc --enable-threads=posix --enable-__cxa_atexit
Thread model: posix
gcc version 3.4.3

Regards,
Win Carus |
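A note on the error above: "conflicting types for 'malloc'" from gcc 3.4 usually means the source file declares malloc() itself with an old-style prototype that disagrees with the one in <stdlib.h>. The exact line in the shipped myutils.c has not been checked here, so the following is only a sketch of the usual repair, with a hypothetical wrapper name: delete the local declaration and rely on the standard header.

    /* char *malloc();   <- a local, pre-ANSI declaration like this is what
                            gcc 3.4 rejects; remove it and include stdlib.h */
    #include <stdlib.h>  /* declares void *malloc(size_t) */
    #include <stdio.h>

    /* mymalloc-style wrapper that relies on the standard prototype. */
    static void *my_malloc_checked(size_t n)
    {
        void *p = malloc(n);
        if (p == NULL) {
            fprintf(stderr, "my_malloc_checked: out of memory (%lu bytes)\n",
                    (unsigned long) n);
            exit(EXIT_FAILURE);
        }
        return p;
    }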
From: <te...@me...> - 2005-01-04 11:03:11
|
METAFORUM NEWSLETTER |
From: Dominic W. <dwi...@cs...> - 2005-01-03 17:40:50
|
Hi Michiel,

I think the software should already have the feature you request. There's a "-f" option, and if you use

    associate -f vector_output_file ....

it should print the results to the file named. Another hack I've regularly used in the past is just

    associate .... > filename

which I'm pretty sure should work on Cygwin.

Best wishes,
Dominic

On Mon, 3 Jan 2005, Spapé, Michiel wrote:
> Dear list,
>
> Following my email earlier, I have a bit of a feature request, which I
> am willing to implement myself, but, to my misfortune, my knowledge of C
> is rather... restricted. What I want to do is use Associate to send
> its nearest neighbour output (documents and words) to a text file
> instead of to the screen. My experience in other programming languages
> tells me this shouldn't be much of a problem, so I hope someone can give
> me a clear answer.
>
> Now, I could be wrong here, but I think this is the part where the
> output is printed:
>
>     /* Print the result. */
>     if ( !print_list( list, num_results_printed )) {
>         fprintf( stderr, "Can't print output.");
>         exit( EXIT_FAILURE);
>     }
>
> What do I need to change in order to get my results written to a
> text file? If someone with a bit of C knowledge could help, I'd greatly
> appreciate it.
>
> Cheers,
>
> Michiel Spapé |
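For completeness, and only as a sketch: if someone does want to change the C rather than use the "-f" option or shell redirection, the change Michiel asks about amounts to handing the printing code a FILE* opened on the output file. The function and argument names below are hypothetical (the real print_list() in the associate sources may be organised differently); the output format mirrors what associate prints to the screen.

    #include <stdio.h>

    /* Hypothetical variant of the printing step: write each neighbour and
       its similarity to `out` instead of stdout. */
    static int print_list_to(FILE *out, char **terms, double *sims, int n)
    {
        int i;
        for (i = 0; i < n; i++) {
            if (fprintf(out, "%s:%f\n", terms[i], sims[i]) < 0)
                return 0;   /* let the caller handle the failure */
        }
        return 1;
    }

    /* Caller side, mirroring the snippet quoted above:
    
        FILE *out = fopen("vector_output_file", "w");
        if (out == NULL || !print_list_to(out, terms, sims, num_results_printed)) {
            fprintf(stderr, "Can't print output.");
            exit(EXIT_FAILURE);
        }
        fclose(out);
    */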
From: manish m. <man...@ya...> - 2005-01-03 13:52:57
|
Hi,

I'd like to know the complexity of the algorithm. I had implemented the LSI algorithm with SVDPACKC and in Java with the Colt library, but the O(n^2) complexity prevented any scale-up notions I had. It would help me if anyone could point me to optimizations already done before I start going through the code.

Thanks and regards,
manish mishra |
From: <MS...@FS...> - 2005-01-03 12:10:35
|
Dear list,

Following my email earlier, I have a bit of a feature request, which I am willing to implement myself, but, to my misfortune, my knowledge of C is rather... restricted. What I want to do is use Associate to send its nearest neighbour output (documents and words) to a text file instead of to the screen. My experience in other programming languages tells me this shouldn't be much of a problem, so I hope someone can give me a clear answer.

Now, I could be wrong here, but I think this is the part where the output is printed:

    /* Print the result. */
    if ( !print_list( list, num_results_printed )) {
        fprintf( stderr, "Can't print output.");
        exit( EXIT_FAILURE);
    }

What do I need to change in order to get my results written to a text file? If someone with a bit of C knowledge could help, I'd greatly appreciate it.

Cheers,

Michiel Spapé |
From: <MS...@FS...> - 2005-01-03 11:57:40
|
Mike,

Have you solved your problem? Looking at the archives, I recognized the problem with count_artvec as one that bugged me for a while as well. I am - alas - also working under cygwin, which I try to minimize as much as possible. Assuming that you want the same, maybe this helps:

1. Go to DOS via the command prompt.
2. Once there, type:
   SET INFOMAP_WORKING_DIR=/home
   SET INFOMAP_MODEL_PATH=/home
3. Go to the cygwin\bin directory and type:
   bash --login -i infomap-build -s /home/corpustxt.txt

That is, assuming you have your single corpus file named corpustxt.txt and have placed it, like everything else, in the home directory.

Don't ask me why, but this seems to work. I hope it will for you as well, and that you don't have to spend ages on reinstalling cygwin and so on.

Cheers,

Michiel Spapé |
From: Mike D. <M.D...@dc...> - 2004-11-22 16:28:06
|
I can't get infomap to build a model successfully. I'm running it under cygwin, and I get the following output. (When I call the programs individually rather than through the infomap-build makefile I get a STATUS_ACCESS_VIOLATION error reported.) All the programs up to count_artvec seem to work fine. Any idea what's going wrong?

cheers,
Mike

mike@GRINDLEFORD /cygdrive/z/infomap-nlp-0.8.5
$ infomap-build -s working/single.txt reallynew
Sourcing param file "/usr/local/share/infomap-nlp/default-params"
Sourcing extra param file "/tmp/infomap-build.Nk2192"
Contents are:
Removing extra param file
WORKING_DATA_DIR = "/tmp/mike/infomap_working_dir/reallynew"
CORPUS_DIR = "working"
CORPUS_FILE = "working/single.txt"
FNAMES_FILE = ""
ROWS = "20000"
COLUMNS = "1000"
SINGVALS = "100"
SVD_ITER = "100"
PRE_CONTEXT_SIZE = "15"
POST_CONTEXT_SIZE = "15"
WRITE_MATLAB_FORMAT = "0"
VALID_CHARS_FILE = "/usr/local/share/infomap-nlp/valid_chars.en"
STOPLIST_FILE = "/usr/local/share/infomap-nlp/stop.list"
COL_LABELS_FROM_FILE = "0"
COL_LABEL_FILE = ""
echo "Making datadir"
Making datadir
mkdir -p /tmp/mike/infomap_working_dir/reallynew
==================================================
Building target: /tmp/mike/infomap_working_dir/reallynew/wordlist
Prerequisites: working/single.txt
Mon Nov 22 16:11:40 GMTST 2004
..................................................
prepare_corpus \
  -cdir "working" \
  -mdir "/tmp/mike/infomap_working_dir/reallynew" \
  -cfile "working/single.txt" \
  -fnfile "" \
  -chfile "/usr/local/share/infomap-nlp/valid_chars.en" \
  -slfile "/usr/local/share/infomap-nlp/stop.list" \
  -rptfile ""
Locale set to (null).
Opening File for "r": "/usr/local/share/infomap-nlp/valid_chars.en"
Opening File for "r": ""
my_fopen: No such file or directory
Opening File for "r": "/usr/local/share/infomap-nlp/stop.list"
Opening File for "w": "/tmp/mike/infomap_working_dir/reallynew/wordlist"
Opening File for "r": "working/single.txt"
Opening File for "w": "/tmp/mike/infomap_working_dir/reallynew/numDocs"
Typecount = 233
Preparing to sort ... Sorting ... Done.
Opening File for "w": "/tmp/mike/infomap_working_dir/reallynew/dic"
..................................................
Finishing target: /tmp/mike/infomap_working_dir/reallynew/wordlist
==================================================
==================================================
Building target: /tmp/mike/infomap_working_dir/reallynew/coll
Prerequisites: /tmp/mike/infomap_working_dir/reallynew/wordlist /tmp/mike/infomap_working_dir/reallynew/dic /tmp/mike/infomap_working_dir/reallynew/numDocs
Mon Nov 22 16:11:41 GMTST 2004
..................................................
count_wordvec \
  -mdir /tmp/mike/infomap_working_dir/reallynew \
  -matlab 0 \
  -precontext 15 \
  -postcontext 15 \
  -rows 20000 \
  -columns 1000 \
  -col_labels_from_file 0 \
  -col_label_file ""
model data dir is "/tmp/mike/infomap_working_dir/reallynew".
count_wordvec.c: looking for 233 rows which had better match 233
Reading the dictionary...
Opening File for "r": "/tmp/mike/infomap_working_dir/reallynew/dic"
Opening File for "r": "/tmp/mike/infomap_working_dir/reallynew/numDocs"
Initializing row indices...Done.
Initializing column indices...Done.
Allocating matrix memory...done.
Initializing matrix...done.
model data dir is "/tmp/mike/infomap_working_dir/reallynew".
count_wordvec.c: about to call process_wordlist
Entering process_wordlist.
About to call initialize_wordlist.
Opening File for "r": "/tmp/mike/infomap_working_dir/reallynew/wordlist"
Returned from initialize_wordlist.
Signal 11
make: *** [/tmp/mike/infomap_working_dir/reallynew/coll] Error 139 |
From: Mich <pse...@zo...> - 2004-10-28 15:42:51
|
Hello Beate,

Thursday, October 28, 2004, 1:27:41 PM, you wrote:

> Hi Mich,
> The "-i" option of associate has to be followed by either "w" (words) or
> "d" (documents). So the correct way to call associate when using a
> document as a query is:
>     associate -d -w <model_dir> -c <corpus> -i d <doc_id>
> where doc_id is the document identifier of one of the documents in the
> corpus. Please note that the option "-i d" works only if the document
> <doc_id> is part of the corpus which you used for building the model, and
> then doc_id has to be the offset of the document in the corpus (the number
> enclosed in <f></f> in the "wordlist" file).
> As far as I understand, you would however like to compare text files to
> the corpus which are not part of the corpus itself.

Of course, it should be possible to temporarily create a new model, and then compare the latest addition to the other documents, right? I wouldn't know how to automate this process though, so that wouldn't really be useful (unless this new addition would always be the last (or 'highest' maybe even?) identifier in the wordlist file).

When going for some sort of automatic grading, this would mean I'd make a model of doc_id_1-doc_id_i - which would all be 'relevant, maybe even perfect' answers to the question asked - with a single student's answer to the question - doc_id_i+1 - included. After that, the 'estimated grade' would be a function of the similarities between doc_id_i+1 and the rest. I am able to program something like this in Pascal (yes, some people still use that!), IF I could somehow make an executable of infomap_build (associate.exe seems to work...).

> In this case, as Dominic suggested, you can simply hand
> over the complete text stream to associate, i.e.

This is maybe easier, though I would have to know how to treat the text. Which words should be left out (such as 'the'), etc.? Maybe you can help me with this, Dominic?

>     associate -d w <model_dir> -c <corpus> -i w <word1> <word2> <word3> ...
> where the <wordi> form the text stream of your query file.

Thanks, that should keep me from typing it wrong several times in a row!

Cheers,
Mich |
From: Beate D. <do...@IM...> - 2004-10-28 11:27:46
|
Hi Mich,

The "-i" option of associate has to be followed by either "w" (words) or "d" (documents). So the correct way to call associate when using a document as a query is:

    associate -d -w <model_dir> -c <corpus> -i d <doc_id>

where doc_id is the document identifier of one of the documents in the corpus. Please note that the option "-i d" works only if the document <doc_id> is part of the corpus which you used for building the model, and then doc_id has to be the offset of the document in the corpus (the number enclosed in <f></f> in the "wordlist" file).

As far as I understand, you would however like to compare text files to the corpus which are not part of the corpus itself. In this case, as Dominic suggested, you can simply hand over the complete text stream to associate, i.e.

    associate -d w <model_dir> -c <corpus> -i w <word1> <word2> <word3> ...

where the <wordi> form the text stream of your query file.

Best,
Beate

On Wed, 27 Oct 2004, Mich wrote:
> Hello Beate, Dominic,
>
> Thank you both for your kind and helpful answers. I've tried what you
> suggest, but it seems the associate command disagrees with me. I have
> a single-file corpus - typically called 'many' - and another
> document (text file in the directory), called 'aow' (from Sun Tzu's
> Art of War). Since the model already works ("associate -t -c many war"
> returns entries such as 'barbaric' and 'pointless' as highly similar),
> I was wondering what it could be.
>
> When I try
>     "associate -d -c many -i aow.txt"
> I would have wished the program returned a list of similarity
> indices between the text file and the corpus. Since aow.txt is a
> document within the corpus, ideally, the program would return an index
> of 1.000 with one document in the corpus. If I had to guess, Homer's
> Odyssey would be high also, as I would see it as remotely similar
> (they are both related to war).
>
> Anyway, when I try this in whatever order, I keep getting as output:
> "Bad option: -i"
>
> When I just leave -i out, it returns
> Am I using a different version (I think I have the latest)? If you
> could help me, could you please state the syntax using the example I
> just gave? Thank you.
>
> Dominic: thanks for your suggestions as well. If I could have this
> program in a Windows environment, I could easily program something
> as you suggested, but although I often wish otherwise, I am not a
> programmer. And what experience I do have is totally unrelated to
> Unix environments. However, it seems that cygwin compiles a few .exe binary
> files, among which associate. Do you think it is possible to use
> these from within Windows (cmd)? Is there a particular reason no
> binaries of infomap-nlp were included?
>
> Hopefully I haven't offended anyone by bringing Microsoft software into
> this discussion!
>
> Cheers,
>
> Mich |
From: Beate D. <do...@IM...> - 2004-10-28 08:50:25
|
Hi Mich,

To answer your second question first: Do you get the same error message when you list the absolute pathnames of your corpus files (relative to the root directory) in the file of filenames? If you also use the absolute pathname of the reference file in the "-m" option, infomap-build should be able to locate the files. The format of your reference file is just fine; it's one file name per line.

Let me know if you are still having problems building the model.

Good luck,
Beate

On Thu, 28 Oct 2004, Mich wrote:
> While I'm - actually again - at it, could I ask a very basic question?
> What exactly is the format for a multiple-file list of corpora? Right
> now, I have a small number of .txt files in a single directory, and
> according to the manual, another file should point towards the other
> files. I have done this by numbering the txt filenames as 1.txt 2.txt
> 3.txt etc., and have made a 'reference file' with all these filenames
> one underneath the other:
> 1.txt
> 2.txt
> 3.txt
> [..]
> 19.txt
> etc.
>
> However, infomap-build doesn't seem to recognize this, stopping with
> 'can't open current corpus file'
> and
> 'make *** [/home/jrandom/infomap-models/ned/wordlist] Error 1'
>
> so I figured the text file probably has a different format from what
> I thought it would be. Actually, I had to guess, since the manual
> states that
>
> "In a multiple-file corpus, each disk file that is part of the corpus
> must contain exactly one document. No tags are used; the entire
> contents of the file are considered to make up the text of the
> document and are processed by the Infomap software."
>
> which really leaves me puzzled as to the exact specifications of the
> reference file. I have tried several alterations in my reference file,
> but infomap-build seems either to stop with the mentioned error message,
> or continue and treat the reference file as a single corpus anyway. If
> someone would send an example of a multi-file reference file, I would
> be most pleased (as the one in the documentation seems lacking).
>
> Thank you kindly
>
> Mich |
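For concreteness, the reference file ("file of filenames") Beate describes is just one corpus file per line. With absolute paths (the locations below are only illustrative) it would look like:

    /home/jrandom/corpus/1.txt
    /home/jrandom/corpus/2.txt
    /home/jrandom/corpus/3.txt

and would be passed to infomap-build together with a model tag, e.g. "infomap-build -m /home/jrandom/corpus/filenames.txt mymodel" (the tag name here is made up), as in the build transcripts elsewhere in this archive.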
From: Mich <pse...@zo...> - 2004-10-28 02:25:45
|
While I'm - actually again - at it, could I ask a very basic question? What exactly is the format for a multiple-file list of corpora? Right now, I have a small number of .txt files in a single directory, and according to the manual, another file should point towards the other files. I have done this by numbering the txt filenames as 1.txt 2.txt 3.txt etc., and have made a 'reference file' with all these filenames one underneath the other:

1.txt
2.txt
3.txt
[..]
19.txt
etc.

However, infomap-build doesn't seem to recognize this, stopping with

'can't open current corpus file'
and
'make *** [/home/jrandom/infomap-models/ned/wordlist] Error 1'

so I figured the text file probably has a different format from what I thought it would be. Actually, I had to guess, since the manual states that

"In a multiple-file corpus, each disk file that is part of the corpus must contain exactly one document. No tags are used; the entire contents of the file are considered to make up the text of the document and are processed by the Infomap software."

which really leaves me puzzled as to the exact specifications of the reference file. I have tried several alterations in my reference file, but infomap-build seems either to stop with the mentioned error message, or continue and treat the reference file as a single corpus anyway. If someone would send an example of a multi-file reference file, I would be most pleased (as the one in the documentation seems lacking).

Thank you kindly

Mich |
From: Mich <pse...@zo...> - 2004-10-27 19:27:37
|
Hello Beate, Dominic,

Thank you both for your kind and helpful answers. I've tried what you suggest, but it seems the associate command disagrees with me. I have a single-file corpus - typically called 'many' - and another document (text file in the directory), called 'aow' (from Sun Tzu's Art of War). Since the model already works ("associate -t -c many war" returns entries such as 'barbaric' and 'pointless' as highly similar), I was wondering what it could be.

When I try

    "associate -d -c many -i aow.txt"

I would have wished the program returned a list of similarity indices between the text file and the corpus. Since aow.txt is a document within the corpus, ideally, the program would return an index of 1.000 with one document in the corpus. If I had to guess, Homer's Odyssey would be high also, as I would see it as remotely similar (they are both related to war).

Anyway, when I try this in whatever order, I keep getting as output:

    "Bad option: -i"

When I just leave -i out, it returns

Am I using a different version (I think I have the latest)? If you could help me, could you please state the syntax using the example I just gave? Thank you.

Dominic: thanks for your suggestions as well. If I could have this program in a Windows environment, I could easily program something as you suggested, but although I often wish otherwise, I am not a programmer. And what experience I do have is totally unrelated to Unix environments. However, it seems that cygwin compiles a few .exe binary files, among which associate. Do you think it is possible to use these from within Windows (cmd)? Is there a particular reason no binaries of infomap-nlp were included?

Hopefully I haven't offended anyone by bringing Microsoft software into this discussion!

Cheers,

Mich |
From: Beate D. <do...@IM...> - 2004-10-27 16:28:28
|
Hi!

> My question, however, concerns if it would be possible to do document-to-document
> comparisons. Could somebody here provide me with info on this?

To answer Mich's question: It is already possible to use documents as queries. By specifying the option "-i d" (for "input is document"), "associate" expects document identifiers rather than words as query "terms". In the case of a single-file corpus, document identifiers are the offsets in the corpus (the numbers enclosed in <f> and </f> in the wordlist file), and in the case of a multiple-file corpus, in which each file constitutes a document, the identifiers are the names of the files. E.g.,

    "associate -d -i d ... <doc_1> .. <doc_k> NOT <doc_k+1> .. <doc_k+n>"

will return documents which are similar to doc_1 .. doc_k and dissimilar to doc_k+1 .. doc_k+n, where the doc_i are the identifiers of the corresponding documents.

To actually compare two documents, you can use "associate -q -i d ..." for each of the two documents to obtain their vector representations, and then simply compute the scalar product.

Best,
Beate |
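As a concrete illustration of the last step Beate describes - taking the two vectors printed by "associate -q -i d" and computing their scalar product - here is a minimal C sketch. The vectors below are toy values, not real Infomap output; if the stored vectors are unit length, the scalar product is the same cosine similarity that associate reports.

    #include <stdio.h>

    /* Dot product of two equal-length vectors; for unit-length vectors this
       is also their cosine similarity. */
    static double dot_product(const double *a, const double *b, int dims)
    {
        int i;
        double sum = 0.0;
        for (i = 0; i < dims; i++)
            sum += a[i] * b[i];
        return sum;
    }

    int main(void)
    {
        /* Toy 3-dimensional vectors; real Infomap vectors have as many
           dimensions as singular values kept (SINGVALS, 100 by default). */
        double doc1[] = { 0.8, 0.6, 0.0 };
        double doc2[] = { 0.6, 0.8, 0.0 };
        printf("similarity: %f\n", dot_product(doc1, doc2, 3));
        return 0;
    }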
From: Mich <pse...@zo...> - 2004-10-26 12:52:15
|
Greetings all,

First of all, let me express my gratitude for your work, that is, infomap-nlp and the fact that you let others use it. One of them being me, who, in search of alternatives to various 'automatic essay grading software', stumbled upon LSA and ultimately Infomap-nlp. It took me a couple of days to get this all running in a Windows environment (using cygwin), which may give you an indication of how much I really know about GNU, Unix, etc. -- nada. Anyway, despite the setbacks, I've worked my way through the manual, built a few models, and am able to use both the associate word and associate document commands. My question, however, concerns whether it would be possible to do document-to-document comparisons. Could somebody here provide me with info on this? If you are interested in knowing why I would want to do this, please read on.

I am currently working for the department of education studies at the University of Leiden (Netherlands). As said, we would be interested in 'automatic grading' - what teacher wouldn't be? Of course, it's not as simple as some companies have suggested, or at least, I doubt it would be. However, experimenting with freely available software under the GNU license won't do harm. On that basis, we actually have three goals:

- automatic grading, which would ideally give something like a grade. However, using infomap-nlp it would be theoretically possible to compare students' essays to a corpus of relevant information, thus giving at least some indication of how 'well' they performed. Also, much like psychology, pedagogics/education suffers from a lack of justifiable scales, so comparisons such as this might also be good - in the sense that they would be objective - indicators of cognitive ability.
- diminishing plagiarism, which would be possible if I could compare each student's essay to a corpus of their peers.
- retrieving information about the student from what they write. You could imagine that the similarity between an essay and various emotionally coloured corpora (such as poetry) would make it possible to develop additional 'emotional scales'.

I hope you find it worthwhile to help me in these three quests.

Cheers,

Mich |
From: Dominic W. <dwi...@cs...> - 2004-08-08 17:02:06
|
Hi Norman,

What operating system are you trying to install the infomap software on? (I'm a bit surprised; this isn't something I've known to be missing before. It would be good to track it down, though I'm not sure how much help I'll be.)

Best wishes,
Dominic

> hi..
>
> I try to install infomap-nlp-0.8.5, but it always tells me that I am
> missing "GNU-compatible malloc()" and "ANSI C headers file". However,
> I am sure that I have installed "glib-devel" and "GNU malloc".
>
> What should I do now?
> Thank you!!
>
> Norman |
From: Norman <rg...@ii...> - 2004-08-08 07:29:23
|
Hi..

I try to install infomap-nlp-0.8.5, but it always tells me that I am missing "GNU-compatible malloc()" and "ANSI C headers file". However, I am sure that I have installed "glib-devel" and "GNU malloc".

What should I do now?
Thank you!!

Norman |
From: Harri S. <har...@he...> - 2004-06-03 07:43:26
|
Hello,

I got the error below from running the infomap-build command. It produces empty coll and indx files, hence associate does not work. I think my input XML is correct; attached you find part of it.

bw,
Harri S

Sourcing param file "/home/h/m/hmtsaari/Programs/infomap/share/infomap-nlp/default-params"
Sourcing extra param file "/tmp/infomap-build.YedMNd"
Contents are:
Removing extra param file
WORKING_DATA_DIR = "/tmp/hmtsaari/infomap_working_dir/text_output_matrix"
CORPUS_DIR = "/home/h/m/hmtsaari/Programs/infomap/Corpus"
CORPUS_FILE = "/home/h/m/hmtsaari/Programs/infomap/Corpus/text.txt"
FNAMES_FILE = ""
ROWS = "20000"
COLUMNS = "1000"
SINGVALS = "100"
SVD_ITER = "100"
PRE_CONTEXT_SIZE = "15"
POST_CONTEXT_SIZE = "15"
WRITE_MATLAB_FORMAT = "0"
VALID_CHARS_FILE = "/home/h/m/hmtsaari/Programs/infomap/share/infomap-nlp/valid_chars.en"
STOPLIST_FILE = "/home/h/m/hmtsaari/Programs/infomap/share/infomap-nlp/stop.list"
COL_LABELS_FROM_FILE = "0"
COL_LABEL_FILE = ""
echo "Making datadir"
Making datadir
mkdir -p /tmp/hmtsaari/infomap_working_dir/text_output_matrix
==================================================
Building target: /tmp/hmtsaari/infomap_working_dir/text_output_matrix/left
Prerequisites: /tmp/hmtsaari/infomap_working_dir/text_output_matrix/coll /tmp/hmtsaari/infomap_working_dir/text_output_matrix/indx
Wed Jun 2 13:48:56 EEST 2004
..................................................
cd /tmp/hmtsaari/infomap_working_dir/text_output_matrix && rm svd_diag left \
  rght sing
cd /tmp/hmtsaari/infomap_working_dir/text_output_matrix && svdinterface \
  -singvals 100 \
  -iter 100

***********************
***************************
Harri M. T. Saarikoski
Language Technologist | Doctorate Student
AAC Global (R & D) | Helsinki University (Language Tech)
phone +358 40 705 85 41
e-mail har...@he... |
From: <ben...@id...> - 2004-05-25 09:52:17
|
Dear Open Source developer,

I am doing a research project on "Fun and Software Development" in which I kindly invite you to participate. You will find the online survey under http://fasd.ethz.ch/qsf/. The questionnaire consists of 53 questions and you will need about 15 minutes to complete it.

With the FASD project (Fun and Software Development) we want to define the motivational significance of fun when software developers decide to engage in Open Source projects. What is special about our research project is that a similar survey is planned with software developers in commercial firms. This procedure allows the immediate comparison between the involved individuals and the conditions of production of these two development models. Thus we hope to obtain substantial new insights into the phenomenon of Open Source Development.

With many thanks for your participation,
Benno Luthiger

PS: The results of the survey will be published under http://www.isu.unizh.ch/fuehrung/blprojects/FASD/. We have set up the mailing list fa...@we... for this study. Please see http://fasd.ethz.ch/qsf/mailinglist_en.html for registration to this mailing list.

_______________________________________________________________________
Benno Luthiger
Swiss Federal Institute of Technology Zurich
8092 Zurich
Mail: benno.luthiger(at)id.ethz.ch
_______________________________________________________________________ |
From: Mathias P. <Mat...@vi...> - 2004-05-19 08:53:20
|
Greetings everybody,

Do you have any idea what the following errors might mean? infomap-0.8.5 compiled cleanly on FreeBSD-current, using either the system dbm or gdbm. The only change necessary was to turn "make ..." into "gmake ..." in infomap-build. Both versions show the same error messages, one pointing to dbm. That's why I tried it with the system dbm and gdbm, without success. The version used for these tests uses gdbm:

venus% ldd /usr/local/bin/prepare_corpus
/usr/local/bin/prepare_corpus:
libgdbm.so.3 => /usr/local/lib/libgdbm.so.3 (0x28079000)
libm.so.2 => /lib/libm.so.2 (0x2807f000)
libc.so.5 => /lib/libc.so.5 (0x28098000)

Any ideas would be very welcome!

Thanks,
Mathias

P.S.: Here come the two error messages. If I run infomap -m again after infomap -s, I get the same error as infomap -s instead of the one shown here for infomap -m.

1) first try with infomap -m:

venus% export INFOMAP_WORKING_DIRECTORY=/tmp
venus% rm /tmp/gb/*
zsh: sure you want to delete all the files in /tmp/gb [yn]? y
venus% infomap-build -m gb-corpus.txt gb
Sourcing param file "/usr/local/share/infomap-nlp/default-params"
Sourcing extra param file "/tmp/infomap-build.JDWu66"
Contents are:
Removing extra param file
WORKING_DATA_DIR = "/tmp/gb"
CORPUS_DIR = "."
CORPUS_FILE = ""
FNAMES_FILE = "gb-corpus.txt"
ROWS = "20000"
COLUMNS = "1000"
SINGVALS = "100"
SVD_ITER = "100"
PRE_CONTEXT_SIZE = "15"
POST_CONTEXT_SIZE = "15"
WRITE_MATLAB_FORMAT = "0"
VALID_CHARS_FILE = "/usr/local/share/infomap-nlp/valid_chars.en"
STOPLIST_FILE = "/usr/local/share/infomap-nlp/stop.list"
COL_LABELS_FROM_FILE = "0"
COL_LABEL_FILE = ""
echo "Making datadir"
Making datadir
mkdir -p /tmp/gb
==================================================
Building target: /tmp/gb/wordlist
Prerequisites: gb-corpus.txt
Wed May 19 10:53:49 CEST 2004
..................................................
prepare_corpus \
  -cdir "." \
  -mdir "/tmp/gb" \
  -cfile "" \
  -fnfile "gb-corpus.txt" \
  -chfile "/usr/local/share/infomap-nlp/valid_chars.en" \
  -slfile "/usr/local/share/infomap-nlp/stop.list" \
  -rptfile ""
Locale set to (null).
Opening File for "r": "/usr/local/share/infomap-nlp/valid_chars.en"
Opening File for "r": "gb-corpus.txt"
Opening File for "r": "/usr/local/share/infomap-nlp/stop.list"
Opening File for "w": "/tmp/gb/wordlist"
Can't open nu2na database
gmake: *** [/tmp/gb/wordlist] Error 1
venus%

------------------------ next error msg ------------------------

2) infomap -s shows this:

venus% infomap-build -s 1.html gb
Sourcing param file "/usr/local/share/infomap-nlp/default-params"
Sourcing extra param file "/tmp/infomap-build.kDVdtJ"
Contents are:
Removing extra param file
WORKING_DATA_DIR = "/tmp/gb"
CORPUS_DIR = "."
CORPUS_FILE = "1.html"
FNAMES_FILE = ""
ROWS = "20000"
COLUMNS = "1000"
SINGVALS = "100"
SVD_ITER = "100"
PRE_CONTEXT_SIZE = "15"
POST_CONTEXT_SIZE = "15"
WRITE_MATLAB_FORMAT = "0"
VALID_CHARS_FILE = "/usr/local/share/infomap-nlp/valid_chars.en"
STOPLIST_FILE = "/usr/local/share/infomap-nlp/stop.list"
COL_LABELS_FROM_FILE = "0"
COL_LABEL_FILE = ""
echo "Making datadir"
Making datadir
mkdir -p /tmp/gb
==================================================
Building target: /tmp/gb/dic
Prerequisites: 1.html
Wed May 19 10:55:14 CEST 2004
..................................................
prepare_corpus \
  -cdir "." \
  -mdir "/tmp/gb" \
  -cfile "1.html" \
  -fnfile "" \
  -chfile "/usr/local/share/infomap-nlp/valid_chars.en" \
  -slfile "/usr/local/share/infomap-nlp/stop.list" \
  -rptfile ""
Locale set to (null).
Opening File for "r": "/usr/local/share/infomap-nlp/valid_chars.en"
Opening File for "r": ""
my_fopen: No such file or directory
Opening File for "r": "/usr/local/share/infomap-nlp/stop.list"
Opening File for "w": "/tmp/gb/wordlist"
Opening File for "r": "1.html"
Opening File for "w": "/tmp/gb/numDocs"
Typecount = 0
Preparing to sort ... Sorting ... Done.
Opening File for "w": "/tmp/gb/dic"
..................................................
Finishing target: /tmp/gb/dic
==================================================
==================================================
Building target: /tmp/gb/coll
Prerequisites: /tmp/gb/wordlist /tmp/gb/dic /tmp/gb/numDocs
Wed May 19 10:55:14 CEST 2004
..................................................
count_wordvec \
  -mdir /tmp/gb \
  -matlab 0 \
  -precontext 15 \
  -postcontext 15 \
  -rows 20000 \
  -columns 1000 \
  -col_labels_from_file 0 \
  -col_label_file ""
model data dir is "/tmp/gb".
count_wordvec.c: looking for 0 rows which had better match 0
Reading the dictionary...
Opening File for "r": "/tmp/gb/dic"
Opening File for "r": "/tmp/gb/numDocs"
Initializing row indices...Done.
Initializing column indices...Done.
Allocating matrix memory...done.
Initializing matrix...done.
model data dir is "/tmp/gb".
count_wordvec.c: about to call process_wordlist
Entering process_wordlist.
About to call initialize_wordlist.
Opening File for "r": "/tmp/gb/wordlist"
Returned from initialize_wordlist.
Writing the co-occurrence matrix.
Entering write_matrix_svd; rows = 0 and columns = 1000.
Opening File for "w": "/tmp/gb/coll"
Opening File for "w": "/tmp/gb/indx"
..................................................
Finishing target: /tmp/gb/coll
==================================================
==================================================
Building target: /tmp/gb/left
Prerequisites: /tmp/gb/coll /tmp/gb/indx
Wed May 19 10:55:14 CEST 2004
..................................................
cd /tmp/gb && rm svd_diag left \
  rght sing
rm: svd_diag: No such file or directory
rm: left: No such file or directory
rm: rght: No such file or directory
rm: sing: No such file or directory
gmake: [/tmp/gb/left] Error 1 (ignored)
cd /tmp/gb && svdinterface \
  -singvals 100 \
  -iter 100
This is svdinterface.
Writing to: left
Writing to: rght
Writing to: sing
Writing to: svd_diag
Reading: indx
Reading: indx
Reading: coll
FEWER THAN EXPECTED SINGULAR VALUES
..................................................
Finishing target: /tmp/gb/left
==================================================
==================================================
Building target: /tmp/gb/wordvec.bin
Prerequisites: /tmp/gb/left /tmp/gb/dic
Wed May 19 10:55:14 CEST 2004
..................................................
encode_wordvec \
  -m /tmp/gb
Opening File for "r": "/tmp/gb/left"
Can't Open /tmp/gb/word2offset
Failed dbm_open(): Invalid argument
gmake: *** [/tmp/gb/wordvec.bin] Error 1

------------------------ next error msg ------------------------

3) running infomap -m again without clearing the working dir:

venus% infomap-build -m gb-corpus.txt gb
Sourcing param file "/usr/local/share/infomap-nlp/default-params"
Sourcing extra param file "/tmp/infomap-build.lRMTsa"
Contents are:
Removing extra param file
WORKING_DATA_DIR = "/tmp/gb"
CORPUS_DIR = "."
CORPUS_FILE = ""
FNAMES_FILE = "gb-corpus.txt"
ROWS = "20000"
COLUMNS = "1000"
SINGVALS = "100"
SVD_ITER = "100"
PRE_CONTEXT_SIZE = "15"
POST_CONTEXT_SIZE = "15"
WRITE_MATLAB_FORMAT = "0"
VALID_CHARS_FILE = "/usr/local/share/infomap-nlp/valid_chars.en"
STOPLIST_FILE = "/usr/local/share/infomap-nlp/stop.list"
COL_LABELS_FROM_FILE = "0"
COL_LABEL_FILE = ""
echo "Making datadir"
Making datadir
mkdir -p /tmp/gb
==================================================
Building target: /tmp/gb/wordvec.bin
Prerequisites: /tmp/gb/left /tmp/gb/dic
Wed May 19 10:52:06 CEST 2004
..................................................
encode_wordvec \
  -m /tmp/gb
Opening File for "r": "/tmp/gb/left"
Can't Open /tmp/gb/word2offset
Failed dbm_open(): Invalid argument
gmake: *** [/tmp/gb/wordvec.bin] Error 1
venus% |
From: Dominic W. <dwi...@cs...> - 2004-05-06 15:38:00
|
> There must be an easier way, but I think not many people will be
> interested in the raw document vectors (or am I wrong)?

Hi Menno,

It sounds like your work-around to get the document vectors is pretty effective, though as you say there should be an easier way. For word and query vectors there's an "associate -q" option which simply prints out the query vector rather than performing a search. One way I've often used to get document vectors is simply to pass the whole document as an argument to "associate -q", which is pretty unsatisfactory though it does have the benefit that you can get document vectors for textfiles that weren't in your original corpus.

If the "associate -q" option was combined with the "associate_doc" function Beate described, this would solve the problem properly, and I could see benefits to making this available (eg. for work on document clustering). It sounds as though you've already got a workable solution, but if enough other people on the list express an interest we should look into it.

I'm delighted to hear about people using the infomap software as part of a richer and more complex system of features - I'd be interested to hear more about your work whenever you are ready.

Best wishes,
Dominic |
From: Menno v. Z. <M.M...@uv...> - 2004-05-05 16:48:52
|
Dear Beate,

* Beate Dorow <do...@IM...> wrote on [2004-05-05 18:13]:
> Are you looking for a way of using a document (or combination of
> documents) as a query to look for related documents or words?
> If so, we have an implementation of this feature in a modified version of
> the infomap-nlp code which we can add on to the package.
> It works similarly to associate, e.g.
> "associate_doc -d doc1 doc2 ... docN NOT docN+1 ... docN+k"
> will return documents which are similar to doc1 ... docN but are unrelated
> to docN+1 ... docN+k.
>
> Or are you just looking for a way of retrieving the vector associated with
> a certain document?

I'm actually building a more complex system where I would like to use the document feature vectors in combination with a lot of other features and compare these ``extended'' feature vectors (by computing distances). What I've done now is change the neighbors.c file so that when computing distances, I simply print out the document vectors. Next, I've changed associate.c to print out the document names (once computing the distances is done). (Oh, I've also changed some constants to print out information on all documents.) This gives me a file like:

<feature vector>
<feature vector>
<feature vector>
...
<document name>
<document name>
<document name>
...

I analyse this file to get the right feature vector for the corresponding document and I'm done. There must be an easier way, but I think not many people will be interested in the raw document vectors (or am I wrong)?

Best regards,

Menno

-------------------------------
- Menno van Zaanen            - They can't stop us,
- mvz...@uv...                - we're on a mission from God!
- http://ilk.uvt.nl/~mvzaanen - -The Blues Brothers
------------------------------- |
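For anyone reproducing Menno's work-around, the pairing step is mechanical once the dump file exists. Below is a minimal C sketch, assuming a hypothetical file whose first half is one vector per line and whose second half is the matching document names in the same order (the file written by a modified neighbors.c/associate.c may well differ in detail).

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Pair each vector line with the document-name line in the second half
       of the dump file (vectors first, names second, same order). */
    int main(int argc, char **argv)
    {
        char buf[8192];
        char **lines = NULL;
        int n = 0, cap = 0, i;
        FILE *in;

        in = fopen(argc > 1 ? argv[1] : "dump.txt", "r");
        if (in == NULL) { perror("fopen"); return 1; }

        while (fgets(buf, sizeof buf, in) != NULL) {
            buf[strcspn(buf, "\r\n")] = '\0';          /* strip the newline  */
            if (n == cap) {                            /* grow pointer array */
                cap = cap ? cap * 2 : 1024;
                lines = realloc(lines, cap * sizeof *lines);
                if (lines == NULL) { perror("realloc"); return 1; }
            }
            lines[n++] = strdup(buf);
        }
        fclose(in);

        if (n % 2 != 0) {
            fprintf(stderr, "expected vectors followed by an equal number of names\n");
            return 1;
        }

        /* The vector on line i belongs to the document named on line i + n/2. */
        for (i = 0; i < n / 2; i++)
            printf("%s\t%s\n", lines[i + n / 2], lines[i]);

        return 0;
    }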
From: Beate D. <do...@IM...> - 2004-05-05 16:03:56
|
Dear Menno,

Are you looking for a way of using a document (or combination of documents) as a query to look for related documents or words? If so, we have an implementation of this feature in a modified version of the infomap-nlp code which we can add on to the package. It works similarly to associate, e.g.

    "associate_doc -d doc1 doc2 ... docN NOT docN+1 ... docN+k"

will return documents which are similar to doc1 ... docN but are unrelated to docN+1 ... docN+k.

Or are you just looking for a way of retrieving the vector associated with a certain document?

Best wishes,
Beate

On Tue, 4 May 2004, Menno van Zaanen wrote:
> Hello,
>
> I would like to get the document vectors that infomap computes for
> each document. I want to have these to compute my own distances
> between documents (instead of computing distances between queries and
> documents).
>
> Is there an easy way to extract the document vectors? Of course I
> would also like to know which document the document vector belongs to.
>
> Best,
>
> Menno |
From: Menno v. Z. <M.M...@uv...> - 2004-05-04 14:30:33
|
Hello,

I would like to get the document vectors that infomap computes for each document. I want to have these to compute my own distances between documents (instead of computing distances between queries and documents).

Is there an easy way to extract the document vectors? Of course I would also like to know which document the document vector belongs to.

Best,

Menno

-------------------------------   Structural linguistics is a bitterly divided
- Menno van Zaanen            -   and unhappy discipline, and a large number of
- mvz...@uv...                -   practitioners spend too many nights drowning
- http://ilk.uvt.nl/~mvzaanen -   their problems in Ouisghian Zodahs.
-------------------------------   -Hitchhikers Guide to the Galaxy |
From: Beate D. <do...@IM...> - 2004-04-26 13:32:11
|
Dear German,

There was a bug in the command-line processing of print_doc. I just committed a new version to the CVS repository. Let me know if you still have difficulties running print_doc. We'll soon post a new official release with the fixes, too.

Best wishes,
Beate

> Dear Dominic,
>
> Thanks a lot; however, it seems that the new version still does not solve the
> problem.
>
> [rigau@meaning infomap-nlp-0.8.5]$ associate -c kjbible god
> god:1.000000
> lord:0.670154
> disgrace:0.617541
> scoffers:0.567710
> deceivableness:0.553055
> ...
>
> [rigau@meaning infomap-nlp-0.8.5]$ associate -d -c kjbible god
> 3999484:0.800569
> 4300759:0.799777
> 4175212:0.796587
> 4179848:0.795709
> 2290800:0.795502
> ...
>
> but
>
> [rigau@meaning infomap-nlp-0.8.5]$ print_doc -c kjbible 3999484
> model_params.c: read_corpus_format(): can't open file: No such file or directory
> Can't read corpus format file
> "/home/rigau/tools/infomap-nlp-0.8.5/models/kibib/corpus_format.bin"
>
> when nothing has been said about a "kibib" model. It does not exist at
> all. There is only the kjbible model in the models folder. Obviously, it is a
> path problem... or maybe I'm doing something wrong...
>
> Best wishes,
>
> German |
From: German R. <ri...@ls...> - 2004-04-24 11:30:26
|
Dominic Widdows wrote:

> Dear German,
>
> This seems to have been a pathnames problem: adding the line
>
>     model_data_dir = strcat( model_dir_parent, model_tag );
>
> seems to have fixed it (and another "++argv; --argc;" has prevented the
> printing of trivial files at the beginning of the results).
>
> I've released a version 0.8.5 with these fixes - please let me know if
> this solves it for you.

Dear Dominic,

Thanks a lot; however, it seems that the new version still does not solve the problem.

[rigau@meaning infomap-nlp-0.8.5]$ associate -c kjbible god
god:1.000000
lord:0.670154
disgrace:0.617541
scoffers:0.567710
deceivableness:0.553055
...

[rigau@meaning infomap-nlp-0.8.5]$ associate -d -c kjbible god
3999484:0.800569
4300759:0.799777
4175212:0.796587
4179848:0.795709
2290800:0.795502
...

but

[rigau@meaning infomap-nlp-0.8.5]$ print_doc -c kjbible 3999484
model_params.c: read_corpus_format(): can't open file: No such file or directory
Can't read corpus format file "/home/rigau/tools/infomap-nlp-0.8.5/models/kibib/corpus_format.bin"

when nothing has been said about a "kibib" model. It does not exist at all. There is only the kjbible model in the models folder. Obviously, it is a path problem... or maybe I'm doing something wrong...

Best wishes,

German |
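The quoted one-line fix builds the model data directory by concatenating two strings in place; the corrupted "kibib" path above suggests that string was not being assembled cleanly on every system. Purely as a defensive sketch (the function and variable names below are hypothetical, and the real model_params.c may do this differently), the path can be formatted into a separate buffer with an explicit separator:

    #include <stdio.h>

    /* Hypothetical helper: join a parent directory and a model tag with an
       explicit "/" separator. Unlike strcat, snprintf cannot overflow dest
       and leaves model_dir_parent untouched. */
    static void make_model_data_dir(char *dest, size_t dest_size,
                                    const char *model_dir_parent,
                                    const char *model_tag)
    {
        snprintf(dest, dest_size, "%s/%s", model_dir_parent, model_tag);
    }

    /* e.g. make_model_data_dir(buf, sizeof buf, "models", "kjbible")
       fills buf with "models/kjbible". */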