From: Beate D. <do...@IM...> - 2004-10-28 11:27:46
Hi Mich,

The "-i" option of associate has to be followed by either "w" (words) or
"d" (documents). So the correct way to call associate when using a
document as a query is:

    associate -d -w <model_dir> -c <corpus> -i d <doc_id>

where doc_id is the identifier of one of the documents in the corpus.
Please note that the option "-i d" works only if the document <doc_id> is
part of the corpus that you used for building the model. In that case,
doc_id has to be the offset of the document in the corpus (the number
enclosed in <f></f> in the "wordlist" file).

As far as I understand, however, you would like to compare the corpus
with text files that are not part of the corpus itself. In this case, as
Dominic suggested, you can simply hand over the complete text stream to
associate, i.e.

    associate -d -w <model_dir> -c <corpus> -i w <word1> <word2> <word3> ...

where the <wordi> form the text stream of your query file.

Best,
Beate

On Wed, 27 Oct 2004, Mich wrote:
> Hello Beate, Dominic,
>
> Thank you both for your kind and helpful answers. I've tried what you
> suggest, but it seems the associate command disagrees with me. I have
> a single-file corpus, typically called 'many', and another document
> (a text file in the directory) called 'aow' (from Sun Tzu's Art of
> War). Since the model already works ("associate -t -c many war"
> returns entries such as 'barbaric' and 'pointless' as highly similar),
> I was wondering what it could be.
>
> When I try
> "associate -d -c many -i aow.txt"
> I would have wished the program to return a list of similarity
> indices between the text file and the corpus. Since aow.txt is a
> document within the corpus, ideally the program would return an index
> of 1.000 with one document in the corpus. If I had to guess, Homer's
> Odyssey would score high too, as I see it as remotely similar
> (they are both related to war).
>
> Anyway, when I try this in whatever order, I keep getting as output:
> "Bad option: -i"
>
> When I just leave -i out, it returns
>
> Am I using a different version (I think I have the latest)? If you
> could help me, could you please state the syntax using the example I
> just gave? Thank you.
>
> Dominic: thanks for your suggestions as well. If I could have this
> program in a Windows environment, I could easily program something
> as you suggested, but although I often wish otherwise, I am not a
> programmer. And what experience I do have is totally unrelated to
> Unix environments. However, it seems that cygwin compiles a few .exe
> binary files, among them associate. Do you think it is possible to use
> these from within Windows (cmd)? Is there a particular reason no
> binaries of infomap-nlp were included?
>
> Hopefully I haven't offended anyone by bringing Microsoft software into
> this discussion!
>
> Cheers,
>
> Mich
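
For the second case in Beate's reply (a query file that is not part of
the corpus), the text stream can be built from the file with standard
Unix tools. The following is only a minimal sketch: the helper name
query_words is hypothetical, and lowercasing plus splitting on
non-letters is an assumption that may not match how infomap tokenizes
text when it builds the model. The associate call is shown as a comment,
copied from the message above, because it needs an existing model.

```shell
#!/bin/sh
# query_words FILE: print the words of FILE on one line, space-separated.
# Assumption: lowercase everything and split on any non-letter character;
# infomap's own tokenizer may behave differently.
query_words() {
    tr -cs '[:alpha:]' '\n' < "$1" |   # one word per line
        tr '[:upper:]' '[:lower:]' |   # fold to lowercase
        grep -v '^$' |                 # drop empty lines
        paste -s -d ' ' -              # join the lines with single spaces
}

# Demo on a throwaway file holding a made-up sample sentence.
sample=$(mktemp)
printf 'All warfare is based on deception.\n' > "$sample"
query_words "$sample"
rm -f "$sample"

# With a real model, the word stream would be appended to the call from
# the message above (<model_dir> and <corpus> remain placeholders):
#   associate -d -w <model_dir> -c <corpus> -i w $(query_words aow.txt)
```

For a very long query file the expanded word list may exceed the shell's
limit on command-line length, in which case the file would have to be
shortened first.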