Hi Sverker,
Sorry for not replying sooner; I'm horribly busy (again, long story).
> Dear "infomap-nlp-users" mailing list,
>
> I have some questions:
>
> 1. How do I know which documents have which ID?
I believe there's a print_doc function that will take a document ID and
return the document.
> 2. I need the coordinate vectors for *all* documents and *all* words.
> Is there a simple way to get these data?
I believe Scott implemented something that did this over the summer. I
don't think it's in the released version, but it should be under CVS on
SourceForge. Scott may be able to help more.
> 3. Which is the best way to exclude words with low frequency from the
> model (before the diagonalization)?
I think you set the ROWS variable in /admin/default-params.
> I'm trying to use infomap to visualize relations between documents, in
> the style of Multiple Correspondence Analysis.
Very cool. Let us know if you get interesting results.
Again, apologies for not giving more detailed answers. I hope this
helps somewhat.
Best wishes,
Dominic