From: Dominic W. <dwi...@cs...> - 2004-05-06 15:38:00
> There must be an easier way, but I think not many people will be
> interested in the raw document vectors (or am I wrong)?

Hi Menno,

It sounds like your work-around to get the document vectors is pretty effective, though as you say there should be an easier way.

For word and query vectors there's an "associate -q" option which simply prints out the query vector rather than performing a search. One way I've often used to get document vectors is simply to pass the whole document as an argument to "associate -q". This is pretty unsatisfactory, though it does have the benefit that you can get document vectors for text files that weren't in your original corpus.

If the "associate -q" option were combined with the "associate_doc" function Beate described, this would solve the problem properly, and I could see benefits to making this available (e.g. for work on document clustering). It sounds as though you've already got a workable solution, but if enough other people on the list express an interest we should look into it.

I'm delighted to hear about people using the infomap software as part of a richer and more complex system of features - I'd be interested to hear more about your work whenever you are ready.

Best wishes,
Dominic