From: Rahul J. <rj...@ya...> - 2005-10-26 06:11:24
Hi Dominic,

> because the Infomap software relies on cooccurrence within a
> fixed-width window rather than cooccurrence within a document.

Does "cooccurrence within a window" mean "a window within the same
document"? In other words, is it right that the window doesn't span
documents?

Is the width set using PRE_CONTEXT_SIZE and POST_CONTEXT_SIZE?

Thanks,
Rahul.

--- Dominic Widdows <wi...@ma...> wrote:

> Dear Rahul,
>
> Sorry for not getting back to you sooner.
>
> > Is there a known limit for number of files in a multi-document
> > corpus, expecting optimal performance? Is there a known
> > break-down point?
>
> I know of no maximum number of files for a multi-document corpus,
> as far as the Infomap software is concerned. However, this isn't
> because we've stretched the system and found that it doesn't break,
> it's because we haven't really used it for this very much. I've
> only ever built a couple of multidocument models, and have had
> sporadic reports of this functionality not working at all. If we
> were embarking on a new, well-resourced project, we'd look into
> this straight away.
>
> > Does the uniformity (or lack of it) in the sizes of individual
> > documents in a corpus affect the quality of the model?
>
> It certainly matters much less than in a standard search / LSA
> engine, because the Infomap software relies on cooccurrence within
> a fixed-width window rather than cooccurrence within a document.
>
> This is discussed properly in Ch6 of Geometry and Meaning. There's
> a sketch on the web at
> http://infomap.stanford.edu/book/chapters/chapter6.html
> but of course, finding yourself a copy of the book would be more
> useful ;-)
>
> Best wishes,
> Dominic
>
> _______________________________________________
> infomap-nlp-users mailing list
> inf...@li...
> https://lists.sourceforge.net/lists/listinfo/infomap-nlp-users
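
For readers following the thread, here is a minimal Python sketch of what "cooccurrence within a fixed-width window" could look like. It is not the Infomap implementation. The function name and the parameters pre_context_size and post_context_size are hypothetical, named only after the settings asked about above, and the sketch simply assumes the behaviour the question asks about: each document is processed on its own, so a window never reaches into a neighbouring document.

```python
from collections import Counter


def window_cooccurrences(documents, pre_context_size, post_context_size):
    """Count (target, context) pairs inside a fixed-width window.

    Assumption (not confirmed for Infomap): every document is handled
    separately, so the window is clipped at the start and end of the
    document and never spans two documents.
    """
    counts = Counter()
    for tokens in documents:
        for i, target in enumerate(tokens):
            # Clip the window to the boundaries of the current document.
            start = max(0, i - pre_context_size)
            end = min(len(tokens), i + post_context_size + 1)
            for j in range(start, end):
                if j != i:  # a token is not its own context
                    counts[(target, tokens[j])] += 1
    return counts


# Two toy documents, already tokenized.
docs = [["a", "b", "c"], ["d", "e"]]
counts = window_cooccurrences(docs, pre_context_size=1, post_context_size=1)
```

In this sketch each token contributes at most pre_context_size + post_context_size pairs, however long its document is. That is one way to see why document length would matter less here than in a scheme that counts every pair of words sharing a document.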