From: Dominic W. <wi...@ma...> - 2006-07-15 16:37:46
Thanks David for jumping in.

I heartily encourage and appreciate people taking the Infomap software into their own hands and answering questions, and I'm sure the other original developers from the project at CSLI agree. My latest and greatest excuse for being too busy to devote much attention to maintaining the software is my one-week-old daughter Elinor, who is sitting on my knee as I scribble a frantic message. So my availability isn't going to be increasing any time soon, that's for sure!

I agree with the solutions you posted below; I couldn't have done better myself. It's possible to set COLUMNS in the default-params file, and I don't know of any flag that prevents the number from being greater than 2999. Nor do I know offhand whether the matrix format in memory is scalable enough to handle a full term-by-term matrix. But this is certainly the place to begin looking.

Good luck, and thanks for using the software and contributing any insight.

Best wishes,
Dominic

On Jul 14, 2006, at 9:33 PM, David Hall wrote:

> On 7/14/06, Tonio Wandmacher <ton...@un...> wrote:
>> first of all: Thank you very much for making your software available. It
>> spares me a lot of work!
>> I have two questions concerning the model construction.
>>
>> 1. Is there a switch to allow upper case characters without modifying the
>> code? Even though I have "A-Z" included in valid.chars list, all words are
>> transformed to lower case.
>
> Before I say anything, despite my email address, I'm not affiliated
> with this project, so take everything I say with a grain of salt. That
> said, if you look at
>
> preprocessing/tokenizer.c:249 (in some version...) you'll see a call
> to "tolower", simply remove the call, and that should help you out.
> Devs, if you read this, would there be interest in making this a
> switch? I can try to write up a patch...
>
>> 2. Is there a reason why the maximum number of columns is set to 2999? I
>> would like to try out squared termxterm matrices, as used by Reinhard Rapp
>> (2003), who got excellent results in the TOEFL synonym test.
>> Does anyone have experiences about the optimal number of columns?
>
> Have you looked at the default.params file? Otherwise, I don't know
> exactly what's going on here.
>
> HTH,
> David
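
As an illustration of the case-folding switch raised in the thread, here is a minimal sketch in C. It is not taken from the Infomap source: the function name `normalize_token` and the `fold_case` parameter are hypothetical, and how such a flag would be wired into the real tokenizer or its parameter files is left open. It only shows the idea of making the "tolower" step conditional instead of deleting it.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of Infomap: normalize a token in place.
 * When fold_case is nonzero, letters are lowercased with tolower(), which
 * matches the behaviour described in the thread. When fold_case is zero,
 * the token is left untouched so upper case characters survive. */
void normalize_token(char *token, int fold_case)
{
    if (token == NULL || !fold_case)
        return;

    for (char *p = token; *p != '\0'; ++p) {
        /* Cast to unsigned char first: passing a negative char value
         * to tolower() is undefined behaviour. */
        *p = (char)tolower((unsigned char)*p);
    }
}
```

Called with `fold_case` set to 1, a buffer holding "TOEFL Test" becomes "toefl test"; with `fold_case` set to 0 it is left as is. Keeping the default at 1 would preserve the current behaviour for existing users.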