From: Deane, P. <pd...@et...> - 2005-05-19 21:02:19
Yes, cc'ing this discussion to the infomap list would be excellent. Daniel, could you give them the breakdown on the crashes we've been experiencing?

-----Original Message-----
From: Scott Cederberg [mailto:ced...@gm...]
Sent: Thursday, May 19, 2005 4:59 PM
To: Deane, Paul
Cc: Dominic Widdows; bea...@im...; Zuckerman, Daniel; inf...@li...
Subject: Re: Infomap & 64-bit

Hi Paul,

I'd be happy to be put in touch with Daniel to see if we can replicate the problem for debugging. If you're comfortable carrying on this discussion on the infomap-nlp-users list, let's do that so it can be useful for other people. (I'd been ignoring the lists for a while but am now reading them again.)

It's going to take me a little while to refresh my memory as to how all this stuff works, but I'd like to root out the problem so the software can scale to 64-bit systems in the future.

Scott

On 5/18/05, Deane, Paul <pd...@et...> wrote:
> Here's my original post to the infomap list on this issue. We're
> running Gentoo linux, and one problem we encountered when trying to
> get it to compile was that the paths for ndbm.h didn't match the setup
> on our machine, so we had to create a symbolic link -- so that's a
> possible compatibility issue. However, the crash we get on the 64 bit
> machine's down in svdinterface.
>
> I'm cc'ing the programmer who's been working on this, Daniel
> Zuckerman, to keep him in the loop. Thanks if you're able to help
> here. Ideal would be if Daniel could send you the details of the
> scaleup problems we've had and see if you could replicate them -- we
> had a problem on 32-bit when we tried to increase the size of the
> default SVD analysis, also.
>
> --------------------------------------------------------------------------
>
> Has anyone successfully compiled infomap on 64-bit linux? (AMD 64
> running Debian with 64-bit libraries only).
>
> When I tried to compile 0.8.5 (using the 64-bit version of GCC), I get
> the following error messages during compile:
>
> myutils.c: In function `mymalloc':
> myutils.c:167: error: conflicting types for 'malloc'
> make[3]: *** [myutils.o] Error 1
> make[2]: *** [all-recursive] Error 1
> make[1]: *** [all-recursive] Error 1
>
> When I comment out the offending line, it compiles, but I get memory
> errors when I attempt to run the program, i.e., the end of the log
> looks like this:
>
> ==================================================
> Building target: /export/home/bragi2/pdeane/test/left
> Prerequisites: /export/home/bragi2/pdeane/test/coll /export/home/bragi2/pdeane/test/indx
> Fri Apr 29 09:06:56 EDT 2005
> ..................................................
> cd /export/home/bragi2/pdeane/test && rm svd_diag left \
>     rght sing
> rm: cannot remove `svd_diag': No such file or directory
> rm: cannot remove `left': No such file or directory
> rm: cannot remove `rght': No such file or directory
> rm: cannot remove `sing': No such file or directory
> make: [/export/home/bragi2/pdeane/test/left] Error 1 (ignored)
> cd /export/home/bragi2/pdeane/test && svdinterface \
>     -singvals 100 \
>     -iter 100
>
> This is svdinterface.
>
> Writing to: left
> Writing to: rght
> Writing to: sing
> Writing to: svd_diag
> Reading: indx
> Reading: indx
> Reading: coll
> make: *** [/export/home/bragi2/pdeane/test/left] Error 139
> --------------------------------------------------------------------------
>
> When I check the directory the model was being created in, this is the
> state of things, with the evidence suggesting that something went
> nastily wrong in the svdinterface code that calls the mymalloc
> function:
>
> -rw-r--r-- 1 pdeane nlp 2819650 Apr 29 09:06 coll
> -rw-r--r-- 1 pdeane nlp     374 Apr 29 09:06 corpus_format.bin
> -rw-r--r-- 1 pdeane nlp  157223 Apr 29 09:06 dic
> -rw-r--r-- 1 pdeane nlp    3939 Apr 29 09:06 indx
> -rw-r--r-- 1 pdeane nlp       0 Apr 29 09:06 left
> -rw-r--r-- 1 pdeane nlp      28 Apr 29 09:06 model_info.bin
> -rw-r--r-- 1 pdeane nlp   16396 Apr 29 09:06 model_params.bin
> -rw-r--r-- 1 pdeane nlp       4 Apr 29 09:06 numDocs
> -rw-r--r-- 1 pdeane nlp       0 Apr 29 09:06 rght
> -rw-r--r-- 1 pdeane nlp       0 Apr 29 09:06 sing
> -rw-r--r-- 1 pdeane nlp       0 Apr 29 09:06 svd_diag
> -rw-r--r-- 1 pdeane nlp  540345 Apr 29 09:06 wordlist
>
> I tried changing the defines (e.g., BIGINT, BIGFLOAT) in fixed.h to
> 64-bit values, but this made no difference, and that was the only
> obvious 32-bit dependency I could see.
>
> Interestingly, the original SVDPACKC compiles and runs in my
> environment, so whatever is going on looks like it must be a function
> of the svdinterface code and associated changes. Also, btw, I have had
> no problem running infomap on a 32-bit linux machine or under cygwin
> on a windows machine.
>
> Can anybody help me out here?
>
>
> -----Original Message-----
> From: Dominic Widdows [mailto:dwi...@cs...]
> Sent: Wednesday, May 18, 2005 10:01 AM
> To: Deane, Paul
> Cc: ced...@gm...; bea...@im...
> Subject: Re: Infomap & 64-bit
>
> Hi Scott,
>
> Sorry to bother you during your vacation, but do you have any idea why
> the infomap software may fail to run on a 64-bit machine?
>
> BTW, Paul, I believe there is a "matlab file" output option in the
> infomap code, which outputs the raw counts to a matlab-compatible
> sparse matrix format. This would probably make it easy for you to
> separate the problem into counting co-occurrences and then performing
> the SVD in a separate stage.
> I'm cc'ing Beate Dorow, since she worked on this a few years ago.
>
> Best wishes,
> Dominic
>
> On Wed, 18 May 2005, Deane, Paul wrote:
>
> > Hi, seeing your post today on the infomap mailing list prompts me to
> > repeat a query.
> >
> > I've been trying to get infomap working on a large 64-bit machine
> > running Linux, and my post to this list outlined my problems. Reason
> > I'm doing this is I want to run analyses on some very large corpora,
> > and infomap doesn't seem to scale up well to these very large
> > corpora when I run them on our 32-bit machines ... something always
> > breaks and we get memory errors when we increase the number of words
> > we want to cover and/or the size of the corpus up toward our target,
> > which is 100,000 words of vocabulary on a 500-million-word corpus.
> >
> > Is there a chance of getting any help here? If we can't get past the
> > scaleup problems, we'll have to do the analysis outside of infomap,
> > i.e., compile the counts and do the SVD independently.
> >
> > Thanks,
> >
> > Paul Deane
> >
> > **************************************************************************
> > This e-mail and any files transmitted with it may contain
> > privileged or confidential information. It is solely for use by the
> > individual for whom it is intended, even if addressed incorrectly.
> > If you received this e-mail in error, please notify the sender; do
> > not disclose, copy, distribute, or take any action in reliance on
> > the contents of this information; and delete it from your system.
> > Any other use of this e-mail is prohibited. Thank you for your compliance.
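
A note on the two symptoms quoted in the original post, offered as a sketch rather than a diagnosis. The "Error 139" that make reports is exit status 128 + 11, which means svdinterface was killed by signal 11 (SIGSEGV). The "conflicting types for 'malloc'" message is what GCC typically prints when a source file declares malloc itself in a way that disagrees with the standard prototype. If line 167 of myutils.c is such a local declaration (an assumption; the file is not shown in this thread), then commenting it out without including <stdlib.h> leaves malloc implicitly declared as returning int. On 64-bit Linux, where int is 32 bits and pointers are 64 bits, the returned address can then be truncated, which would be consistent with a crash at the first large allocation. The wrapper below shows the usual shape of the fix. The names checked_malloc and pointer_fits_in_int are hypothetical and are not taken from the infomap source.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>  /* supplies the real prototype: void *malloc(size_t) */

/* Hypothetical stand-in for the mymalloc wrapper named in the compiler
 * error. It relies on the prototype from <stdlib.h> instead of declaring
 * malloc locally, and it takes the size as size_t so that large requests
 * are not narrowed to a 32-bit type on a 64-bit system. */
void *checked_malloc(size_t nbytes, const char *what)
{
    void *p;

    assert(what != NULL);
    p = malloc(nbytes);
    if (p == NULL && nbytes != 0) {
        /* Fail loudly here rather than crash later on a null pointer. */
        fprintf(stderr, "out of memory allocating %s (%zu bytes)\n",
                what, nbytes);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Returns 1 if a pointer fits in an int on this platform, 0 otherwise.
 * Where it returns 0, an implicitly declared malloc (assumed to return
 * int) cannot carry a full pointer value back to the caller. */
int pointer_fits_in_int(void)
{
    return sizeof(void *) <= sizeof(int);
}
```

A caller would then write something like `double *v = checked_malloc(n * sizeof *v, "singular values");`. Building with `gcc -Wall` also makes the compiler warn about any remaining implicit declarations, which is a cheap way to check whether other files have the same problem.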