From: Andrew P. <ap...@ro...> - 2005-09-27 16:41:36
Greg,

Yes, using bjs, I've now added the (-s numberofseconds) argument and
that did it, thank you very much for your help!

- Andrew

On Sep 27, 2005, at 11:50 AM, Greg Watson wrote:

> Are you using bjs to allocate the nodes? The default allocation
> time is 1 second I think.
>
> Greg
>
> On Sep 27, 2005, at 8:34 AM, Andrew Pitre wrote:
>
>> I'm having trouble getting mpi programs to execute for >= 1 sec.
>> I have a simple program that loops, prints the execution time then
>> quits.
>>
>> When the loop count is increased to where the execution time is
>> greater than or about 1 sec, the program fails with the following
>> messages:
>> "mpirun: error: child process (rank=0; node=0) exited abnormally.
>> mpirun: error: aborting."
>>
>> Replacing the loop with a sleep() statement has a similar effect,
>> processes can sleep for any amount of time < 1 sec, e.g.
>> sleep(.999999) is ok, but if sleep(1) is called the program fails
>> with the above error.
>>
>> I've tried adjusting the pingtimeout with settings 30, 3000, and
>> 30000, without success.
>>
>> The environment is Clustermatic 5 with a custom compiled 2.6.9
>> kernel and bproc4.0.0pre8 module on Opteron processors. This
>> problem does not appear on a LAM based non-bproc cluster with the
>> same source code.
>>
>> Any help with this will be greatly appreciated.
>>
>> - Andrew
>>
>> _______________________________________________
>> BProc-users mailing list
>> BPr...@li...
>> https://lists.sourceforge.net/lists/listinfo/bproc-users