From: Joshua A. <lu...@ln...> - 2005-10-21 16:04:32
On Fri, 2005-10-21 at 03:20 -0400, Shobana Ravi wrote:
> I am working on the Nimbus 4 system based on Clustermatic on a 15 node
> cluster of Opterons. I am having intermittent problems running MPI
> applications on this.
>
> While trying to run the ASCI benchmark sPPM, 4 times out of 5, I get
> errors that look like this:
>
> p3_16410: p4_error: interrupt SIGSEGV: 11
>
>
> If anybody could help me with this problem, or give me pointers as to what
> to explore/debug, I would greatly appreciate it.

Build your app with debugging symbols, get the task on the node to dump a core file when it crashes, copy the core file back to the host node and analyze using gdb.
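
For the "dump a core file" step, one possible approach is to have the application raise its own core-size limit at startup, since the limit a task inherits on a compute node may be zero. The following is only a rough sketch: enable_core_dumps is a hypothetical helper name (not part of sPPM, MPI, or Clustermatic), and it assumes the node's hard limit is non-zero and that the task's working directory on the node is writable.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not part of sPPM or any MPI library):
 * raise the soft core-file size limit to the hard limit so the
 * kernel is allowed to write a core file when the task crashes.
 * Returns 0 on success, -1 on failure.
 */
int enable_core_dumps(void)
{
    struct rlimit rl;

    /* Read the current soft and hard limits for core files. */
    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit(RLIMIT_CORE)");
        return -1;
    }

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;

    if (setrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("setrlimit(RLIMIT_CORE)");
        return -1;
    }
    return 0;
}
```

The call would go at the very start of main(), before MPI_Init. With the application compiled with -g, a resulting core file copied back to the host node can then be opened with "gdb <executable> <corefile>" and inspected with the bt command. If the hard limit on the node is itself zero, this sketch will not help and the limit has to be raised in whatever environment launches the task.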