I just started playing with pyMPI and found that the following issue makes debugging inconvenient: if one pyMPI task raises an exception and terminates early, it's easy to get a deadlock (e.g. if another task calls MPI_Recv expecting a matching send from the task that raised the exception).
What do you guys think of this patch, which calls MPI_Abort() instead of MPI_Finalize() if the Python interpreter exits on an uncaught exception (or on a call to sys.exit() with a nonzero argument)?
--- pyMPI-2.5b0/pyMPI_main.c 2006-10-30 18:50:22.000000000 +0000
+++ pyMPI-2.5b0-kms/pyMPI_main.c 2009-03-22 14:54:06.000000000 +0000
@@ -71,6 +71,10 @@
 #endif
   if ( Python_owns_MPI && !finalized ) {
+    if (status) {
+      fprintf(stderr, "pyMPI: exception raised, or sys.exit() called with nonzero argument: calling MPI_Abort()\n");
+      MPI_Abort(MPI_COMM_WORLD, status);
+    }
     MPI_Finalize();
   }
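
For anyone who wants to check which exits would trip the new `if (status)` test, here is a rough sketch. It is only an illustration under some assumptions: it runs a stock CPython interpreter (Python 3 syntax) rather than pyMPI, and it assumes the `status` seen in pyMPI_main.c matches the interpreter's process exit status. The names `exit_status`, `would_abort` and `CASES` are made up for this example and are not part of pyMPI.

```python
import subprocess
import sys


def exit_status(snippet):
    """Run `snippet` in a fresh interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,  # hide the traceback of the failing case
    )
    return result.returncode


def would_abort(status):
    """Mirror the patch's `if (status)` test: any nonzero status aborts."""
    return status != 0


# Hypothetical scenarios; each string is a tiny stand-in for one task's script.
CASES = {
    "clean exit": "pass",
    "uncaught exception": "raise RuntimeError('boom')",
    "sys.exit(3)": "import sys; sys.exit(3)",
    "sys.exit(0)": "import sys; sys.exit(0)",
}

if __name__ == "__main__":
    for label, snippet in CASES.items():
        status = exit_status(snippet)
        action = "MPI_Abort" if would_abort(status) else "MPI_Finalize"
        print(f"{label}: status {status} -> {action}")
```

With plain CPython, an uncaught exception gives status 1 and sys.exit(n) gives n, so both would take the MPI_Abort() branch, while a clean exit or sys.exit(0) would still go through MPI_Finalize().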
This sounds good... I already do something similar for SEGV to get a cleaner shutdown.
This is easy to put in and should help everyone.
Pat
It's in CVS now... it will make it into the next release.