Well... it has to do with the craziness of me swapping and unswapping MPI communicators to get libMesh to work on sub-communicators ;-)

When I swap on one processor, it sets both COMM_WORLD and CommWorld to the sub-communicator.  When I swap back, it sets them both to what COMM_WORLD was before the swap, so on any processor that swapped and swapped back, COMM_WORLD matches CommWorld.

If a processor didn't do any sub-solve, it never swaps at all, so it still has a mismatched COMM_WORLD and CommWorld.  Now the CommWorlds on processors that swapped don't match the CommWorlds on processors that didn't swap, and the next thing done using CommWorld will hang.
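
If it helps, here's a rough toy model of that sequence.  It uses no MPI at all: the "communicators" are plain integers, and every name in it (RankState, swap_in, swap_back, run_scenario, the k* constants) is made up for illustration, not taken from libMesh.  It also assumes that applying the one-line fix amounts to making CommWorld start out equal to COMM_WORLD on every processor.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a communicator handle; NOT a real MPI_Comm.
using CommHandle = int;

// Hypothetical handle values, chosen only so they can be told apart.
constexpr CommHandle kRawWorld = 1;      // what COMM_WORLD holds at startup
constexpr CommHandle kWrappedWorld = 2;  // what CommWorld holds at startup (mismatched)
constexpr CommHandle kSubComm = 3;       // the sub-communicator used for a sub-solve

struct RankState {
  CommHandle comm_world = kRawWorld;              // models COMM_WORLD
  CommHandle comm_world_wrapper = kWrappedWorld;  // models CommWorld
};

// Swap: point both handles at the sub-communicator and
// return the previous COMM_WORLD so it can be restored later.
CommHandle swap_in(RankState& rank, CommHandle sub) {
  const CommHandle saved = rank.comm_world;
  rank.comm_world = sub;
  rank.comm_world_wrapper = sub;
  return saved;
}

// Swap back: BOTH handles are set to the saved COMM_WORLD,
// so a rank that swapped ends up with matching handles.
void swap_back(RankState& rank, CommHandle saved) {
  rank.comm_world = saved;
  rank.comm_world_wrapper = saved;
}

// True if every rank holds the same CommWorld handle, i.e. a
// collective over CommWorld would involve one communicator.
bool wrappers_agree(const std::vector<RankState>& ranks) {
  for (const RankState& rank : ranks) {
    if (rank.comm_world_wrapper != ranks.front().comm_world_wrapper) {
      return false;
    }
  }
  return true;
}

// Only ranks flagged true do a sub-solve (swap, then swap back).
// apply_fix models the assumed effect of the one-line fix:
// CommWorld starts out equal to COMM_WORLD everywhere.
std::vector<RankState> run_scenario(const std::vector<bool>& does_sub_solve,
                                    bool apply_fix) {
  std::vector<RankState> ranks(does_sub_solve.size());
  if (apply_fix) {
    for (RankState& rank : ranks) {
      rank.comm_world_wrapper = rank.comm_world;
    }
  }
  for (std::size_t i = 0; i < ranks.size(); ++i) {
    if (does_sub_solve[i]) {
      const CommHandle saved = swap_in(ranks[i], kSubComm);
      swap_back(ranks[i], saved);
    }
  }
  return ranks;
}
```

With three ranks where only the first two do a sub-solve, wrappers_agree() on the result of run_scenario() comes back false without the fix and true with it.  In the real code the "false" case is the one where the next operation on CommWorld hangs.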

Thoroughly confused?  This is exactly why Ben's branch is so damn important ;-)


On Thu, Apr 4, 2013 at 3:17 PM, Roy Stogner <roystgnr@ices.utexas.edu> wrote:

On Thu, 4 Apr 2013, Derek Gaston wrote:

This has actually caused a bug that I've been trying to track down... just switching that last line to:

      Parallel::Communicator_World = libMesh::COMM_WORLD;

How'd that bug manifest?  You had some processors trying to
participate in a communication via COMM_WORLD and others trying to
participate in the same communication via CommWorld?

Go ahead and commit that fix to master...