#8 OpenMPI restart broken with trunk

Status: closed-fixed
Owner: Kapil Arya
Labels: None
Updated: 2010-09-28
Created: 2010-09-14
Creator: Kapil Arya
Private: No

The orterun process dies with the following error message:

[11531] ERROR at dmtcpmessagetypes.cpp:62 in assertValid; REASON='JASSERT(strcmp ( DMTCP_MAGIC_STRING,_magicBits ) == 0) failed'
_magicBits = mtcp_restart_nolibc: mapping /tmp/openmpi-sessions-kapil@thinkpad_0/20682/1/shared_mem_pool.thinkpad with data from ckpt image
mtcp_restart_nolibc: mapping /tmp/openmpi-sessions-kapil@thinkpad_0/20682/1/shared_mem_btl_module.thinkpad with data from ckpt image
DMTCP_CKPT_V0

Message: read invalid message, _magicBits mismatch. Did DMTCP coordinator die uncleanly?
orterun (11531): Terminating...

It looks like stderr got corrupted somehow: mtcp_restart_nolibc output appears in the middle of the printed _magicBits value.
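
For illustration only, here is a minimal sketch of how a magic-string check like the one in the JASSERT above behaves when stray text lands ahead of a message in the byte stream. It assumes one possible failure mode (log output written into the stream the reader decodes); the ticket does not confirm that exact mechanism. The names Message, kMagic, encode, decode and isValid are hypothetical and are not the actual DMTCP definitions.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-ins for illustration; not the real DMTCP types.
static const char kMagic[] = "DMTCP_CKPT_V0";

struct Message {
  char magicBits[16];  // expected to hold kMagic, NUL-terminated
  int type;
};

// Same idea as the failing assertion: the header is valid only if the
// magic field matches the expected string exactly.
bool isValid(const Message& m) {
  // Reject a field with no terminator before handing it to strcmp.
  if (std::memchr(m.magicBits, '\0', sizeof m.magicBits) == nullptr) {
    return false;
  }
  return std::strcmp(kMagic, m.magicBits) == 0;
}

// Serialize a well-formed message into a byte string.
std::string encode(int type) {
  Message m;
  std::memset(&m, 0, sizeof m);
  std::strcpy(m.magicBits, kMagic);  // 13 characters plus NUL fit in 16
  m.type = type;
  return std::string(reinterpret_cast<const char*>(&m), sizeof m);
}

// Copy one message out of the stream starting at `offset`.
// Returns false if fewer than sizeof(Message) bytes remain.
bool decode(const std::string& stream, std::size_t offset, Message* out) {
  if (offset > stream.size() || stream.size() - offset < sizeof(Message)) {
    return false;
  }
  std::memcpy(out, stream.data() + offset, sizeof(Message));
  return true;
}
```

Decoding the output of encode() directly yields a message that passes isValid(). If any log text is prepended to the same stream, the reader picks up the first bytes of that text as the magic field and the check fails, which is the kind of mismatch the assertion is designed to catch.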

Discussion

  • Kapil Arya
    2010-09-28

    Fixed in revision 665. The problem was related to the STDERR file descriptor.

     
  • Kapil Arya
    2010-09-28

    • status: open --> closed-fixed