From: David D. <dd...@pl...> - 2013-09-18 19:10:22
Hi Rohan,

I tried with

    svn co svn://svn.code.sf.net/p/dmtcp/code/trunk dmtcp-trunk

and got different types of errors with different checkpoints. Here are a few of them. If you think it would help, I can host a WebEx so we can look at it interactively.

Synopsys VCS:

[95000] ERROR at fileconnection.cpp:693 in refill; REASON='JASSERT(jalib::Filesystem::FileExists(_path)) failed'
     _path = /proc/self/exe
Message: File not found.
dpi_sim_tb (95000): Terminating...

QEMU:

[96000] WARNING at jsocket.cpp:289 in readAll; REASON='JWARNING(cnt>=0) failed'
     sockfd() = 12
     cnt = -1
     len = 128
     (strerror((*__errno_location ()))) = Connection reset by peer
Message: JSocket read failure
[96000] ERROR at connectionidentifier.h:96 in assertValid; REASON='JASSERT(strcmp(sign, HANDSHAKE_SIGNATURE_MSG) == 0) failed'
     sign =
Message: read invalid message, signature mismatch. (External socket?)
qemu-system-i386 (96000): Terminating...

David Davies
ASIC Verification Project Manager, PLX Technology, Inc.
408-962-3474

-----Original Message-----
From: David Davies
Sent: Tuesday, September 17, 2013 4:37 PM
To: 'Rohan Garg'
Cc: dmt...@li...; Kapil Arya
Subject: RE: [Dmtcp-forum] DMTCP version 1.2.8 issue

Hi Rohan,

Thanks for the assistance, and sorry for the delayed response.

No, it is not the exact same QEMU error every time with different checkpoint images. For example, I also see this one: "qemu: qemu_cond_wait: Operation not permitted".

I'll try the version Gene suggested at

    svn co svn://svn.code.sf.net/p/dmtcp/code/trunk dmtcp-trunk

and let you know.

[root@demeter qemu-1.2.0]# /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart -j ckpt_qemu-system-i386_1d4a8584596cf84-17103-5238e56e.dmtcp
dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and
Gene Cooperman
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see COPYING file for details.
(Use flag "-q" to hide this message.)
[17103] mtcp_restart_nolibc.c:160 mtcp_restoreverything: error: new/current break (0x60C000) != saved break (0x7FFB36621000)
qemu: qemu_cond_wait: Operation not permitted
Abort
[root@demeter qemu-1.2.0]#

David Davies
ASIC Verification Project Manager, PLX Technology, Inc.
408-962-3474

-----Original Message-----
From: Rohan Garg [mailto:ro...@cc...]
Sent: Thursday, September 12, 2013 11:31 PM
To: David Davies
Cc: dmt...@li...; Kapil Arya
Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue

The exit that you are seeing now is caused by QEMU. It could be that DMTCP is not restoring the state of a timer. I'm trying to reproduce the issue here.

Do you see the same issue every time, that is, with different checkpoint images?

----- Original Message -----
From: "David Davies" <dd...@pl...>
To: "Rohan" <ro...@cc...>
Cc: dmt...@li..., "Kapil Arya" <ka...@cc...>
Sent: Thursday, September 12, 2013 4:09:09 PM GMT -05:00 US/Canada Eastern
Subject: RE: [Dmtcp-forum] DMTCP version 1.2.8 issue

Hi Rohan,

Thanks for the quick response. After making the change and recompiling, it hits this issue when restarting.

[root@demeter qemu-1.2.0]# /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart ckpt_qemu-system-i386_1d4a8584596cf84-25076-5232172e.dmtcp
dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and
Gene Cooperman
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see COPYING file for details.
(Use flag "-q" to hide this message.)

dmtcp_coordinator starting...
    Port: 7779
    Checkpoint Interval: disabled (checkpoint manually instead)
    Exit on last client: 1
Backgrounding...
[25076] mtcp_restart_nolibc.c:160 mtcp_restoreverything: error: new/current break (0x60C000) != saved break (0x7FE932274000)
gettime: Invalid argument
Internal timer error: aborting
[root@demeter qemu-1.2.0]#

David Davies
ASIC Verification Project Manager, PLX Technology, Inc.
408-962-3474

-----Original Message-----
From: Rohan [mailto:ro...@cc...]
Sent: Thursday, September 12, 2013 11:37 AM
To: David Davies
Cc: dmt...@li...; Rohan Garg; Kapil Arya
Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue

Hi David,

Could you please comment out the following line in $DMTCP_SRC_DIR/mtcp/mtcp_restart_nolibc.c, recompile, and test:

157   else {
158     if (new_brk == current_brk)
159       MTCP_PRINTF("error: new/current break (%p) != saved break (%p)\n",
160                   current_brk, mtcp_saved_break);
161     else
162       MTCP_PRINTF("error: new break (%p) != current break (%p)\n",
163                   new_brk, current_brk);
164     // mtcp_abort ();   /* COMMENT THIS LINE */
165   }

This should fix the problem without affecting other functionality.

Thanks,
Rohan

On Thu, Sep 12, 2013 at 11:06:38AM -0400, Kapil Arya wrote:
> Hi Rohan,
>
> Can you take a quick look and see what is going on?
>
> thanks,
> Kapil
>
>
> On Thu, Sep 12, 2013 at 10:51 AM, David Davies <dd...@pl...> wrote:
>
> > Hi,
> >
> > I'm trying to checkpoint and restart a Synopsys VCS simulation that
> > runs concurrently with a virtual machine (QEMU); the two communicate
> > over TCP sockets.
> >
> > I can successfully checkpoint and restart the Synopsys VCS
> > simulation alone, but not the QEMU.
> > The restart of QEMU gives the following:
> >
> > [root@demeter qemu-1.2.0]# /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart ckpt_qemu-system-i386_1d4a8584596cf84-15246-5231ce7f.dmtcp
> > dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
> > Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and
> > Gene Cooperman
> > This program comes with ABSOLUTELY NO WARRANTY.
> > This is free software, and you are welcome to redistribute it
> > under certain conditions; see COPYING file for details.
> > (Use flag "-q" to hide this message.)
> >
> > [15246] mtcp_restart_nolibc.c:160 mtcp_restoreverything: error: new/current break (0x60C000) != saved break (0x7F5F3751D000)
> > Segmentation fault
> > [root@demeter qemu-1.2.0]#
> >
> > Machine details:
> > ---------------------
> > [root@demeter qemu-1.2.0]# uname -a
> > Linux demeter 2.6.32-220.el6.x86_64 #1 SMP Tue Dec 6 19:48:22 GMT 2011 x86_64 x86_64 x86_64 GNU/Linux
> >
> > [root@demeter qemu-1.2.0]# cat /etc/redhat-release
> > CentOS release 6.2 (Final)
> >
> > Any advice would be greatly appreciated.
> >
> > David Davies
> > ASIC Verification Project Manager, PLX Technology, Inc.
> > 408-962-3474
> >
> > _______________________________________________
> > Dmtcp-forum mailing list
> > Dmt...@li...
> > https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
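
For readers following the thread: the messages printed from mtcp_restart_nolibc.c report a mismatch between the program break seen at restart and the break saved at checkpoint time. The sketch below only illustrates that kind of comparison on Linux/glibc. It is not the MTCP code, and the helper name break_matches is hypothetical; it assumes sbrk(0) is an acceptable way to query the current break.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() in <unistd.h> on glibc */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of MTCP: returns 1 if the current
 * program break equals a previously recorded value, 0 otherwise. */
static int break_matches(void *saved_break)
{
    void *current_break = sbrk(0);   /* sbrk(0) queries the break without moving it */

    if (current_break != saved_break) {
        fprintf(stderr, "current break (%p) != saved break (%p)\n",
                current_break, saved_break);
        return 0;
    }
    return 1;
}
```

A caller would record sbrk(0) at the point of interest, pass that value to break_matches() later, and decide for itself whether a mismatch is fatal. That decision is what commenting out mtcp_abort() in the snippet quoted earlier in the thread changes.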