From: David D. <dd...@pl...> - 2013-09-20 18:24:34
That works fine. I'll send the webex info after lunch.

David Davies
ASIC Verification Project Manager, PLX Technology, Inc.
408-962-3474

-----Original Message-----
From: Rohan [mailto:ro...@cc...]
Sent: Friday, September 20, 2013 9:38 AM
To: David Davies
Cc: dmt...@li...; Kapil Arya
Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue

How about 3-4 PM today; does that work for you?

Thanks,
Rohan

On Fri, Sep 20, 2013 at 12:23:48AM +0000, David Davies wrote:
> Hi Rohan,
>
> Anytime Friday afternoon is ok too. Please suggest a time so I can set up a webex meeting. Thanks.
>
> David Davies
> ASIC Verification Project Manager, PLX Technology, Inc.
> 408-962-3474
>
>
> -----Original Message-----
> From: David Davies
> Sent: Wednesday, September 18, 2013 4:22 PM
> To: 'Rohan'
> Cc: dmt...@li...; Kapil Arya
> Subject: RE: [Dmtcp-forum] DMTCP version 1.2.8 issue
>
> Thanks. Anytime tomorrow is fine.
>
> David Davies
> ASIC Verification Project Manager, PLX Technology, Inc.
> 408-962-3474
>
>
> -----Original Message-----
> From: Rohan [mailto:ro...@cc...]
> Sent: Wednesday, September 18, 2013 3:04 PM
> To: David Davies
> Cc: dmt...@li...; Kapil Arya
> Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue
>
> Hi David,
>
> I have been trying to reproduce the "gettime" error on my system here but I haven't encountered that, with the trunk or with dmtcp-1.2.8.
>
> The "qemu_cond_wait" error is a known race condition. It has been a particularly hard one to keep track of. If it is occurring frequently for you we should look at it.
>
> I think a webex session would be more efficient. We can have a session tomorrow before afternoon, or Friday late in the afternoon.
>
> Thanks,
> Rohan
>
> On Wed, Sep 18, 2013 at 07:10:08PM +0000, David Davies wrote:
> > Hi Rohan,
> >
> > I tried with svn co svn://svn.code.sf.net/p/dmtcp/code/trunk dmtcp-trunk and got different types of errors with different checkpoints. Here are a few of them.
> > If you think it would help, I can host a webex and we can see it interactively?
> >
> > Synopsys VCS:
> > [95000] ERROR at fileconnection.cpp:693 in refill; REASON='JASSERT(jalib::Filesystem::FileExists(_path)) failed'
> > _path = /proc/self/exe
> > Message: File not found.
> > dpi_sim_tb (95000): Terminating...
> >
> > QEMU:
> > [96000] WARNING at jsocket.cpp:289 in readAll; REASON='JWARNING(cnt>=0) failed'
> > sockfd() = 12
> > cnt = -1
> > len = 128
> > (strerror((*__errno_location ()))) = Connection reset by peer
> > Message: JSocket read failure
> > [96000] ERROR at connectionidentifier.h:96 in assertValid; REASON='JASSERT(strcmp(sign, HANDSHAKE_SIGNATURE_MSG) == 0) failed'
> > sign =
> > Message: read invalid message, signature mismatch. (External socket?)
> > qemu-system-i386 (96000): Terminating...
> >
> >
> > David Davies
> > ASIC Verification Project Manager, PLX Technology, Inc.
> > 408-962-3474
> >
> >
> > -----Original Message-----
> > From: David Davies
> > Sent: Tuesday, September 17, 2013 4:37 PM
> > To: 'Rohan Garg'
> > Cc: dmt...@li...; Kapil Arya
> > Subject: RE: [Dmtcp-forum] DMTCP version 1.2.8 issue
> >
> > Hi Rohan,
> >
> > Thanks for assistance and sorry for the delayed response.
> > No, it is not the exact same QEMU error every time with different checkpoint images. For example, I also see this type "qemu: qemu_cond_wait: Operation not permitted".
> > I'll try the version Gene suggested at svn co svn://svn.code.sf.net/p/dmtcp/code/trunk dmtcp-trunk and let you know.
> >
> > [root@demeter qemu-1.2.0]# /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart -j ckpt_qemu-system-i386_1d4a8584596cf84-17103-5238e56e.dmtcp
> > dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
> > Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and Gene Cooperman
> > This program comes with ABSOLUTELY NO WARRANTY.
> > This is free software, and you are welcome to redistribute it under certain conditions; see COPYING file for details.
> > (Use flag "-q" to hide this message.)
> >
> > [17103] mtcp_restart_nolibc.c:160 mtcp_restoreverything:
> > error: new/current break (0x60C000) != saved break (0x7FFB36621000)
> > qemu: qemu_cond_wait: Operation not permitted
> > Abort
> > [root@demeter qemu-1.2.0]#
> >
> >
> > David Davies
> > ASIC Verification Project Manager, PLX Technology, Inc.
> > 408-962-3474
> >
> >
> > -----Original Message-----
> > From: Rohan Garg [mailto:ro...@cc...]
> > Sent: Thursday, September 12, 2013 11:31 PM
> > To: David Davies
> > Cc: dmt...@li...; Kapil Arya
> > Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue
> >
> > The exit that you are seeing now is caused by QEMU. It could be that DMTCP is not restoring the state of a timer. I'm trying to reproduce the issue here.
> >
> > Do you see the same issue every time, that is, with different checkpoint images?
> >
> > ----- Original Message -----
> > From: "David Davies" <dd...@pl...>
> > To: "Rohan" <ro...@cc...>
> > Cc: dmt...@li..., "Kapil Arya" <ka...@cc...>
> > Sent: Thursday, September 12, 2013 4:09:09 PM GMT -05:00 US/Canada Eastern
> > Subject: RE: [Dmtcp-forum] DMTCP version 1.2.8 issue
> >
> > Hi Rohan,
> >
> > Thanks for the quick response. After the change and re-compile, it hits this issue when restarting.
> >
> > [root@demeter qemu-1.2.0]# /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart ckpt_qemu-system-i386_1d4a8584596cf84-25076-5232172e.dmtcp
> > dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
> > Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and Gene Cooperman
> > This program comes with ABSOLUTELY NO WARRANTY.
> > This is free software, and you are welcome to redistribute it under certain conditions; see COPYING file for details.
> > (Use flag "-q" to hide this message.)
> >
> > dmtcp_coordinator starting...
> > Port: 7779
> > Checkpoint Interval: disabled (checkpoint manually instead)
> > Exit on last client: 1
> > Backgrounding...
> > [25076] mtcp_restart_nolibc.c:160 mtcp_restoreverything:
> > error: new/current break (0x60C000) != saved break (0x7FE932274000)
> > gettime: Invalid argument
> > Internal timer error: aborting
> > [root@demeter qemu-1.2.0]#
> >
> > David Davies
> > ASIC Verification Project Manager, PLX Technology, Inc.
> > 408-962-3474
> >
> >
> > -----Original Message-----
> > From: Rohan [mailto:ro...@cc...]
> > Sent: Thursday, September 12, 2013 11:37 AM
> > To: David Davies
> > Cc: dmt...@li...; Rohan Garg; Kapil Arya
> > Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue
> >
> > Hi David,
> >
> > Could you please comment out the following line in $DMTCP_SRC_DIR/mtcp/mtcp_restart_nolibc.c, re-compile, and test:
> >
> > 157   else {
> > 158     if (new_brk == current_brk)
> > 159       MTCP_PRINTF("error: new/current break (%p) != saved break (%p)\n",
> > 160                   current_brk, mtcp_saved_break);
> > 161     else
> > 162       MTCP_PRINTF("error: new break (%p) != current break (%p)\n",
> > 163                   new_brk, current_brk);
> > 164     // mtcp_abort ();   /* COMMENT THIS LINE */
> > 165   }
> >
> > This should fix the problem without affecting other functionality.
> >
> > Thanks,
> > Rohan
> >
> > On Thu, Sep 12, 2013 at 11:06:38AM -0400, Kapil Arya wrote:
> > > Hi Rohan,
> > >
> > > Can you take a quick look and see what is going on?
> > >
> > > thanks,
> > > Kapil
> > >
> > >
> > > On Thu, Sep 12, 2013 at 10:51 AM, David Davies <dd...@pl...> wrote:
> > >
> > > > Hi,
> > > >
> > > > I’m trying to checkpoint and restart a Synopsys VCS simulation that runs concurrently with a virtual machine (QEMU); the two communicate over TCP sockets.
> > > > I can successfully checkpoint and restart the Synopsys VCS simulation alone, but not the QEMU.
> > > > The restart of QEMU gives the following:
> > > >
> > > > [root@demeter qemu-1.2.0]# /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart ckpt_qemu-system-i386_1d4a8584596cf84-15246-5231ce7f.dmtcp
> > > > dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
> > > > Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and Gene Cooperman
> > > > This program comes with ABSOLUTELY NO WARRANTY.
> > > > This is free software, and you are welcome to redistribute it
> > > > under certain conditions; see COPYING file for details.
> > > > (Use flag "-q" to hide this message.)
> > > >
> > > > [15246] mtcp_restart_nolibc.c:160 mtcp_restoreverything:
> > > > error: new/current break (0x60C000) != saved break (0x7F5F3751D000)
> > > > Segmentation fault
> > > > [root@demeter qemu-1.2.0]#
> > > >
> > > > Machine details:
> > > > ---------------------
> > > > [root@demeter qemu-1.2.0]# uname -a
> > > > Linux demeter 2.6.32-220.el6.x86_64 #1 SMP Tue Dec 6 19:48:22 GMT 2011 x86_64 x86_64 x86_64 GNU/Linux
> > > >
> > > > [root@demeter qemu-1.2.0]# cat /etc/redhat-release
> > > > CentOS release 6.2 (Final)
> > > >
> > > > Any advice would be greatly appreciated.
> > > >
> > > > David Davies
> > > > ASIC Verification Project Manager, PLX Technology, Inc.
> > > > 408-962-3474
> > > >
> > > > _______________________________________________
> > > > Dmtcp-forum mailing list
> > > > Dmt...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
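
For readers following the thread: the recurring "new/current break ... != saved break" message reports that the program break seen at restart differs from a previously saved value. The snippet below is only a minimal sketch of that kind of comparison, not DMTCP/MTCP code. `break_matches` is a hypothetical helper, and the sketch assumes a Linux/glibc system where `sbrk(0)` returns the current program break.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() in glibc's <unistd.h> */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DMTCP/MTCP): compare the process's
 * current program break with a previously saved value.
 * Returns 1 when they match, 0 otherwise (after printing a warning).
 */
int break_matches(void *saved_break)
{
    assert(saved_break != NULL);   /* a saved break must have been recorded */

    void *current_brk = sbrk(0);   /* sbrk(0) queries the break without moving it */
    if (current_brk != saved_break) {
        fprintf(stderr, "warning: current break (%p) != saved break (%p)\n",
                current_brk, saved_break);
        return 0;
    }
    return 1;
}
```

A caller would record `sbrk(0)` at one point and pass that value to `break_matches` later. Whether a mismatch should abort the process or only print a warning is the choice discussed in the thread above.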