From: Phan, L. H (3443) <Lin...@jp...> - 2012-10-23 03:29:49
Hi Kapil,

I realized I had "handle SIGUSR2 nopass noprint" set. Now that I have set "handle SIGUSR2 noprint nostop pass", I can checkpoint, but only once; on the second checkpoint request, a.out just continues running without checkpointing, i.e.:

1. dmtcp_restart ckpt_a.out_c273c1-96000-50860122.dmtcp
2. gdb a.out 6676
3. dmtcp_command -c --quiet (rewrites ckpt_a.out_c273c1-96000-50860122.dmtcp)
4. dmtcp_command -c --quiet (does NOT rewrite ckpt_a.out_c273c1-96000-50860122.dmtcp)
5. dmtcp_command -c --quiet
   Error, computation not in running state. Either a checkpoint is currently happening or there are no connected processes.

This only happens when I attach with gdb. Without gdb attached, it always checkpoints when commanded to. Is there something I'm doing wrong?

Thank you,
Linh

From: Kapil Arya [mailto:ka...@cc...]
Sent: Monday, October 22, 2012 2:10 PM
To: Phan, Linh H (3443)
Cc: dmt...@li...
Subject: Re: [Dmtcp-forum] Can't checkpoint a "gdb attach" process when started using dmtcp_restart

I see. Did gdb print something about SIGUSR2? Either use "handle SIGUSR2 noprint nostop pass" to tell gdb to pass the SIGUSR2 signal to the inferior without stopping or printing, or manually type "signal SIGUSR2" whenever gdb prints the information about SIGUSR2. I hope that helps.

If gdb is not stopping due to SIGUSR2, try pressing "Ctrl-C" to get the gdb prompt and then check the backtrace of the inferior.

Kapil

On Mon, Oct 22, 2012 at 4:45 PM, Phan, Linh H (3443) <Lin...@jp...> wrote:

Hi Kapil,

I believe I have asked gdb to "continue" the inferior process. These are the steps I used:

1. Run: dmtcp_coordinator
2. Run: dmtcp_restart ckpt_a.out_c273c1-40000-5085ab94.dmtcp
3. After the a.out process has completed restarting, run: gdb a.out 5610
   ...
   Reading symbols from /usr/local/src/dmtcp-trunk/dmtcp/src/../../lib/libmtcp.so.1...done.
   Loaded symbols for /usr/local/src/dmtcp-trunk/dmtcp/src/../../lib/libmtcp.so.1
   0x00007f4dc98e651d in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
   (gdb) continue
   Continuing.
4. At the dmtcp_coordinator window, type "c":
   c
   [5588] TRACE at dmtcp_coordinator.cpp:534 in handleUserCommand; REASON='checkpointing...'
   [5588] NOTE at dmtcp_coordinator.cpp:1372 in startCheckpoint; REASON='starting checkpoint, suspending all nodes'
        s.numPeers = 1
   [5588] NOTE at dmtcp_coordinator.cpp:1374 in startCheckpoint; REASON='Incremented Generation'
        UniquePid::ComputationId().generation() = 7

If I type "c" again, it just prints the following (and keeps printing the same thing each time I type "c"):

   c
   [5588] TRACE at dmtcp_coordinator.cpp:534 in handleUserCommand; REASON='checkpointing...'
   [5588] TRACE at dmtcp_coordinator.cpp:1385 in startCheckpoint; REASON='delaying checkpoint, workers not ready'
        s.minimumState = WorkerState::RUNNING
        s.numPeers = 1

Thank you Kapil for taking a look,
Linh

PS I got my dmtcp-trunk on 2012-09-13. Below is my a.out program:

   #include <stdio.h>   /* printf */
   #include <unistd.h>  /* sleep */

   int main(void)
   {
       int cnt = 0;
       while (1) {              /* print a counter once per second, forever */
           printf("%d\n", cnt);
           cnt++;
           sleep(1);
       }
   }

From: Kapil Arya [mailto:ka...@cc...]
Sent: Monday, October 22, 2012 10:28 AM
To: Phan, Linh H (3443)
Cc: dmt...@li...
Subject: Re: [Dmtcp-forum] Can't checkpoint a "gdb attach" process when started using dmtcp_restart

Hi Linh,

To allow a gdb-attached process to checkpoint, you have to ask gdb to "continue" the inferior. That way, the process can receive the checkpoint message from the dmtcp_coordinator.

BTW, when you attach to the restarted process, has the process completed restarting? If yes, then you can use "gdb <binary-name> <pid>" to attach to it.

Kapil

On Wed, Oct 17, 2012 at 12:38 AM, Phan, Linh H (3443) <Lin...@jp...> wrote:

Hi,

I can't checkpoint a "gdb attach" process when started using dmtcp_restart.
The dmtcp_restarted process just hangs when I type "c" in the dmtcp_coordinator window (if I've attached to that dmtcp_restarted process with gdb; if I've not attached with gdb, the "c" command works fine).

I've noticed that QUICK-START says to use "mtcp_restart" and not "dmtcp_restart" when attaching using gdb, but when I do that, I get this error:

   $ mtcp_restart myprog.dmtcp
   [0] mtcp_restart.c:219 main: 'myprog.dmtcp' is 'DMTCP_CHECKP', but this restore is 'MTCP64-V1.0' (fd=4)
   $

Any ideas?

Thank you,
Linh