From: Orion P. <or...@co...> - 2012-10-10 17:07:04
|
So I started a bash script via dmtcp_checkpoint and I now have:

orion 23499 23489 0 10:43 ? 00:00:00 /bin/bash /var/spool/gridengine/castor/job_scripts/27685 ./foo 7200 out.qsub
orion 23506     1 0 10:43 ? 00:00:00 /usr/bin/dmtcp_coordinator --port 0 --exit-on-last --interval 0 --background
orion 23512 23499 0 10:43 ? 00:00:00 ./foo 7200 out.qsub

If I send the USR2 signal with kill -USR2 23499, it doesn't do anything. Is bash perhaps blocking the signal? There are no trap statements in the script.

--
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                    or...@nw...
Boulder, CO 80301                     http://www.nwra.com |
From: Kapil A. <ka...@cc...> - 2012-10-15 10:37:14
|
Hi Orion,

DMTCP uses SIGUSR2 internally for checkpointing purposes. If you want to change which signal it uses, you can pass the --mtcp-checkpoint-signal option.

Kapil |
From: Orion P. <or...@co...> - 2012-10-15 14:50:28
|
I thought sending USR2 to a DMTCP-started process would trigger a checkpoint. I'm not seeing that happen with this test. |
From: Kapil A. <ka...@cc...> - 2012-10-15 19:14:16
|
Oh, SIGUSR2 is only used internally. The checkpoint-manager thread sends SIGUSR2 to the user threads after completing some preparatory steps, so sending the signal yourself does not start a checkpoint. To checkpoint the process, you need to use "dmtcp_command [-p <coordinator-port>] -c". Alternatively, you can give the coordinator the "c" command. |
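For readers following along, the dmtcp_command invocation from the reply above can be wrapped in a small shell function. This is only a sketch: request_checkpoint and the DRY_RUN variable are hypothetical names introduced here, not part of DMTCP, and the port 7779 is a placeholder. The coordinator in the original post was started with --port 0, so the real port has to be looked up before it can be passed in.

```shell
#!/bin/sh
# Hypothetical helper (not part of DMTCP): ask the coordinator for a
# checkpoint using the form quoted in the thread,
#   dmtcp_command [-p <coordinator-port>] -c
request_checkpoint() {
    # $1 is the coordinator port; leave it out to omit the -p option.
    if [ -n "${1:-}" ]; then
        set -- dmtcp_command -p "$1" -c
    else
        set -- dmtcp_command -c
    fi

    # DRY_RUN is an assumption of this sketch: when set to 1, the command
    # is printed instead of executed, so it can be inspected first.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Placeholder port; prints the command line without running it.
DRY_RUN=1 request_checkpoint 7779
```

Without DRY_RUN=1 the function runs dmtcp_command itself, which requires DMTCP to be installed and a coordinator to be listening on the given port.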