From: gene <ge...@cc...> - 2013-10-28 19:40:02
> Perhaps this is something that is handled by the Torque plugin?

Yes, that's correct. You'll need to use the DMTCP plugin for Torque.
Artem Polyakov is supporting that, and I'm cc'ing to him. Among other
issues, mount points can change and network addresses can change on
restart. The plugin tries to handle that.

Please let us know if you have any trouble using the Torque plugin.

Best,
- Gene

On Mon, Oct 28, 2013 at 03:10:51PM -0400, Bryan F Putnam wrote:
>
> Dear DMTCP developers,
>
> I've found that when restarting a multi-node job, dmtcp_restart only appears to be aware of the local host. Is it possible to tell dmtcp_restart which hosts are currently available for a job restart, whether it's the same set of multiple hosts, or a completely different set of hosts?
>
> Typically our hosts are contained in $PBS_NODEFILE since we use Torque. Perhaps this is something that is handled by the Torque plugin?
>
> Thanks,
> Bryan
>
> --
> Bryan Putnam
> Senior Scientific Applications Analyst
> Rosen Center for Advanced Computing, Purdue University
> Young Hall (Rm. 910)
> 155 S. Grant St.
> West Lafayette, IN 47907-2114
> Ph 765-496-8225 Fax 765-496-2275
> bf...@pu...
> www.purdue.edu/itap