[Queue-developers] RE: no luck with latest development Queue
From: Hazelrig, C. C. (C. - Simtech) <Chr...@hw...> - 2001-04-18 14:48:20
Hi, Monica.

I left queue_manager, queued, and task_manager running overnight, and now
host1 and host2 are listed in the status file under DEFECTIVE_SERVERS. I
guess I'll jump into the debugger and try to determine why the two don't
seem to be communicating properly. It's weird, though, that queue_manager
isn't detecting host1, since that's the node it's running on. I get the
same results when attempting to submit a job to host1 as with host2
yesterday.

Chris

> -----Original Message-----
> Date: Tue, 17 Apr 2001 22:42:05 -0700 (PDT)
> From: Monica Lau <la...@cs...>
> To: que...@li...
> Subject: Re: [Queue-developers] no luck with latest development Queue
> Reply-To: que...@li...
>
> Hi Chris,
>
> When the queue_manager first starts up, the hosts are in the VALIDHOSTS
> list. The queued's must periodically send update messages to the
> queue_manager; the period between these update messages is defined in the
> queue_define.h file (MAX_MODULO and MIN_MODULO). When the queue_manager
> hears from these queued's, then it will move these hosts from the
> VALIDHOSTS list to the AVAILHOSTS list. So, just wait a bit before
> submitting your jobs.
>
> I hope this helps.
>
> Regards,
>
> Monica
>
>
> On Tue, 17 Apr 2001, Hazelrig, Chris C. (Contractor - Simtech) wrote:
>
> > Greetings,
> >
> > Having a few problems with latest Queue development version. I'm trying
> > it with just two nodes (host1, host2), each running RedHat Linux 6.2. On
> > the master (host1), I am running queued (queued --debug --foreground),
> > queue_manager, and task_manager. On the slave (host2), I am running
> > queued (queued --debug --foreground) and task_manager. From host1 I
> > execute the following:
> >
> >     queue -D -i -w -n -a dummylicense -H host2 -- hostname
> >
> > I get the following error message:
> >
> >     Queue.c Error: no |'s allowed in Queue software
> >
> > If I remove the -H option OR the -D option, no error is reported, but no
> > result is reported either. The command seems to go off into the weeds
> > and never comes back. The status file says host1 and host2 are
> > VALIDHOSTS but lists nothing as AVAILHOSTS, and the submitted job is
> > listed under HIGH_WAITING. It appears that queue_manager thinks both
> > nodes are busy. Upon submitting the job, queue_manager reports "After
> > getting licenses" and then "After getting user's environment" and then
> > "Timed out". Attempting to kill the job with ^C returns to the command
> > line, but doesn't actually kill the job; it is still listed in the
> > status file. Using task_control -k <JOB_ID> does kill it, and queue
> > returns the following message:
> >
> >     Queue.c Error: did not get an assigned host
> >
> > I'm stumped. Any thoughts?
> >
> > Thanks in advance,
> > Chris
> >
> > _______________________________________________
> > Queue-developers mailing list Que...@li...
> > To unsubscribe, subscribe, or set options:
> > http://lists.sourceforge.net/lists/listinfo/queue-developers
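
The host-list behaviour Monica describes in the quoted reply (hosts start in VALIDHOSTS and move to AVAILHOSTS once queue_manager hears from their queued) can be pictured with a small toy model. This is only a sketch under assumptions, not Queue's actual code: the class `HostTable` and its methods `on_update` and `pick_host` are hypothetical names, and the model deliberately leaves out DEFECTIVE_SERVERS, since the thread does not say what puts a host on that list.

```python
class HostTable:
    """Toy model of the two host lists from the thread (hypothetical, not Queue's code)."""

    def __init__(self, hosts):
        # At startup every configured host is only in VALIDHOSTS.
        self.validhosts = set(hosts)
        self.availhosts = set()

    def on_update(self, host):
        """Handle a periodic update message from the queued running on `host`."""
        if host in self.validhosts:
            # Hearing from a host moves it from VALIDHOSTS to AVAILHOSTS.
            self.validhosts.remove(host)
            self.availhosts.add(host)

    def pick_host(self):
        """Return an available host, or None if a submitted job would have to wait."""
        return min(self.availhosts) if self.availhosts else None


table = HostTable(["host1", "host2"])
before = table.pick_host()   # no queued has reported yet, so nothing is available
table.on_update("host2")     # first update message arrives from host2
after = table.pick_host()
print(before, after)         # prints: None host2
```

In this model, a job submitted before any update message arrives has no host to run on, which matches the reported status file: both hosts under VALIDHOSTS, nothing under AVAILHOSTS, and the job left waiting.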