Re: [Queue-developers] Feedback requested on detailed plans and code for contrib project

On Fri, 15 Sep 2000, Monica Lau wrote:

> Someone also brought up the idea of a back-up queue_manager.  There
> would be a master queue_manager and a slave queue_manager.  The slave
> queue_manager has all the information that the master queue_manager has,
> so if the slave ever detects that the master has failed, it will take
> over the master's role until the master starts back up again.  Is this
> similar to what you have in mind?

I think it would do the trick.  I had been thinking of a system that
ranks all machines in the cluster as candidates to take over the
queue_manager, but perhaps that's overkill.
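
To make the ranking idea concrete, here is a rough sketch in Python.
It is not taken from queue's code: the names pick_manager, ranking and
last_seen, and the 10-second timeout, are all made up for illustration.
It assumes every machine somehow learns when each other machine was last
heard from.  The manager is then simply the highest-ranked machine that
has been heard from recently.

```python
def pick_manager(last_seen, ranking, now, timeout=10.0):
    """Return the highest-ranked host with a recent heartbeat, or None.

    last_seen maps host name -> time of its last heartbeat (seconds).
    ranking lists host names from most to least preferred.
    All names and the timeout value are hypothetical.
    """
    for host in ranking:
        seen = last_seen.get(host)
        # A host counts as alive if it was heard from within the timeout.
        if seen is not None and now - seen <= timeout:
            return host
    return None


# Hypothetical three-machine cluster; node1 has gone quiet.
ranking = ["node1", "node2", "node3"]
last_seen = {"node1": 100.0, "node2": 108.0, "node3": 109.0}
print(pick_manager(last_seen, ranking, now=112.0))  # prints node2
```

The master/slave pair you describe is just the special case of a
two-entry ranking, and the master takes its role back as soon as its
heartbeat is seen again.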

A more general concern, not specific to your system but perhaps
relevant to the discussion, is how to avoid machines becoming
specialized in a cluster.  We have an imperfectly clustered system in my
setup, and when I take down a machine I always have to remember which
services I have to fail over to another machine before I take it down (or
after it crashes).  License managers are of course a particular problem
to fail over, since they often depend on hardware checks to make sure
they're running on the designated machine, and I'm curious how you deal
with that or would deal with it (please indulge my laziness in not
reading the documentation carefully enough, if it's in there).  So it
would be nice to have queue be part of the solution and not another
such specialized service, but maybe this is only a pet peeve and not a
big problem.

Also, on a cluster of say 50 or 100 machines, having one backup might 
not be enough.  But then I have no idea how well queue works on clusters 
that large.  (Is anyone out there running such a thing?)  Perhaps things 
like sub-clusters are more basic to such a problem than worrying about 
having enough backups.

Anyway, sorry for thinking out loud on a large list.  Great work.

Cheers,
Tavis