[Saturn-devel] proposal for needed functionality
From: Christian S. <Chr...@gm...> - 2005-04-15 05:57:17
hello list,
sorry for diving straight in with what follows. i have been considering
a similar project for quite some time, but i would like to suggest a
slightly different direction and a wider scope.
i need more than a scheduler for a single machine: i need one for a
cluster of machines, where every machine runs its own scheduler and all
of them can be administered centrally. besides that, high reliability
would be a must for such a service.
i would say such a service should have roughly the following parts:
- central configuration for the whole cluster of machines. each machine
in the cluster belongs to a certain type, and you specify the "crontab"
per type of machine. this configuration should allow for parameters
like ${hostname} or similar.
- the scheduler itself. it should not only schedule the start of tasks,
but also let you set timeouts for which you can configure alternative
tasks to run or error messages to be sent. e.g. if a task is triggered
by the arrival of a file, you should be able to say that the file has
to arrive every day by 22:00, or else this is an error condition.
- the tasks to execute. these tasks are pieces of "functionality" that
take arguments like the time of execution or the name of the file which
just arrived and triggered the execution of this task. it is necessary
that these parameters are passed to the task as arguments, rather than
discovered by the task at run-time, because of the following point:
- a local and a remote entry point (you should be able to trigger every
task by hand with arguments either locally from the commandline or
remotely via a network protocol)
- logging of info and error messages and
- a system that reacts to error events and autonomously starts
"recovery" tasks
perhaps the system should even contain a reliable file transfer
mechanism to feed the results of one task on one machine into another
task on another machine in the cluster.
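to make the timeout item from the list above a bit more concrete, here is a minimal sketch of the "file has to arrive by 22:00" rule. all names (ArrivalDeadline, Outcome, check) are hypothetical and not part of saturn, and the sketch assumes that exactly 22:00 still counts as on time.

```java
import java.time.LocalTime;

// hypothetical sketch, not saturn code: decides what a scheduler should do
// for a file-triggered task that has a daily arrival deadline.
class ArrivalDeadline {
    enum Outcome { RUN_TASK, WAIT, RUN_ALTERNATIVE }

    private final LocalTime deadline;

    ArrivalDeadline(LocalTime deadline) {
        this.deadline = deadline;
    }

    // 'fileArrived' says whether today's file is present, 'now' is the local time.
    Outcome check(boolean fileArrived, LocalTime now) {
        if (fileArrived) {
            return Outcome.RUN_TASK;        // the trigger fired: start the normal task
        }
        if (now.isAfter(deadline)) {
            return Outcome.RUN_ALTERNATIVE; // deadline missed: error condition
        }
        return Outcome.WAIT;                // still inside the allowed window
    }
}
```

a scheduler could call check() periodically and map RUN_ALTERNATIVE to whatever was configured for that timeout, e.g. an alternative task or an error message. what should happen when the file arrives after the deadline is left open here.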
i personally would suggest using java as the implementation basis for
such a cron replacement, because then it will be easily portable.
besides that, i would suggest describing tasks as ant tasks in xml
files. there is already a huge base of predefined ant tasks which are
ideal for most system maintenance activities, and ant has already
proven to run on all kinds of different platforms. in addition, you
should be able to write tasks directly in java if they cannot be
expressed as ant tasks.
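as an illustration of a task written directly in java that receives everything as arguments, here is a rough sketch. the interface and class names (ClusterTask, ReportArrival, TaskCli) and the key=value argument format are assumptions made for this example only; they are not an existing saturn or ant api.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// hypothetical interface: every input a task needs arrives as an argument,
// so the same task can be started by the scheduler, by hand, or remotely.
interface ClusterTask {
    String run(Map<String, String> args);
}

// example task that reports on the file which triggered it.
class ReportArrival implements ClusterTask {
    @Override
    public String run(Map<String, String> args) {
        return "processing " + args.get("file") + " at " + args.get("time");
    }
}

class TaskCli {
    // turn "key=value" command-line words into the argument map;
    // words without a key before '=' are ignored.
    static Map<String, String> parseArgs(String[] words) {
        Map<String, String> args = new LinkedHashMap<>();
        for (String word : words) {
            int eq = word.indexOf('=');
            if (eq > 0) {
                args.put(word.substring(0, eq), word.substring(eq + 1));
            }
        }
        return args;
    }

    // local entry point, e.g.: java TaskCli file=/data/in.csv time=21:30
    public static void main(String[] argv) {
        System.out.println(new ReportArrival().run(parseArgs(argv)));
    }
}
```

because the task never looks up its own inputs, a remote entry point would only have to build the same argument map and call run().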
java supports all kinds of remote network interfaces like rmi, corba,
soap, jini, ... which you could use to manually trigger tasks on any
machine of the cluster. in this remote scenario you would have to
provide some sort of authentication scheme so that you can restrict the
access to your defined tasks.
the central configuration for the tasks, and possibly the logging, would
come from or go to a database, and there are jdbc drivers for basically
all major databases out there.
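wherever the central configuration is stored, each machine would have to fill in parameters like ${hostname} before using its "crontab". a small sketch of that expansion step follows; the class name ConfigTemplate, the sample entry in the usage note and the rule that unknown parameters are left untouched are assumptions for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// hypothetical helper: expands ${name} parameters in one configuration line.
class ConfigTemplate {
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    static String expand(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // assumption: unknown parameters stay as they are instead of being guessed
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in the value from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

with hostname set to web01, the made-up entry "backup --host ${hostname}" would expand to "backup --host web01".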
your description of saturn says "Job Scheduler for Networks. Control
local and remote job scheduling through simple commands". i am not sure
whether that means that you have a central scheduler which triggers
commands on remote machines, or that you have schedulers on all
machines of the cluster which can be administered from a central point.
i would prefer the second, so that the scheduler can do its job even if
no networking is available.
because of the possibility of network failures the logging mechanism
should allow for queuing messages so that when the network comes back
the messages can be sent to the central collection point of all logging
messages.
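the queuing idea could look roughly like the sketch below. LogSink is a hypothetical stand-in for the connection to the central collection point, and the queue lives only in memory, so a real implementation aiming for high reliability would probably have to persist it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// hypothetical stand-in for the link to the central log collector.
interface LogSink {
    boolean send(String message); // returns false while the network is down
}

class QueuingLogger {
    private final Deque<String> pending = new ArrayDeque<>();
    private final LogSink sink;

    QueuingLogger(LogSink sink) {
        this.sink = sink;
    }

    // queue the message first, then try to deliver everything that is waiting.
    void log(String message) {
        pending.addLast(message);
        flush();
    }

    // send queued messages oldest first and stop at the first failure,
    // so that the original order is kept for the next attempt.
    void flush() {
        while (!pending.isEmpty()) {
            if (!sink.send(pending.peekFirst())) {
                return;
            }
            pending.removeFirst();
        }
    }

    int pendingCount() {
        return pending.size();
    }
}
```

messages logged while the sink refuses them simply pile up; the next successful log() or flush() call delivers them in their original order.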
if you could agree to most of the points above then you have an
additional developer for your project.
thanks,
--
Christian Schuhegger
http://www.el-chef.de/