Re: [Clockwork-developers] Architecture of job scheduler
From: Charles B. H. <br...@do...> - 2002-01-11 19:05:22
----- Original Message -----
From: "Shawn McMahon" <smc...@ei...>
To: <clo...@li...>
Sent: Friday, January 11, 2002 10:03 AM
Subject: Re: [Clockwork-developers] Architecture of job scheduler

> Another concern would be clocks; it's easier to keep one machine synced
> than dozens.

A valid question may be whether our tool should be responsible for time synchronization. The users could always use NTP to keep the clocks on their systems synchronized (to a certain degree). NTP can generally keep system clocks within a second of each other, so the question is whether it would be valuable to keep system clocks more closely synchronized than is possible with the standard tools like NTP.

As we're talking mostly about batch processing, it seems relatively unlikely to me that I would run into a situation where I need to start jobs on multiple systems with that much accuracy. Even supposing that were the case, the developer would likely want to be using real-time programming techniques, which would make the use of a scheduler like we're discussing out of the question.

In a decentralized configuration like Joel suggested, I'm thinking it would probably be enough to have the servers start jobs according to their own system clocks, and let the system administrators worry about keeping the system clocks as closely synchronized as they need. Many routers participate in NTP time synchronization, and I'd guess that most large server network installations are configured for NTP as well. (I even run a server at my house to keep all of my PCs' clocks in sync.)

> > event processor (to borrow a term from AutoSys). And a SQL-based database
> > would make things easier to work with from a development perspective, but
> > not until now did I realize that it might make the system less attractive
> > for a user, since they would have to manage another database.
>
> No reason we can't make the database a part of the server program, and
> not use a full-blown SQL, is there?
That's certainly an option, but if we could achieve the same, or even better, performance (one of AutoSys' disadvantages) without having a centralized server to depend on and without having a full-fledged SQL database that humans have to manage, I'd be all for it.

If we choose to use a full SQL database, we have two possible routes: either choose a database that everyone will have to use, or decide to support multiple database platforms. If I wanted to use a full-fledged database, I would want to take advantage of some of the more advanced features of the database engine (let's say we had one that supports transactions, replication, triggers, etc). To use those features we'd need to go with a single database platform. Trying to use, for example, JDBC, and letting the user choose the database platform would mean that we wouldn't be able to use features that aren't commonly supported.

If we go centralized, I think we might as well pick a full-fledged SQL database platform (hopefully a free one) and standardize on that. A centralized system means we're dependent on a single server, and if that's the case, that server is liable to be very, very busy. If we wanted to have redundant servers be an option, we'd really need some database replication to do it right, so there's really no alternative to a real database.

However, if we decide to decentralize the application, perhaps we should investigate using something smaller, like maybe Berkeley DB, that may not be SQL, but might have enough functionality to manage the jobs for a single machine, and do a good job at it. Alternatively, we could choose to store the data in XML, and load it into the data structures we're using in whatever language(s) we choose. Or, if we could find an SQL engine that doesn't require human management and is small enough to include with builds of our application, that would be cool too.

Just my $0.02

-Brian
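
To make the decentralized option a bit more concrete, here is a rough sketch of a per-machine job store that needs no separately managed database. It is only an illustration under stated assumptions: it uses Python with the sqlite3 module from its standard library as a stand-in for "a small SQL engine shipped with the application", and the table layout and the function names (open_job_store, add_job, claim_due_jobs) are made up for this example. Nothing here is a decided Clockwork design. Each server decides what is due purely from its own system clock, leaving clock synchronization to NTP or the administrators.

```python
import sqlite3
import time


def open_job_store(path=":memory:"):
    """Open (or create) this machine's job store; no database server is involved."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS jobs (
            name    TEXT PRIMARY KEY,
            command TEXT NOT NULL,
            run_at  REAL NOT NULL,              -- start time, seconds since the epoch
            started INTEGER NOT NULL DEFAULT 0  -- 1 once the job has been handed out
        )
        """
    )
    return conn


def add_job(conn, name, command, run_at):
    """Register a job to start at run_at (hypothetical schema, see above)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO jobs (name, command, run_at) VALUES (?, ?, ?)",
            (name, command, run_at),
        )


def claim_due_jobs(conn, now=None):
    """Return (name, command) for jobs due by the local clock and mark them started."""
    if now is None:
        now = time.time()  # this machine's own system clock
    with conn:
        rows = conn.execute(
            "SELECT name, command FROM jobs "
            "WHERE started = 0 AND run_at <= ? ORDER BY run_at",
            (now,),
        ).fetchall()
        conn.executemany(
            "UPDATE jobs SET started = 1 WHERE name = ?",
            [(name,) for name, _ in rows],
        )
    return rows
```

A scheduler loop on each server could call claim_due_jobs() every few seconds and launch whatever comes back. Passing a file path instead of ":memory:" would keep the jobs across restarts. The same shape would work on top of a key/value store like Berkeley DB, at the cost of doing the "which jobs are due" filtering in application code.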