From: Robert E. <pa...@tu...> - 2004-02-10 06:46:44
Hi, all -

I've got some preliminary support checked in for "daughterships". It's incomplete (there are some tricky issues left), but it's there to provoke discussion. The changes shouldn't affect any other uses.

To recall the observed problem: in very large-scale environments, it can take an appreciable amount of time for each server involved to contact the mothership and get its configuration.

A "daughtership" (perhaps a better analogy would be a "surrogate mothership") relieves some of the pressure by providing an additional "contact point". The daughtership contacts the mothership to collect the entire configuration; from that point, servers may contact the daughtership instead of the mothership to get their own configuration details. The daughtership can tell the servers how to configure and what to do just as well as the mothership can.

Multiple daughterships are supported; a hierarchy of daughterships is also possible (as each daughtership, after getting details from its mother, is a fully capable mothership).

To start a daughtership:

  % export CRMOTHER=host:port        # where is my mother?
  % export CRMOTHERSHIP=host:port    # what "mothership" should I look like?
  % python .../cr/mothership/server/daughtership.py

After that, any other server may be run with CRMOTHERSHIP pointing at the daughtership or at the original mothership, equivalently.

What doesn't work:

- Autostart. There are no node types (yet?) for daughterships, so they cannot be autostarted.

- Dynamic hosts. A daughtership cannot resolve a dynamic host itself; it must contact its mother. Otherwise, multiple daughterships may each resolve the same dynamic host to a different actual hostname, at which point they are no longer equivalent.

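To make the dynamic-host problem concrete, here is a minimal sketch. It is not the mothership code: the class names, the "dynamic-slot" placeholder, and the first-come-first-served binding rule are all assumptions made up for illustration. It only shows why daughterships that keep private binding tables can disagree, while daughterships that defer to one shared table cannot.

```python
# Hypothetical illustration only; none of these names exist in the
# mothership code, and the binding rule is an assumed simplification.

class DynamicHostTable:
    """Binds a dynamic-host placeholder to the first real host that claims it."""

    def __init__(self):
        self._bound = {}

    def resolve(self, placeholder, candidate):
        # The first caller's candidate wins; later callers get that binding back.
        return self._bound.setdefault(placeholder, candidate)


class Daughter:
    """A daughtership that resolves locally unless it is given a mother table."""

    def __init__(self, mother=None):
        self._local = DynamicHostTable()
        self._mother = mother

    def resolve(self, placeholder, candidate):
        table = self._mother if self._mother is not None else self._local
        return table.resolve(placeholder, candidate)


# Case 1: each daughtership resolves on its own, so the bindings diverge.
a, b = Daughter(), Daughter()
independent = (a.resolve("dynamic-slot", "node07"),
               b.resolve("dynamic-slot", "node12"))

# Case 2: both defer to the same mother table, so the bindings agree.
mother = DynamicHostTable()
c, d = Daughter(mother), Daughter(mother)
deferred = (c.resolve("dynamic-slot", "node07"),
            d.resolve("dynamic-slot", "node12"))

print(independent)  # the two entries differ
print(deferred)     # the two entries match
```

In the sketch the shared table is an in-process object; in practice the deferral would be a network round trip to the mother, and that is where the trouble starts.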

But a daughtership in the middle of a connection request (which can trigger dynamic host matching) can't easily ask the mothership for information; it can send the request easily enough, but it can't guarantee that the next information coming from its mother is the correct response (because the mother propagates some commands to its daughterships). The logic to handle this gets spotty after that; you end up having to suspend the connection request and return to the communications loop, waiting until the proper data comes in; further, we lose the advantage of having a daughtership if there are many dynamic hosts...

Providing nodes for daughterships and enforcing a strict hierarchy (i.e. servers may only contact their designated daughterships) would solve both problems, but limit flexibility, introduce complexity (especially with the graphical config tool), and possibly introduce confusion (if a dynamically-hosted crserver points to the wrong mothership by accident, it could still resolve, but ultimately leave one daughtership not knowing what to do with an extra connection, and another one waiting for a missing connection).

I think it would be easier to code for the former (the daughtership contacts the mothership for dynamic host matching) than for the latter (nodes & hierarchy). But the whole thing may be complex enough to not be worth the effort; perhaps these changes should be abandoned, and another mechanism for large clusters considered...

Thoughts?

Bob Ellison
Tungsten Graphics, Inc.