From: Bryan T. <br...@sy...> - 2010-06-24 15:27:08
Brian,

No, I was confused. I did not see zookeeper and had assumed that it was not running; hence the rest of my questions.

I do think that we should schedule a call to talk about how services will be started and restarted, because this all interacts with the HA quorum logic. For example, hot spare recruitment, the target replication factor, and the actual replication factor for a highly available service all interact, so the logic for starting those services has to coordinate with the HA logic.

The quorums depend on having a simple majority. This is built around a service replication factor, k, which is an odd positive integer. k := 1 is not highly available. k := 3 is highly available, and a minimum of (k+1)/2 = 2 services must be running for the quorum to meet. If we start more than k services, then this can break the quorum logic. Right now I have presumed a dependency on zookeeper and the existing services manager service to provide a distributed guarantee that we start exactly k services.

Planned downtime and hot spare recruitment are both tricky issues for HA. We have to annotate a service when it is brought down deliberately, e.g., for a rolling code base update, to prevent it being treated as a failure and having a hot spare automatically recruited. Likewise, when a hot spare is recruited we have to pay careful attention to how it joins the write replication pipeline and when it joins the quorum.

If we follow a path where service start is not linked to the configuration information in zookeeper and the service management services, then this is all stuff that we need to work through together. I think that we should do this soon -- ideally before I proceed with the zookeeper quorum integration based on the existing design. I'd be happy to talk through the quorum design on the call as well.

Maybe we can do this in three pieces.
One on the quorum work that I have been doing, one on the deploy/config work that you have been doing, and then an open discussion on how these things could be used to provide the flexibility and high availability, and how they interact with hot spare recruitment.

Thanks,
Bryan

________________________________
From: Brian Murphy [mailto:btm...@gm...]
Sent: Thursday, June 24, 2010 11:07 AM
To: big...@li...
Subject: Re: [Bigdata-developers] Alternate install/deploy mechanism

On Wed, Jun 23, 2010 at 8:23 PM, Bryan Thompson <br...@sy...<mailto:br...@sy...>> wrote:

> Right now, bigdata depends on leader election semantics from zookeeper to start the appropriate mixture of services. I did not see zookeeper running so I presume that you are handling that differently in this example.

No, zookeeper was running. If you run the disco-tool (or a jini browser), you should see a service of type com.bigdata.service.QuorumPeerService, which is zookeeper wrapped in a Jini service. Wrapping zookeeper in a Jini service not only provides a means to more easily start and stop zookeeper, but also a means to dynamically discover zookeeper in the federation. Furthermore, the QuorumPeerService interface provides a mechanism to customize how the services interact with zookeeper, if desired.

> I would like to understand how we would handle the distributed decision making necessary to start an appropriate mixture of services with this proposal, and also how we would handle the distributed decision making required to support the HA quorums. I've attached an updated version of my draft for the HA quorum design and the proposed zookeeper integration.

Rather than using zookeeper to decide what gets started, this mechanism allows one to configure what individual services get started where, including the appropriate number of zookeeper instances.
Zookeeper would then be viewed as a discoverable resource that can be used by the other services to determine who the leader is and whether or not a quorum exists before those services are used.

> I realize that some jini implementations do provide capabilities similar to what zookeeper provides.

I'm not sure which jini implementations you're talking about. Something not in the Jini starter kit?

> Are you suggesting that, or did you simply leave zookeeper and its roles in configuration management, leader elections, etc. out of the demo?

As I said above, zookeeper was not left out. But I also said in my original posting that this work is nowhere near complete, and was posted to give folks an idea of what could be done with install and deployment if the services are re-implemented using a smart proxy model and moved to a shared-nothing architecture; all of which I believe will be a significant amount of work. Perhaps in the future I should hold off on posting until the work is more complete. Sorry if I caused confusion.

BrianM
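The quorum arithmetic and hot-spare recruitment rule that Bryan describes earlier in the thread can be sketched as follows. This is a minimal illustrative sketch only: the function names are hypothetical and it shows just the local decision logic, not the distributed coordination that bigdata delegates to zookeeper and the services manager.

```python
def quorum_majority(k: int) -> int:
    """Minimum number of joined services for a quorum with replication factor k.

    k must be an odd positive integer; k == 1 is not highly available.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("replication factor k must be an odd positive integer")
    return (k + 1) // 2


def quorum_meets(k: int, joined: int) -> bool:
    """True iff enough services have joined for the quorum to meet."""
    # Starting more than k services can break the quorum logic, so the
    # distributed start-up protocol must guarantee exactly k are launched.
    if joined > k:
        raise RuntimeError("more than k services started; quorum logic violated")
    return joined >= quorum_majority(k)


def should_recruit_spare(k: int, running: int, planned_downtime: int) -> bool:
    """Recruit a hot spare only for unplanned failures.

    Services brought down deliberately (e.g. for a rolling code base update)
    are annotated as planned downtime and must not trigger recruitment.
    """
    unplanned_failures = k - running - planned_downtime
    return unplanned_failures > 0
```

For example, with k = 3 the quorum meets once 2 services have joined, and taking one service down for a rolling update (annotated as planned downtime) does not trigger hot-spare recruitment, while an unannotated failure does.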