From: Bryan T. <br...@sy...> - 2010-07-20 18:19:57
Fred,

Zookeeper is used by the ServiceManagerServices (SMS) to make shared decisions about which nodes will start which services. The configuration information for the services includes optional constraints on the nodes on which they can run, on the services which must be running before they can start (and I agree with you that services should start without such preconditions and await the appropriate events), etc.

Each SMS enters into a queue for each logical service type registered against zookeeper. When an SMS is at the head of that queue it has the "lock" (this is the zookeeper lock pattern). It then decides whether or not it can start that service type given the node's capabilities and the constraints (if any) for that logical service type. If it can, it starts the service. This decision-making process needs to be globally atomic to prevent concurrent starts of the same logical service on different nodes.

Bryan

> -----Original Message-----
> From: Fred Oliver [mailto:fko...@gm...]
> Sent: Tuesday, July 20, 2010 1:20 PM
> To: Bryan Thompson
> Cc: Bigdata Developers
> Subject: Re: [Bigdata-developers] Why zookeeper?
>
> On Tue, Jul 20, 2010 at 12:38 PM, Bryan Thompson
> <br...@sy...> wrote:
> > Fred,
> >
> > We should have a separate discussion concerning how bigdata
> > allocates and starts services. I am a bit crushed for time
> > right now, but maybe we could take that up next week?
>
> I understand you'll be unavailable. Please pick up the
> conversation when you can.
>
> > We only use global synchronous locks at the moment in
> > service startup logic, HA, and locking out different masters
> > for the same distributed job. However, I think that specific
> > applications of bigdata may well want to use global
> > synchronous locks to make operations such as life cycle
> > management of a triple or quad store atomic.
> >
> > Concerning your example, service #5 is either running or it
> > is not, but we also need to know whether or not it has been
> > created.
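[The zookeeper lock pattern described at the top of this message, where each SMS enters a per-service-type queue and the member at the head of the queue holds the lock, can be sketched roughly as below. This is only an in-memory simulation of the ordering rule; class and method names are hypothetical, and a real implementation would create EPHEMERAL_SEQUENTIAL znodes against a zookeeper ensemble and watch the predecessor node rather than poll.]

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the zookeeper "lock pattern": each SMS enters the queue for a
// logical service type by creating a sequential node; the SMS whose entry has
// the lowest sequence number is at the head of the queue and holds the lock.
// Simulated in memory; no real zookeeper ensemble is involved.
class ServiceTypeLockQueue {
    private final List<Long> entries = new ArrayList<>();
    private long nextSeq = 0;

    /** Simulates create(..., EPHEMERAL_SEQUENTIAL): returns this SMS's sequence number. */
    synchronized long enter() {
        long seq = nextSeq++;
        entries.add(seq);
        return seq;
    }

    /** An SMS holds the lock iff its entry has the lowest sequence number in the queue. */
    synchronized boolean holdsLock(long seq) {
        return !entries.isEmpty() && Collections.min(entries) == seq;
    }

    /** Simulates deleting the znode when the SMS is done (or its session expires). */
    synchronized void leave(long seq) {
        entries.remove(Long.valueOf(seq));
    }
}
```

[While an SMS holds the lock it can atomically check the service type's constraints against its node's capabilities and start the service, which is what makes the decision globally atomic across nodes.]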
> > Bigdata services have huge amounts of persistent
> > state. We cannot simply substitute another service as a
> > replacement for #5 if it should fail. Instead we have to
> > recruit a new service, synchronize it with the state of the
> > quorum to which #5 belonged, and then bring the new service
> > on atomically once it is caught up with the quorum. This
> > "hot spare" allocation process could take hours to bring a
> > newly recruited node into full synchronization. We speed
> > that up by working backwards in history so the data service
> > quickly has a view of the current commit point and then
> > builds up its history over time.
>
> I specifically meant to disregard HA for this conversation,
> as bigdata has a long history with zookeeper outside of HA.
> HA is a much longer conversation, and I don't want to confuse
> tomorrow's issues with today's code.
>
> With that said, I don't understand what "whether or not it
> has been created" means. DataService #5 could be called
> "created" when the operator wrote (DataService, host H,
> persistence directory D) in a config file on host H and
> created directory D with #5 in it. After that, the service is
> either running, or not running.
>
> Please explain where global synchronous locks are needed in
> service startup logic?
>
> Fred
>
> > Bryan
> >
> >> -----Original Message-----
> >> From: Fred Oliver [mailto:fko...@gm...]
> >> Sent: Tuesday, July 20, 2010 10:59 AM
> >> To: Bryan Thompson
> >> Cc: Bigdata Developers
> >> Subject: Re: [Bigdata-developers] Why zookeeper?
> >>
> >> On Tue, Jul 20, 2010 at 9:24 AM, Bryan Thompson <br...@sy...>
> >> wrote:
> >>
> >> > In this regard it is more flexible than a Jini system with
> >> > support for creating global synchronous locks.
> >>
> >> I believe that global synchronous locks are more harmful than
> >> helpful, in general. That's why I would like to know how global
> >> synchronous locks help bigdata (not that zookeeper is a
> >> bad way to do it if necessary).
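[The "working backwards in history" catch-up quoted above can be illustrated with a small sketch: the recruited data service copies the most recent commit point first, so it quickly has a view of the current state, then back-fills older commit points to extend the history it can answer reads against. Class and method names here are hypothetical, not bigdata's actual replication code.]

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: models the order in which a newly recruited data service
// would replicate commit points, newest first, and the read horizon it can
// serve after each step.
class HistoryBackfill {
    /** Given commit times oldest-first, return the copy order: newest first. */
    static List<Long> copyOrder(List<Long> commitTimesOldestFirst) {
        List<Long> order = new ArrayList<>(commitTimesOldestFirst);
        Collections.reverse(order); // current commit point is replicated first
        return order;
    }

    /** After copying the first k entries of copyOrder, the earliest readable commit time. */
    static long historyHorizon(List<Long> copyOrder, int k) {
        return copyOrder.get(k - 1); // horizon moves backwards as more history arrives
    }
}
```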
> >> > Services store their configuration state locally for
> >> > restart in the service directory. However, we also need to know
> >> > which persistent services exist, even if they are not
> >> > running at the moment. That information is captured in zookeeper.
> >> > For example, if the target #of logical data services (LDS) is 10,
> >> > then we do not want to start a new data service if one goes down,
> >> > because the persistent state of the data service can be terabytes
> >> > of files on local disks and is part of the semantics of its
> >> > service "identity".
> >>
> >> I'm not clear about the problem you are describing. Say we have a
> >> DataService #5 configured on host H with a persistence directory D
> >> containing its UUID, journals, indices, etc. It is either running and
> >> registered with the lookup service (with the UUID) or not. If the
> >> service starter/manager on host H needs this service to be running
> >> and it is not registered, start it. (There is a strictly local problem
> >> of starting a duplicate java process because of race conditions, but
> >> the service itself should detect and prevent that.) I don't see how
> >> global synchronization is involved here.
> >>
> >> Can you give another example of the need for global synchronization
> >> (excluding HA) or point out what I am missing?
> >>
> >> Fred
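[The "strictly local problem of starting a duplicate java process" that Fred mentions is commonly solved without any global coordination: the service takes an exclusive OS-level file lock on a file in its persistence directory at startup, so a second JVM started by a race simply fails fast. A minimal sketch, assuming a hypothetical ".lock" file in the service directory; this is not bigdata's actual code.]

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileLock;

// Local duplicate-start guard: holds an exclusive file lock on
// <serviceDir>/.lock for the life of the process.
class ServiceDirLock {
    private final RandomAccessFile raf;
    private final FileLock lock; // held until release() or process exit

    private ServiceDirLock(RandomAccessFile raf, FileLock lock) {
        this.raf = raf;
        this.lock = lock;
    }

    /** Returns the held lock, or null if another process owns the service directory. */
    static ServiceDirLock tryAcquire(File serviceDir) {
        try {
            RandomAccessFile raf = new RandomAccessFile(new File(serviceDir, ".lock"), "rw");
            FileLock lock = raf.getChannel().tryLock();
            if (lock == null) { // some other JVM is already running this service
                raf.close();
                return null;
            }
            return new ServiceDirLock(raf, lock);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void release() {
        try {
            lock.release();
            raf.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

[The OS releases the lock automatically if the JVM dies, so a crashed service does not leave the directory permanently locked.]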