From: Bryan T. <br...@sy...> - 2010-08-03 13:13:11
Fred,

I also do not see how this service can be stateless. It needs to know how many instances of the logical services exist, how many need to be created, and it needs to bind physical instances to logical instances in a manner which protects against network partitions. That assignment really needs to be either static or governed by zookeeper in order to ensure that we do not assign too many instances to a quorum. If you have more than the target number of instances for a quorum, then you have broken the simple majority semantics. Handling this becomes the same as the problem of hot-spare allocation, which is more than I think you want to get into right now.

So, I think that we need to pair your proposal for instance-level configuration with static configuration and static binding of physical services to logical instances.

Bryan

> -----Original Message-----
> From: Fred Oliver [mailto:fko...@gm...]
> Sent: Monday, August 02, 2010 6:39 PM
> To: Bryan Thompson
> Cc: Bigdata Developers
> Subject: Re: [Bigdata-developers] Why zookeeper?
>
> Bryan,
>
> > You are positing a new service which handles the binding of
> > the available physical services to the required logical
> > services. How do you plan to make that logical-to-physical
> > binding service highly available? It seems to me that this
> > centralizes an aspect of the distributed decision making
> > which is currently performed by the ServicesManagerServer.
> > If you make this binding service highly available, will you
> > have recreated a distributed decision-making service? It
> > would of course be a simpler service, since it does not handle
> > service start decisions, but only service bind decisions.
>
> The new (simple, stateless) service I proposed in passing is
> useful only when new, unbound (physical) data services are
> added to the HA cluster. Once new physical data services are
> bound into logical data services, this new service has no
> further useful function and can be shut down. It does not need
> to be highly available.
>
> > Clients in this context refers to (a) ClientServices
> > participating in a bulk load; (b) nodes providing SPARQL end
> > points; and (c) the DataServices themselves, which use the
> > same discovery mechanisms to locate other DataServices when
> > moving shards or executing pipeline joins. The hosts which
> > originate SPARQL requests are HTTP clients, but they are not
> > bigdata federation clients and do not have any visibility
> > into zookeeper, jini, or the bigdata services.
> >
> > Given that "clients" are themselves bigdata services and
> > that Zookeeper scales nicely to 1000s of nodes, why have
> > clients go directly to the services rather than watching the
> > quorum state in zookeeper?
>
> Removing one of two (mostly) redundant discovery mechanisms
> reduces code complexity. The River service discovery manager,
> already in use, scales even better to 1000s of nodes because
> persistence isn't needed and the redundant copies don't need
> to cooperate.
>
> After discovery, why not go directly to the service to ask
> for its state if necessary? It is the same cost as going to
> zookeeper, right? The benefit is that going directly to the
> service involves an easily documented, testable, maintainable
> interface, and any state being persisted is persisted by the service.
>
> Other than general service startup and HA group startup, what
> information is passed through zookeeper?
>
> Fred
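As a footnote to Bryan's point about having zookeeper govern the binding so that a quorum never acquires more than its target number of members, the sketch below shows one way that could look with stock ZooKeeper primitives. It is illustrative only: the znode layout, class name, and parameters are assumptions for the example, not bigdata's actual quorum code. Each physical service claims a member slot under the logical service's znode with an ephemeral sequential child and withdraws if it does not land among the first k claims, so at most k physical instances are ever bound and the simple-majority semantics Bryan describes are preserved.

```java
// Illustrative sketch only -- not bigdata's actual quorum code.  It assumes a
// hypothetical znode layout in which each logical service has a persistent
// znode (e.g. /quorum/<logicalServiceUUID>) and a fixed replication target k.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class QuorumSlotClaim {

    /**
     * Try to bind this physical service to the given logical service.
     *
     * @return true iff this service now holds one of the k member slots.
     */
    public static boolean claimSlot(final ZooKeeper zk,
            final String logicalServicePath, // e.g. "/quorum/<uuid>" (must exist)
            final byte[] physicalServiceId,  // serialized physical service UUID
            final int k)                     // target #of instances for the quorum
            throws KeeperException, InterruptedException {

        // Create an ephemeral, sequential child.  The server assigns a
        // zero-padded, monotonically increasing suffix, which totally orders
        // all claim attempts.  The node vanishes if our session dies, which
        // frees the slot automatically.
        final String created = zk.create(logicalServicePath + "/member-",
                physicalServiceId, Ids.OPEN_ACL_UNSAFE,
                CreateMode.EPHEMERAL_SEQUENTIAL);

        final String myName = created.substring(created.lastIndexOf('/') + 1);

        // Read all outstanding claims and order them by sequence number
        // (lexicographic order works because the suffix is zero-padded).
        final List<String> claims = new ArrayList<String>(
                zk.getChildren(logicalServicePath, false));
        Collections.sort(claims);

        if (claims.indexOf(myName) < k) {
            // We are among the first k claims: we are bound.
            return true;
        }

        // Too many claimants: withdraw so the logical service never has more
        // than k bound members.  The caller can retry later (e.g., when a
        // bound member's session expires) or try a different logical service.
        zk.delete(created, -1);
        return false;
    }
}
```

Because the member znodes are ephemeral, a bound member that dies frees its slot automatically, which is where the hot-spare question Bryan mentions would come back in.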
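Fred's alternative (discover the service's proxy with the River service discovery manager, then ask the service itself for any state) would look roughly like the following. Again a minimal sketch under assumptions: Jini/River 2.x on the classpath, multicast discovery, and a caller that knows which remote interface to ask for; the class name and timeout are made up for the example and none of this is bigdata's actual client code.

```java
// Minimal sketch of "discover, then go direct to the service", assuming
// Jini/River 2.x.  Error handling and security policy configuration are
// elided for brevity.
import net.jini.core.lookup.ServiceItem;
import net.jini.core.lookup.ServiceTemplate;
import net.jini.discovery.LookupDiscovery;
import net.jini.lease.LeaseRenewalManager;
import net.jini.lookup.ServiceDiscoveryManager;

public class DirectServiceLookup {

    /**
     * Discover a service proxy by its remote interface and hand it back so
     * the caller can ask the service directly for its state, rather than
     * reading a copy of that state out of zookeeper.
     */
    public static Object discoverProxy(final Class<?> serviceInterface)
            throws Exception {

        // Jini downloads proxy code, so a security manager is normally
        // required before discovery will work.
        if (System.getSecurityManager() == null) {
            System.setSecurityManager(new SecurityManager());
        }

        // Multicast discovery of lookup services in all groups.
        final LookupDiscovery discovery =
                new LookupDiscovery(LookupDiscovery.ALL_GROUPS);
        final ServiceDiscoveryManager sdm = new ServiceDiscoveryManager(
                discovery, new LeaseRenewalManager());
        try {
            // Match any registered service implementing the given interface.
            final ServiceTemplate template = new ServiceTemplate(null,
                    new Class<?>[] { serviceInterface }, null);

            // Block up to 10s for a matching proxy to show up.
            final ServiceItem item = sdm.lookup(template, null, 10 * 1000L);
            if (item == null) {
                throw new IllegalStateException("Service not discovered: "
                        + serviceInterface.getName());
            }

            // The proxy itself is the documented, testable interface Fred
            // refers to; the caller casts it and makes the remote call.
            return item.service;
        } finally {
            sdm.terminate();
            discovery.terminate();
        }
    }
}
```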