Re: [Bigdata-developers] Why zookeeper?


Fred,

> The term 
> "logical service" seems overloaded in that (as far as I have 
> figured out) it has different meanings in the pre-HA and 
> post-HA discussions.  

It is the same usage.  The pre-HA code base was designed with the HA feature set in mind.  A logical service corresponds to some collection of actual service instances which provide the highly available logical service.
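
As a rough illustration only (a minimal sketch, not code from the bigdata code base; the class names `ServiceInstance` and `LogicalService`, the `alive` flag, and the availability rule "at least one live instance" are all assumptions made for the example), the relationship between a logical service and its actual service instances could be modeled like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; these are NOT bigdata classes.

/** One actual (physical) service instance running on some host. */
final class ServiceInstance {
    final String host;
    boolean alive = true; // assumed liveness flag, set by some monitor

    ServiceInstance(String host) {
        this.host = host;
    }
}

/** A logical service: the collection of instances that together provide it. */
final class LogicalService {
    final String name;
    private final List<ServiceInstance> instances = new ArrayList<>();

    LogicalService(String name) {
        this.name = name;
    }

    /** Associate an actual service instance with this logical service. */
    void join(ServiceInstance instance) {
        instances.add(instance);
    }

    /** Assumed rule: available while at least one member instance is alive. */
    boolean isAvailable() {
        for (ServiceInstance i : instances) {
            if (i.alive) {
                return true;
            }
        }
        return false;
    }

    int size() {
        return instances.size();
    }
}
```

With a single member instance this degenerates to the pre-HA case, which is the sense in which the usage is the same before and after HA.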

I get that you are not fond of the rules-based scheme.  What I would like to know is how HA will be handled within the scheme that you are proposing.

Thanks,
Bryan

> -----Original Message-----
> From: Fred Oliver [mailto:fko...@gm...] 
> Sent: Monday, July 26, 2010 6:05 PM
> To: Bryan Thompson
> Cc: Bigdata Developers
> Subject: Re: [Bigdata-developers] Why zookeeper?
> 
> On Mon, Jul 26, 2010 at 2:37 PM, Bryan Thompson 
> <br...@sy...> wrote:
> > I've been out for a bit with my head wrapped around other 
> things.  Can you remind me how we are going to handle the 
> assignment of physical service nodes to logical service nodes 
> in this design?
> 
> What is a node (physical or logical) in this context? I think 
> you mean that a physical node is a machine. If so, then what 
> is a logical node?
> 
> As I understand the services, all service instances are 
> physical. The logical service construct exists only as an 
> abstraction on which the rules in the rules based 
> specification may operate. If I understand correctly, then 
> with the instance level specification, there are no logical 
> services and no logical nodes. But still, what's a logical node?
> 
> > Concerning your points below, either scheme can be made 
> fully deterministic.  It is only a matter of specifying that 
> a specific service must run on a specific host (a constraint 
> on what services can run on a given host).
> 
> If you can make the rules based scheme deterministic, then please do!
> But I think you meant only that you can write rules that 
> constrain the behavior such that the results in those 
> particular cases are deterministic, which is an entirely 
> different matter. The rules based scheme makes for much more 
> code, locking and synchronization, very difficult testing, 
> and a less maintainable environment.
> 
> > I see rules based specification as more flexible because 
> you can administer the rule set rather than making a decision 
> for each node that you bring online.  I agree that it is more 
> adaptive since the constraints are being specified at a level 
> above the individual node. I see the rules as globally 
> transparent because they are just data which could be edited 
> using, for example, a web browser backed by an application 
> looking at the data in zookeeper, whereas the instance level 
> specification must be edited on each node.  I think of rules 
> as more scalable because you do not have to figure out what 
> you are going to do with each node.  The node will be put to 
> a purpose for which it is fit and for which there is a need.
> 
> I think we're going to disagree about merits of instance vs. 
> rules schemes, and I hope we can modularize the system so 
> that the schemes are separate modules and independent of the 
> core functionality (which wouldn't need zookeeper).
> 
> My biggest concern about that last paragraph (or the whole 
> message?) is that this use of zookeeper seemed unnecessary 
> and confusing. That is, why wouldn't the web app interact 
> with the service instances directly to get/set configurations 
> using well defined, testable public interfaces, rather than 
> use zookeeper as a hub? (That's the secret messages in dead 
> drops thing.)
> 
> > However, as long as we have a reasonable path for HA 
> service allocation which respects the need to associate 
> specific physical service instances with specific logical 
> service instances then it seems reasonable that either 
> approach could be used.  It just becomes a matter of how we 
> describe what services the node will start and whether or not 
> we run the ServicesManagerService on that node.
> 
> Clearly HA needs a set of like service instances to work in 
> active/active or active/passive arrangements. The term 
> "logical service" seems overloaded in that (as far as I have 
> figured out) it has different meanings in the pre-HA and 
> post-HA discussions. I can see that an HA logical data 
> service would refer to the group of data service instances 
> which together host a single shard. But this definition is 
> very specific and differs from the more general meaning in 
> the rules based specification discussion, which is confusing.
> 
> The instance-based scheme can be used for HA as well as long 
> as the service configurations are extended to indicate which 
> "HA logical" group a service belongs to.
> 
> Fred
> 
> > PS: Concerning "flex", the big leverage for flexing the 
> cluster will come with a shared disk architecture (rather 
> than the present shared nothing architecture).  Using a 
> shared disk architecture, the nodes can then be started or 
> stopped without regard to the persistent state, which would 
> be on managed storage.  That would make it possible to 
> trade off dynamically which nodes were assigned to which 
> application, where the application might be bigdata, hadoop, 
> etc.  In that kind of scenario I find it difficult to imagine 
> that an operator will be in the loop when a node is torn down 
> and then repurposed to a different application.  However, 
> this kind of "flex" is outside the scope of the current effort.
> 
> OK. I see the primary benefit of this arrangement as making 
> hot spares become operational much more quickly, but I don't 
> see how this applies to the rules vs. instance based 
> specifications discussion. Both schemes can handle this arrangement.
> 
> Fred
> 
