Re: [bailey-developers] SF.net SVN: bailey: [17] trunk/src

Ning Li wrote:
> I thought only the master writes to Zookeeper. Hosts read from Zookeeper.
> We should still have a light-weight master, no? Which keeps track of
> the heartbeats from the hosts, detects host failures and responds
> accordingly, and decides state changes (add/remove nodes) when
> appropriate...

As a thought experiment, I'm trying to see whether, with Zookeeper, we 
actually need such a master.

We could replace heartbeats with an ephemeral file per node containing 
its status.  (Zookeeper deletes an ephemeral file when its owner's 
session ends, e.g., when the owner goes offline.)

Any host could (a) grab a lock; (b) analyze the ring for potential 
adds/removes; (c) post these requests to a directory.  In effect, getting 
the lock is master election, and while a node is doing this analysis, it 
is the master.  But the master moves around: each host has a "master" 
thread, and remains "master" only during one analysis/action cycle. 
This analysis cycle should not be very compute- or I/O-intensive.
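
To make the idea concrete, here is a rough sketch of one such cycle.  It 
is only an in-memory stand-in, not Zookeeper client code: the class 
FakeCoordinator, the function master_cycle, and the ("remove", host) 
request format are all names I made up for illustration.

```python
import threading


class FakeCoordinator:
    """In-memory stand-in for Zookeeper (hypothetical, not a real client API)."""

    def __init__(self):
        self.lock = threading.Lock()  # whoever holds this is "master"
        self.status = {}              # one ephemeral status entry per live host
        self.requests = []            # posted add/remove requests

    def register(self, host, status):
        self.status[host] = status

    def expire(self, host):
        # Models an ephemeral entry vanishing when its owner goes offline.
        self.status.pop(host, None)


def master_cycle(coord, expected_hosts):
    """Run one analysis/action cycle; return False if another host is master."""
    if not coord.lock.acquire(blocking=False):
        return False
    try:
        already_posted = {host for (_, host) in coord.requests}
        for host in sorted(expected_hosts):
            # A host with no status entry is presumed dead: request its removal.
            if host not in coord.status and host not in already_posted:
                coord.requests.append(("remove", host))
        return True
    finally:
        # Mastership ends with the cycle; any host may take the next one.
        coord.lock.release()
```

If hosts A, B and C register and B's entry then expires, a cycle run by 
any host posts a single ("remove", "B") request; a later cycle run by a 
different host sees that request and posts nothing new.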

The advantage of this is that we wouldn't have to explicitly test or 
otherwise engineer master failover, since it would be happening all the 
time.  All global state would be redundantly stored in Zookeeper.

Could this work?

> In this example, say B synced with A to A's entry 11 and
> with C to C's entry 12 (which includes adding doc X) before
> it went offline. Because A expunged its log, now A's log
> entry number starts from 21. B comes back online and
> finds A's 21 is newer than last time B synced with it (11).
> So B cannot recover but has to sync from scratch.

My concern was that A might try to sync from B and get stale data.  I 
think you're arguing that B should not go online and accept sync 
requests until it has itself synced with its neighbors.  In this case, 
incremental sync would fail and B would sync from scratch, removing any 
stale adds.
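
The check B would make can be sketched as below.  The function name 
sync_mode is hypothetical, and the exact boundary (incremental sync is 
fine as long as the peer still holds the entry right after the last one 
B saw) is my assumption; your example only covers the case where the 
peer's log starts well past that point.

```python
def sync_mode(last_synced_entry, peer_first_entry):
    """Decide how to catch up with a peer.

    last_synced_entry: last entry of the peer's log this node has applied.
    peer_first_entry:  oldest entry the peer still retains after expunging.
    """
    # Incremental sync needs every entry after last_synced_entry.  If the
    # peer has expunged some of them, there is a gap and we must reload.
    if peer_first_entry > last_synced_entry + 1:
        return "full"
    return "incremental"
```

With the numbers from your example, sync_mode(11, 21) gives "full", so 
B reloads from scratch.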

How does this work at system startup?  How does B know not to go online 
and provide its stale adds, while A is permitted to provide its data? 
Perhaps, at startup a node can respond to requests for its first log 
entry number, but refuse requests for the logs themselves.  Then B can 
see that it must reload, and refuse requests from A to sync until reload 
is complete.  Does that work?
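
Roughly, I'm imagining something like the following.  All names here 
(Node, get_first_entry, get_logs, start, finish_reload) are invented for 
the sketch, and it assumes log entries are numbered from 1 and that a 
gap in a neighbor's log forces a reload.

```python
class Node:
    def __init__(self, name, first_entry, last_synced):
        self.name = name
        self.first_entry = first_entry  # oldest log entry still retained
        self.last_synced = last_synced  # peer name -> last entry synced from it
        self.online = False             # logs are served only when True

    def get_first_entry(self):
        # Answered at any time, even during startup.
        return self.first_entry

    def get_logs(self, since):
        # Refused until this node knows its own data is not stale.
        if not self.online:
            raise RuntimeError(f"{self.name} is not serving logs yet")
        return f"logs of {self.name} after entry {since}"

    def needs_reload(self, peers):
        # A peer whose log now starts past what we last saw has expunged
        # entries we never received.
        return any(
            p.get_first_entry() > self.last_synced.get(p.name, 0) + 1
            for p in peers
        )

    def start(self, peers):
        if self.needs_reload(peers):
            return "reload"  # stay offline until finish_reload() is called
        self.online = True
        return "online"

    def finish_reload(self):
        self.online = True
```

Here, if A's log starts at 21 and B last synced with A at entry 11, 
B.start([A]) returns "reload" and B keeps refusing get_logs requests 
until finish_reload() is called, while A, seeing no gap in B's log, 
goes online normally.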

Doug