From: Ning L. <nin...@gm...> - 2008-04-17 21:34:46
On Thu, Apr 10, 2008 at 7:42 PM, Doug Cutting <cu...@ap...> wrote:

> So, with these two simple properties, lightweight master failover and
> ability to run the master on a host, means that we can choose to either
> dedicate a machine to the master, or run the master daemon on every
> host, switching frequently. I'd certainly like to preserve the latter
> option, so lets keep it in mind as we code.

Sounds good. I like the design. :)

> From A's perspective, when B comes online, B's data looks fine, since
> B's log is complete. But B discovers, when it talks to A, that B's data
> is obsolete. If A retrieves B's data before B discovers that it is
> obsolete, then A would get stale adds. So B must block retrieval of its
> data until it has determined whether it is valid to all of its
> neighbors. At system startup this means that all nodes must wait for
> their neighbors to come online so that they can determine whether their
> own state is valid before permitting any synchronization.
>
> Does that make any sense?

Yep. My point was also that a node needs to check its neighbors to decide validity.

> I think each node can determine that on its own at startup. It
>
> 1. Posts its range and log start number.
> 2. Waits a bit, so all other nodes have had a chance to post their data.
> 3. Decides if its index is valid, by checking all overlapping nodes' log
> start numbers & comparing them with the last sync'd log number to see if
> they've compacted their log.
> 4. Starts syncing with its neighbors.

Data in ZooKeeper is persistent, right? So ZooKeeper holds records of each node's range, its log start number, and the log start numbers of its neighbors. What does it mean if, at startup, a node's posted numbers are newer or older than those already in ZooKeeper? Can the data in ZooKeeper help during startup, or do we discard those records?
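The startup check in steps 1-4 above could be sketched roughly like this. This is a minimal illustration only, not an actual implementation: the `NodePost` type, flat-integer key ranges, and the `last_synced` map are all assumptions made up for the example.

```python
# Hypothetical sketch of the step-3 validity check: a node's index is
# stale if any overlapping peer has compacted its log past the point
# this node last synced to. All names here are illustrative.
from dataclasses import dataclass

@dataclass
class NodePost:
    """What each node posts at startup (step 1)."""
    node_id: str
    range_start: int   # inclusive start of the node's key range
    range_end: int     # exclusive end of the node's key range
    log_start: int     # oldest log entry still retained after compaction

def overlaps(a: NodePost, b: NodePost) -> bool:
    """Half-open interval overlap between two nodes' ranges."""
    return a.range_start < b.range_end and b.range_start < a.range_end

def index_is_valid(me: NodePost, posts: list, last_synced: dict) -> bool:
    """Step 3: valid only if every overlapping peer's retained log still
    covers everything past the last log number we synced from it."""
    for peer in posts:
        if peer.node_id == me.node_id or not overlaps(me, peer):
            continue
        # Peer compacted beyond our last sync point: we may have missed
        # adds, so our index must be considered stale.
        if peer.log_start > last_synced.get(peer.node_id, 0):
            return False
    return True
```

Step 2 (waiting for all posts to appear) and step 4 (syncing) are elided; the point is only that the decision needs nothing beyond the posted (range, log start) pairs and the node's own sync bookkeeping.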
> I worry that there might be pathological cases where different nodes
> were offline and/or compacted at different times, causing all replicas
> of a range to be discarded. :)

Hopefully this won't happen, because of replication. In addition, we only throw away and re-build a node's data when we know we can re-build its range from some other nodes, right?

Ning
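The "only discard when the range can be re-built from other nodes" condition mentioned above amounts to a coverage check: the union of the surviving replicas' ranges must cover the whole range being discarded. A toy sketch (half-open `(start, end)` integer intervals are an assumption for illustration):

```python
def range_is_recoverable(my_range, peer_ranges):
    """Return True if the union of peer (start, end) half-open ranges
    covers my_range entirely, so discarding my copy cannot lose data."""
    start, end = my_range
    covered = start
    for s, e in sorted(peer_ranges):
        if s > covered:
            break  # a gap no peer covers
        covered = max(covered, e)
        if covered >= end:
            return True
    return covered >= end
```

The pathological case Doug worries about is exactly when this check would fail for every replica at once, which is what the startup blocking protocol is meant to prevent.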