Re: [bailey-developers] lattice master

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

On Fri, Feb 29, 2008 at 11:23 AM, Ning Li <nin...@gm...> wrote:
> On Fri, Feb 29, 2008 at 9:08 AM, Yonik Seeley <yo...@ap...> wrote:
>  > On Thu, Feb 28, 2008 at 5:37 PM, Doug Cutting <cu...@ap...> wrote:
>  >  >  I'd imagined each node periodically querying its neighbors for changes
>  >  >  in the range they share.  We shouldn't rely on clock synchronization, so
>  >  >  each node would keep the last revision of each neighbor that it has
>  >  >  sync'd with.  So, the first time they connect, they pass revision zero
>  >  >  and receive all updates for their overlap.  The next time they only need
>  >  >  to retrieve updates since the last.
>  >
>  >  Sounds good!
>  >
>  >  We'll need to make sure that it handles this case:
>  >  Say C syncs from B, then B syncs from A and get some new docs.  Now A
>  >  goes away for a while... it seems like C will need to get the new docs
>  >  from B next time it syncs.  If the docs that B got from A are treated
>  >  like any other new additions, this should work right?
>
>  I think this is how it works with replication=3:
>  Node A serves range A-D, node B serves range B-E, node C serves range C-F...
>  Node C syncs with nodes A & B on range C-D, syncs with nodes B and D
>  on range D-E, and syncs with nodes D & E on range E-F.
>
>  Now node D goes away. Node C continues to serve range C-F until it gets
>  range F-G from node E or node F and starts to serve range C-G. Then,
>  node C syncs with nodes A & B on range C-D (same as before), syncs with
>  nodes B and E on range D-F, and syncs with nodes E & F on range F-G.

I see the following as a possible implementation:

Each node logs its changes in a light-weight log. A log is a sequence
of log entries, with the entry number monotonically increasing. Let's
say node X periodically syncs with node Y. A sync works as follows:
Node X starts with entry p in node Y's log where node X synced with
node Y last time. After node X finishes syncing (node X puts actual
changes into its own log), it sets its sync entry w/ node Y to entry q.

When node D goes away, node C copies range F-G from node E or
node F, let's say node E. For the copy, node E's own log is at entry s
and node E is synced with node F at node F's entry t. Node C does
the copy but does not log the copy. After node C finishes the copy
and before it switches from serving range C-F to serving range C-G,
it sets its sync entry with node E to min(its current sync entry with
node E, entry s) and sets its sync entry with node F to entry t.

Is this efficient enough?

Ning