From: Doug C. <cu...@ap...> - 2008-04-04 16:22:04
Ning Li wrote:
> Database as the application interface and RangedDatabase as the
> service-provider interface sounds good. The methods in RangedDatabase
> will be very similar to those in the current RangedDatabase?

Yes. HeapDatabase should extend this now though.

> I've been thinking about this. Finally, I think this is a possibility:
> 1 The database records and logs a deleted document and its version.
> 2 If all the replicas have recorded and logged the deleted document
>   with the same version number, the document can be removed from
>   the database. This is because any new versions of the document
>   that come after will have a larger version number.
>
> Does this sound right? Things are more complicated when we
> consider state changes...

It sounds right except for the case of a long-offline node coming back
online. More on that below...

> Do we allow a node to go offline for a long time? I thought we'd consider
> the node goes down and pick a replacement for it.

It would be nice to be able to eliminate long-offline nodes, but I don't
yet see how.

At startup we want nodes to announce their content to the master. Not
all nodes will start at exactly the same time. (Note also that, if the
master fails, then nodes will re-elect a new master and post their state
there. Search and indexing should continue uninterrupted through master
moves.) So, when a master first starts, it needs to avoid modifying the
ring for a time, until it can assume that all nodes are up. We might even
have nodes randomly delay their first report, so that the master isn't
overwhelmed.

If the network is partitioned, then the master would allocate new nodes
to underserviced regions. When the network is repaired, we have the
choice of ignoring the data on the nodes that were replaced, or
synchronizing it with what has transpired in their absence.
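For what it's worth, the delete-tracking rule from steps 1 and 2 above could be sketched roughly like this. This is only an illustrative sketch, not code from the project; all names (DeleteLog, record_delete, ack_delete, purgeable) are made up for the example. The idea is just that a tombstone keeps its version and the set of replicas that have acknowledged it, and it becomes safe to purge once every replica has logged that same version, since any later update to the document will carry a larger version number:

```python
class DeleteLog:
    """Hypothetical tombstone tracker for one region's replicas."""

    def __init__(self, replicas):
        self.replicas = set(replicas)   # replica ids serving this region
        self.tombstones = {}            # doc_id -> (version, set of acked replicas)

    def record_delete(self, doc_id, version):
        """Step 1: record and log a deleted document with its version."""
        self.tombstones[doc_id] = (version, set())

    def ack_delete(self, replica_id, doc_id, version):
        """A replica reports it has recorded the delete at this version."""
        entry = self.tombstones.get(doc_id)
        if entry is None or entry[0] != version:
            return False                # stale or unknown acknowledgement
        entry[1].add(replica_id)
        return True

    def purgeable(self, doc_id):
        """Step 2: the tombstone may be removed once all replicas have
        logged the delete with the same version number."""
        entry = self.tombstones.get(doc_id)
        return entry is not None and entry[1] >= self.replicas

    def purge(self, doc_id):
        if self.purgeable(doc_id):
            del self.tombstones[doc_id]
```

Note that this sketch deliberately ignores the long-offline-node case discussed below: a replica that was absent when the tombstone was purged would never see the delete, which is exactly the gap the rest of this message is about.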
In the case where all replicas of a region were offline, we would want
to use their data when they come back online (like the system-restart
case), but when only a single replica was offline we might simply ignore
its data and let it sync from scratch. However, it may not be easy to
distinguish these cases. If all replicas go offline and we then add new
nodes to the region, we'd need to remember that, at some point in the
past, all nodes in that region were offline. If the master was restarted
during this time, it will be even harder to keep track of this. I'm
still hopeful that we can come up with a heuristic for this, but I need
to think more about what it should be.

Doug