From: Ning L. <nin...@gm...> - 2008-04-03 20:40:07
On Wed, Apr 2, 2008 at 1:36 PM, Doug Cutting <cu...@ap...> wrote:
> First, I think we need to add an abstract Database service-provider
> interface, called perhaps RangeDatabase, that's different from Database,
> adding methods that will be critical to good performance that must be
> implemented by, e.g., HeapDatabase and LuceneDatabase.

Database as the application interface and RangedDatabase as the
service-provider interface sounds good. Will the methods in RangedDatabase
be very similar to those in the current RangedDatabase?

> Second, I don't yet see a way around checking versions when documents
> are added or deleted. The ugliest bit is that we have to keep track of
> the version of every document that's ever been deleted, in case a
> long-offline node comes online and reports a stale addition. That table
> could grow without bound. Sigh. Do you see a way around this?

I've been thinking about this, and I think this is a possibility:
1. The database records and logs a deleted document and its version.
2. Once all the replicas have recorded and logged the deleted document
   with the same version number, the document can be removed from the
   database. This is safe because any new version of the document that
   comes afterwards will have a larger version number.
Does this sound right? Things are more complicated when we consider
state changes...

> Perhaps a node could discard old deletions after a time, keeping track
> of the log entry number of the oldest retained deletion. Attempts to
> sync starting with an older entry number should be rejected and should
> trigger a complete copy-based replacement of the stale index. The
> hazard is that, if a document is added to a single node, then that node
> goes offline for a long time, then, when it comes online, the addition
> will be lost. Not great.

Do we allow a node to go offline for a long time? I thought we'd
consider the node down and pick a replacement for it.

Ning
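To make the two-step proposal above concrete, here is a rough sketch of how the table of deleted documents might be kept bounded. It is only an illustration: the class `TombstoneTable` and its methods are hypothetical names, not part of any existing code, and it assumes a fixed, known set of replicas. It does not handle the state changes (replicas joining or leaving) mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class TombstoneTable:
    """Hypothetical table of deletions, kept until every replica has logged them."""

    replicas: frozenset  # assumption: the replica set is fixed and known
    # doc_id -> (deleted version, set of replicas that have logged that deletion)
    _pending: dict = field(default_factory=dict)

    def record_delete(self, doc_id, version, replica):
        """Step 1: note that `replica` has recorded and logged the deletion."""
        if replica not in self.replicas:
            raise ValueError(f"unknown replica: {replica!r}")
        known = self._pending.get(doc_id)
        if known is None or version > known[0]:
            # A newer deletion supersedes acknowledgements of an older one.
            known = (version, set())
            self._pending[doc_id] = known
        if version == known[0]:
            known[1].add(replica)
        # Acknowledgements of an older version are ignored.

    def is_stale_add(self, doc_id, version):
        """True if an addition is no newer than a deletion still being tracked."""
        known = self._pending.get(doc_id)
        return known is not None and version <= known[0]

    def purge(self):
        """Step 2: drop deletions logged by all replicas with the same version."""
        done = [d for d, (_, acks) in self._pending.items() if acks >= self.replicas]
        for d in done:
            del self._pending[d]
        return sorted(done)


table = TombstoneTable(replicas=frozenset({"a", "b", "c"}))
table.record_delete("doc1", 7, "a")
table.record_delete("doc1", 7, "b")
table.purge()                        # "c" has not logged it yet, so it is kept
table.record_delete("doc1", 7, "c")
table.purge()                        # now "doc1" can be forgotten
```

An entry is retained only while some replica has yet to log the deletion, so the table's size depends on how far the slowest replica lags rather than on the total number of deletions ever made.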