Re: [bailey-developers] SF.net SVN: bailey: [17] trunk/src

On Wed, Apr 2, 2008 at 11:49 AM, Doug Cutting <cu...@ap...> wrote:
> Ning Li wrote:
>  > On Mon, Mar 31, 2008 at 6:03 PM, Doug Cutting <cu...@ap...> wrote:
>  >>  The naive approach is simply to db.getDoc() and check the version, no?
>  >
>  > Yes. The current implementation does that. I was worried about
>  > its performance because I thought db.getDoc() means retrieving
>  > a full document. But you mentioned the semantics of retrieving
>  > the outline of a document. That would work. It's time to have
>  > db.getDoc(id) and db.getDoc(id, fields)?
>
>  Maybe we need 'getVersion()' for this, that returns a special value for
>  deleted documents?
>
>
>  > Do we index a delete?
>
>  No, but we need to log them, and, when replaying logs decide whether to
>  execute them.  If you delete a doc on one node and subsequently add it
>  on another then, when the deletion is propagated to the former it should
>  be ignored.  But if you just delete it then the propagated delete should
>  be executed.  We distinguish the cases by comparing the version in the
>  deletion log event to the version in the index -- just like for adds.

Yes, we have to log and propagate deletes correctly.
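
To make the rule concrete, here is a minimal sketch of the replay decision. It is not Bailey code: VersionedIndex, replay and contains are hypothetical names, and it assumes that a delete event carries its own version, higher than the version of the document it deletes, and that the index remembers that version as a tombstone. Under those assumptions adds and deletes go through the same comparison.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for an index; not part of Bailey.
final class VersionedIndex {
    // Latest version seen per id, including versions of deletes (tombstones).
    private final Map<String, Long> latestVersion = new HashMap<>();
    // Ids that currently have a live document.
    private final Set<String> live = new HashSet<>();

    // Replays one log event. Returns true if it was executed,
    // false if it was ignored because the index already holds
    // an equal or newer version for this id.
    boolean replay(String id, long eventVersion, boolean isDelete) {
        Long current = latestVersion.get(id);
        if (current != null && eventVersion <= current) {
            return false; // stale event: ignore it
        }
        latestVersion.put(id, eventVersion);
        if (isDelete) {
            live.remove(id);
        } else {
            live.add(id);
        }
        return true;
    }

    boolean contains(String id) {
        return live.contains(id);
    }
}
```

With this, a delete at version 2 that arrives after a re-add at version 3 is ignored, while a delete at version 2 against a document at version 1 is executed.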

What I'm worried about is the impact of the version check on index
build performance. As you said, for general synchronization, we always
need to check versions. After we check a database/log and decide to
add/delete a document, we call Database's addDoc/removeDoc method.
In that method, we first parse the document (addDoc only). Then, in the
same critical section, we have to check again that it is the latest
version before applying the add/delete. Is checking the log for the
version of a delete expensive here? Also, a log is not part of the
Database abstraction, but part of RangedDatabase...
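
A rough sketch of the shape I have in mind, in case it helps the discussion. Everything here is made up for illustration (SketchDatabase, isStale, hasDoc, the whitespace "parser") and it is not the current Database interface. It assumes parsing can happen outside the lock, and that the database itself keeps an id-to-version table that also records versions of deletes, so that the re-check is a map lookup and does not touch the log.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; not the Bailey Database class.
final class SketchDatabase {
    private final Object lock = new Object();
    // Latest version per id; entries for deleted ids act as tombstones.
    private final Map<String, Long> versions = new HashMap<>();
    // Live documents: id -> parsed tokens.
    private final Map<String, String[]> docs = new HashMap<>();

    boolean addDoc(String id, long version, String rawText) {
        // Stand-in for real parsing, done before entering the critical section.
        String[] tokens = rawText.trim().split("\\s+");
        synchronized (lock) {
            if (isStale(id, version)) {
                return false; // a newer add or delete already won
            }
            versions.put(id, version);
            docs.put(id, tokens);
            return true;
        }
    }

    boolean removeDoc(String id, long version) {
        synchronized (lock) {
            if (isStale(id, version)) {
                return false;
            }
            versions.put(id, version); // keep the delete's version as a tombstone
            docs.remove(id);
            return true;
        }
    }

    boolean hasDoc(String id) {
        synchronized (lock) {
            return docs.containsKey(id);
        }
    }

    // Caller must hold the lock.
    private boolean isStale(String id, long version) {
        Long current = versions.get(id);
        return current != null && version <= current;
    }
}
```

The cost is that the version table never shrinks unless tombstones are eventually purged, which this sketch does not attempt.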

Ning