From: Ning L. <nin...@gm...> - 2008-05-02 21:03:50
On Fri, May 2, 2008 at 4:37 PM, Yonik Seeley <yo...@ap...> wrote:
> On Fri, May 2, 2008 at 4:12 PM, Ning Li <nin...@gm...> wrote:
> > We need a W-way write to provide fault tolerance for the write.
> > Maybe we can return a flag indicating <W nodes did the write
> > and let an application decide whether it wants to redo the write?
>
> Returning the number of servers actually written sounds like the right approach.
> It does seem hard to try and cancel the write.

Because of the version number, a write is idempotent. So we don't
need to cancel the write, right?

> > > > - Last but not the least, a possible performance impact:
> > > > a node can receive the same write request from several
> > > > different nodes around the same time.
> > >
> > > It seems like this should be rare. If it is rare, we shouldn't do any
> > > extra work to handle it.
> >
> > I'm not sure if this is rare. I will run a test when we experiment
> > with a more realistic workload.
>
> Do you mean the exact same client write request, or different client
> write requests that happen to be for the same document? What would be
> the scenario that would trigger the first?

The exact same client write request. This can arise because of log
propagation. For example, a document is in the range served by nodes
A, B, C and D (W = 3, N = 4). Now let's say A and B finish writing and
logging the document. C is still processing the document. C also syncs
the log entries from A and B and sees the same write via the log
propagation. Because C hasn't finished writing the document, it goes
ahead and retrieves the document from A and/or B. This is the wasted
work I was worried about.

I'm not sure if this is rare. I think it depends on how parallel are
the writes and log syncs. It also depends on how long the document
parsing/processing takes.

Cheers,
Ning
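
The two points in this message (a versioned write can safely be redone, and a log sync can trigger a redundant fetch) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the project's actual code: the `Node` class, its `write` and `sync_log_entry` methods, the `replicated_write` helper, and the idea that versions are integers that only increase per document.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical replica storing doc_id -> (version, body)."""
    name: str
    up: bool = True
    docs: dict = field(default_factory=dict)
    fetches: int = 0  # document retrievals triggered by log sync

    def write(self, doc_id, version, body):
        """Apply a versioned write; return True if the node acknowledged it."""
        if not self.up:
            return False
        current = self.docs.get(doc_id)
        # Idempotence: a write at an already-stored (or older) version
        # changes nothing, so redoing it is harmless.
        if current is None or current[0] < version:
            self.docs[doc_id] = (version, body)
        return True

    def sync_log_entry(self, doc_id, version, peer):
        """React to a log entry (doc_id, version) learned from a peer."""
        current = self.docs.get(doc_id)
        if current is not None and current[0] >= version:
            return  # already stored locally, nothing to retrieve
        # The local write has not finished yet, so the document is
        # retrieved from the peer: the potentially wasted work.
        self.fetches += 1
        _, body = peer.docs[doc_id]
        self.write(doc_id, version, body)


def replicated_write(nodes, doc_id, version, body):
    """Attempt the write on every node; return how many acknowledged it."""
    return sum(node.write(doc_id, version, body) for node in nodes)


# N = 4, W = 3, with C and D temporarily unavailable.
W = 3
a, b, c, d = Node("A"), Node("B"), Node("C", up=False), Node("D", up=False)
nodes = [a, b, c, d]

acked = replicated_write(nodes, "doc1", 7, "hello")
if acked < W:
    # The application decides to redo the write; no cancel is needed
    # because A and B treat the repeated write as a no-op.
    c.up = True
    acked = replicated_write(nodes, "doc1", 7, "hello")

# Log propagation case: a node that is still processing the write sees
# the log entry from A first and retrieves the document from A.
slow = Node("C2")
slow.sync_log_entry("doc1", 7, a)
slow.write("doc1", 7, "hello")  # the original request then lands as a no-op
```

In this sketch the application only needs the acknowledgement count to decide whether to redo the write, and `slow.fetches` counts the redundant retrievals that the log sync race would cause.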