|
From: Ning L. <nin...@gm...> - 2008-04-10 17:11:17
|
On Tue, Apr 8, 2008 at 5:39 PM, Doug Cutting <cu...@ap...> wrote: > > Halfen a hot range? More some more sophisticated algorithm. > > We can still have nodes evenly around the ring at the start. > > The partitioning algorithm doesn't change. Only the replication > > algorithm is different. > > But we'll need to dynamically balance things, since load may be uneven > around the ring, especially with user-provided ring positions. And once > we permit splitting, don't we have to handle overlapping ranges? With non-overlapping replication, if we decide to split a range, we split all the replicas of the range. Same as with overlapping replication, we use the master to coordinate the process. > > What do you mean by "without permitting it"? > > Just that, if we split a node in a disjoint partitioning then we'll have > overlapping ranges during the split. Switching all clients to search > the new half-sized nodes won't be atomic. So I'm not sure that disjoint > ranges really simplifies things fundamentally. Let's say the replication is 2 and node A and node B are the two replicas of range [0, 4). Now we decide to split [0, 4) into two: [0, 2) and [2, 4). After the split, node A and node B will be the replicas of [0, 2) and node C and node D will be created to be the replicas of [2, 4). The split process works as follows: 1 Master tells some hosts to create node C and node D of [2, 4) from node A or node B 2 C and D tell the master they are ready 3 Master changes the map to A and B serving [0, 2) and C and D serving [2, 4) 4 Master tells A and B to stop serving [2, 4) and only serve [0, 2) 5 A and B tell the master they are done Care should be taken in synchronizing logs in the process. Here, it doesn't matter if at one point in time, node X serves R1, node Y serves R2, and R1 and R2 overlap. What does matter is that there is no overlap in the map which the master maintains. And because clients get the map from the master, clients do not see any overlap in the map, either. What do you think? > > I think this is a fair comparison so the search performance should > > be comparable. What do you think? > > Good point. > > One other difference is that, with overlapping ranges, when a node > fails, its load is spread more evenly over the remaining nodes. In this > example, the load of any single node would become shared by two other > nodes rather than one, so each node only needs 50% headroom rather than > 100%. Yes, that's the advantage of the overlapping replication scheme. I'm coding the add-node process with the overlapping replication right now. It's taking some time. After it's finished, we can review the code and have a complete picture of the complexity and the performance implications. Ning |