From: Doug C. <cu...@ap...> - 2008-02-28 23:42:48
Ning Li wrote:
>> > 2) Index on a virtual node (suggest a name?): A virtual node
>> > serves a number of continuous shards. For example, with
>> > 3-way replication, the indexes on the virtual nodes are:
>> > AB-BC-CD, BC-CD-DE, CD-DE-EA, DE-EA-AB, EA-AB-BC.
>
> If AB is called a "range", then "AB-BC-CD" is a "node-range"?
> Feel free to give better names. :)

If the rule is to index things in the three clockwise nodes, then the range of node D is (B-D], since it includes all documents after B and through D in its index.

If node B crashes, D's range would become (A-D]. The master would tell it to update its index to reflect that new range, and it would notify the master when that update is complete. It would probably keep an IndexReader open on the (B-D] version of its index so that it could keep servicing queries from clients while it retrieves and indexes the documents in (A-B], and for a time after that, until all clients have retrieved the new ring map from the master.

So a node may be serving multiple ranges at a time for search, and its index may contain multiple ranges. If no documents are added for a time, then things should stabilize and search ranges should map 1:1 to nodes and indexes.

The node is the process, the index is its datastructure. Does that hold together?

Doug
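
The range bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the proposal: the function names and the `reach` parameter are hypothetical, the ring is modeled as a plain list of node names in clockwise order, and `reach=2` is chosen because it reproduces the (B-D] range given for node D. Each range is a `(start, end)` pair read as the half-open interval (start-end].

```python
def node_ranges(ring, reach=2):
    """Map each node to its (start, end] range on the ring.

    ring  -- node names in clockwise order
    reach -- how many nodes back a range starts (hypothetical parameter;
             2 reproduces the (B-D] example for node D)
    """
    n = len(ring)
    return {node: (ring[(i - reach) % n], node) for i, node in enumerate(ring)}


def ranges_after_crash(ring, crashed, reach=2):
    """Drop the crashed node and recompute every survivor's range."""
    survivors = [node for node in ring if node != crashed]
    return survivors, node_ranges(survivors, reach)


ring = ["A", "B", "C", "D", "E"]
before = node_ranges(ring)
survivors, after = ranges_after_crash(ring, "B")

# For each survivor whose range grew, the documents it still has to
# retrieve and index lie in (new start - old start].
to_fetch = {
    node: (after[node][0], before[node][0])
    for node in survivors
    if after[node] != before[node]
}

for node, (start, end) in to_fetch.items():
    print(f"{node}: serve {before[node]} until {after[node]} is built; "
          f"fetch ({start}-{end}]")
```

Under these assumptions D goes from (B-D] to (A-D] and has to fetch (A-B], as in the message. The same rule also widens C's range, from (A-C] to (E-C], which the message does not discuss.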