From: Ning L. <nin...@gm...> - 2008-02-27 20:58:14
On Tue, Feb 26, 2008 at 1:21 PM, Yonik Seeley <yo...@ap...> wrote:
> I think one problem we are running into is due to using the same hash
> ring for both partitioning the key space and selecting placement of
> those keys. Putting many virtual nodes on the hash ring to make the
> hash function fair, in conjunction with using that hash ring to do
> replication, causes all the indices to be mixed together.
What we have discussed addresses both partitioning and replication:
1 We use consistent hashing to partition keys into ranges.
We may call a range a "shard". The important point is that
there is a one-to-one mapping between ranges and virtual nodes.
2 We described a replication scheme in which a virtual node
replicates its range/shard onto N-1 other virtual nodes when
the replication level is set to N (a rough code sketch of the
ring follows below).
Your proposal covers how to partition keys into ranges/shards,
but how ranges/shards map to virtual nodes still needs to be
worked out.
I think what we have discussed is a reasonable scheme. But
I'm always open to new ideas. :)
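
To make points 1 and 2 concrete, here is a minimal Java sketch
of such a ring. HashRing, replicasFor, and the MD5-based hash are
illustrative names and choices of mine, not anything from an
existing codebase; in particular, skipping virtual nodes that land
on an already-chosen physical node is my assumption about how the
N-1 replicas are picked.

  import java.nio.charset.StandardCharsets;
  import java.security.MessageDigest;
  import java.security.NoSuchAlgorithmException;
  import java.util.ArrayList;
  import java.util.List;
  import java.util.TreeMap;

  // Minimal consistent-hash ring: each physical node owns several
  // virtual nodes; a key's shard is the first virtual node at or
  // after the key's hash, and replicas are the next distinct
  // physical nodes clockwise around the ring.
  class HashRing {
      private final TreeMap<Long, String> ring = new TreeMap<>(); // point -> physical node

      void addNode(String node, int virtualNodes) {
          for (int i = 0; i < virtualNodes; i++)
              ring.put(hash(node + "#" + i), node); // one ring point per virtual node
      }

      // The N physical nodes responsible for a key: the owner of the
      // range covering hash(key), then the next N-1 distinct nodes
      // clockwise (wrapping around).
      List<String> replicasFor(String key, int n) {
          List<String> owners = new ArrayList<>();
          addDistinct(owners, ring.tailMap(hash(key)).values(), n);
          addDistinct(owners, ring.values(), n); // wrap around the ring
          return owners;
      }

      private static void addDistinct(List<String> owners, Iterable<String> nodes, int n) {
          for (String node : nodes) {
              if (owners.size() == n) return;
              if (!owners.contains(node)) owners.add(node); // skip same physical node
          }
      }

      private static long hash(String s) {
          try { // first 8 bytes of MD5 as a long; any uniform hash works
              byte[] d = MessageDigest.getInstance("MD5")
                                      .digest(s.getBytes(StandardCharsets.UTF_8));
              long h = 0;
              for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xffL);
              return h;
          } catch (NoSuchAlgorithmException e) { throw new AssertionError(e); }
      }
  }

For example, after addNode("nodeA", 100) and the same for nodeB
and nodeC, replicasFor("doc42", 2) returns the primary physical
node for doc42's shard plus one distinct replica.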
At the same time, should we also discuss what the update
model should be:
1 One updatable replica vs. all updatable replicas. The former
is simple. The latter is powerful. Is there sufficient need for
the latter to justify its complexity?
2 The atomicity of an insert/delete/update operation. When
such an operation is reported as done, does it mean (the
levels are sketched as code after this list):
1) the new doc is indexed in the memory of the node
2) the new doc is indexed on the local disk of the node
3) the new doc is logged on the local disk of the node
4) the new doc is logged in some fault-tolerant shared FS
(e.g. HDFS)
5) the new doc is indexed in the memory of at least X
nodes
The probability of the operation being lost decreases from
1) (highest), through 2) and 3), down to 4) and 5) (lowest).
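
To illustrate point 2, those five completion levels could be
expressed as an acknowledgement parameter on the update call. The
enum below is purely a hypothetical sketch; none of these names
come from an existing API.

  // Hypothetical acknowledgement levels for an insert/delete/update,
  // roughly ordered from weakest to strongest durability guarantee.
  // Names are illustrative only.
  enum AckLevel {
      INDEXED_IN_MEMORY,     // 1) in the RAM buffer of one node; lost on crash
      INDEXED_ON_LOCAL_DISK, // 2) committed to the node's local index
      LOGGED_ON_LOCAL_DISK,  // 3) appended to a local write-ahead log
      LOGGED_ON_SHARED_FS,   // 4) logged to a fault-tolerant FS such as HDFS
      INDEXED_ON_X_NODES     // 5) indexed in memory on at least X nodes
  }

A client might then call something like
update(doc, AckLevel.LOGGED_ON_SHARED_FS) to trade latency for
durability.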
Regards,
Ning