From: Ning L. <nin...@gm...> - 2008-03-12 18:31:07
On Tue, Mar 11, 2008 at 9:12 AM, Yonik Seeley <yo...@ap...> wrote:
> More about what rebalancing means too... when rebalancing, can you
> leave all the nodes in place (the replication configuration) and just
> change what keys map to a node?

We have two ways to achieve load balancing. I only explicitly described
one of them in the design: partitioning. Partitioning divides the ring
into ranges, a.k.a. nodes if we ignore replication for a moment. The
other way is the mapping from nodes to hosts. A rebalance can be
achieved by changing the partitioning (adding/removing a node) and/or
by changing the mapping from nodes to hosts.

On Wed, Mar 12, 2008 at 9:22 AM, Yonik Seeley <yo...@ap...> wrote:
> On Tue, Mar 11, 2008 at 5:49 PM, Doug Cutting <cu...@ap...> wrote:
> > > An example application can be an online email system.
> > > The keys of a user's emails are prefixed by the user name,
> > > so a user's emails are located together on the ring. When
> > > a user searches his/her emails, the query is only sent to
> > > servers which cover that range, instead of the entire ring.
> >
> > We could just define doc ids as 128-bit numbers rather than strings.
> > Then user-provided hash values wouldn't be a special case. A
> > constructor could convert string ids to 128-bit ids, and also store the
> > original string in a field named "id".
>
> Allowing the user to specify a hash value doesn't seem so different
> from allowing them to specify a numeric id... it's just 32 bits vs 128
> bits. Ning's use case doesn't seem to require collision-free hashes.

No matter what we decide to use, the emphasis is that the documents are
not uniformly distributed on the ring. Therefore, when we (re)partition,
the goal is not that the partitioned ranges cover about the same span of
the key space, but that the documents in each range add up to about the
same total size.

Ning
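
The balancing goal described above (equal document load per range rather than equal key span) can be illustrated with a minimal sketch. It is not taken from the design under discussion: the function names `balanced_boundaries` and `range_loads`, the `(key, size)` input format, and the treatment of the ring as a linear sorted key space (no wrap-around, no replication) are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def balanced_boundaries(docs, num_ranges):
    """Pick split keys so each range holds roughly equal total doc size.

    docs: list of (key, size) pairs (hypothetical input format).
    Returns num_ranges - 1 split keys; range i covers the keys k with
    boundaries[i - 1] <= k < boundaries[i].
    Assumption: the ring is treated as a linear, sorted key space.
    """
    docs = sorted(docs)
    cumulative = list(accumulate(size for _, size in docs))
    total = cumulative[-1]
    boundaries = []
    for i in range(1, num_ranges):
        target = total * i / num_ranges
        # Index of the first document at which the running size reaches the target.
        idx = bisect_left(cumulative, target)
        # The next range starts at the following key (clamped to the last key).
        boundaries.append(docs[min(idx + 1, len(docs) - 1)][0])
    return boundaries


def range_loads(docs, boundaries):
    """Total document size that falls into each range."""
    loads = [0] * (len(boundaries) + 1)
    for key, size in docs:
        # bisect_right sends a key equal to a boundary into the range it opens.
        loads[bisect_right(boundaries, key)] += size
    return loads
```

For example, with eight equally sized documents of which four carry the prefix `alice/`, splitting into two ranges places the boundary at the first non-`alice/` key, so one user's emails fill a whole range while four other users share the second. With heavily skewed data this greedy cut can emit duplicate boundaries; a real implementation would need to handle that case.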