From: Ning L. <nin...@gm...> - 2008-03-10 15:38:54
On Fri, Mar 7, 2008 at 8:36 PM, Yonik Seeley <yo...@ap...> wrote:
> On Thu, Mar 6, 2008 at 5:47 PM, Ning Li <nin...@gm...> wrote:
> >
> > 5 A document database?
> > - We store documents anyway.
> > - We don't support sub-document updates.
>
> Field updates? We could if we store all the fields. Solr has a patch
> for this, but it might be more efficient to implement in Lucene. It
> requires being able to get the *latest* stored fields for a doc, even
> if they are uncommitted.

Let's not worry about performance for now. As you pointed out, if we
update one stored field for a doc, we have to figure out the "latest"
of all the other stored fields for the doc - but it's impossible
because of distributed update and eventual consistency. Well, we can
keep a revision number for each stored field, but...

> > Here are a few comments on the features:
> > 1 Consistent hashing uses hash values because hash values
> > distribute uniformly on the ring. Can we support
> > application-specified keys for the ring?
>
> Seems like we could allow the user to specify their own hash value.
> What's the usecase here?

An example application can be an online email system. The keys of a
user's emails are prefixed by the user name, so a user's emails are
located together on the ring. When a user searches his/her emails,
the query is only sent to servers which cover that range, instead of
the entire ring.

> > The difference
> > is that the distribution may not be uniform so we need
> > to rebalance sometimes (remove a virtual node and insert
> > it somewhere else).
>
> I'll refer back again to my comments on separating replication (the
> range of node X is replicated on nodes X-1 and X-2) from key
> partitioning (the range of node X is 0-1000 + 5000-6000 for example).
> One can change the key partitioning w/o touching the replication configuration.

I think your point is that we need re-balancing in any case?

> > 1 On the assumption that an application specifies document
> > version number. It greatly simplifies things, but is
> > it practical?
>
> I think so...
> If the application can't provide it, the server (or client proxy)
> could perhaps provide it via a timestamp.

It's hard for the servers to sync their clocks, so timestamp is not
reliable...

Ning
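
To make the email use case above concrete, here is a rough sketch of a
ring ordered by application-specified keys rather than hash values.
Everything in it is an assumption for illustration only: the KeyRing
class, the "user/message" key layout, the node names, and the use of
"\uffff" as an upper bound for a prefix (which assumes keys contain no
characters above U+FFFF). It is not code from Lucene or Solr.

```python
import bisect


class KeyRing:
    """Ring ordered by application keys instead of hash values (sketch)."""

    def __init__(self, tokens):
        # tokens: {start_key: node_name}. A node owns the keys from its
        # start key up to (not including) the next start key. Keys below
        # the smallest start key wrap around to the last node.
        self.starts = sorted(tokens)
        self.nodes = [tokens[s] for s in self.starts]

    def _index(self, key):
        # Position of the node whose range contains key; -1 means the key
        # sorts before every start key and wraps to the last node.
        return bisect.bisect_right(self.starts, key) - 1

    def node_for(self, key):
        # Negative index -1 selects the last node, which handles the wrap.
        return self.nodes[self._index(key)]

    def nodes_for_prefix(self, prefix):
        # All keys with this prefix fall in [prefix, prefix + "\uffff"],
        # so only the nodes between those two positions need the query.
        lo = self._index(prefix)
        hi = self._index(prefix + "\uffff")
        hit = [self.nodes[i] for i in range(lo, hi + 1)]
        # Drop duplicates (possible when the range wraps) but keep order.
        return list(dict.fromkeys(hit))


# Hypothetical layout: bob's mail happens to straddle two nodes.
ring = KeyRing({"a": "node1", "bob/m": "node2", "p": "node3"})
print(ring.node_for("alice/msg1"))
print(ring.nodes_for_prefix("alice/"))
print(ring.nodes_for_prefix("bob/"))
```

A search over one user's mail then goes only to the nodes returned by
nodes_for_prefix(), not to the whole ring. Since such keys are not
uniformly distributed, the start keys would have to be moved from time
to time, which is the rebalancing mentioned above.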