From: Joseph G. <jos...@or...> - 2012-07-06 07:32:15
On 6 July 2012 15:25, Michael Paquier <mic...@gm...> wrote:
>
> On Fri, Jul 6, 2012 at 2:17 PM, Ashutosh Bapat
> <ash...@en...> wrote:
>>
>> Hi Joseph,
>> I have come across this question about supporting a mixed distribution
>> strategy a few times by now.
>>
>> We have to judge its advantages (taking into consideration that there
>> can be solutions outside of core XC for the same) against the effort
>> required to implement and maintain it. If the pain of 1. using
>> third-party/outside-XC solutions and 2. implementing and maintaining it
>> in core is more or less the same, we may have to leave it out of the
>> core, at least for the near future. If we take option 2 and find that
>> using it is as painful as option 1, we have wasted our effort. To judge
>> the 2nd point, we can look at other DBMSs that offer these features and
>> how they perform from various aspects. So the following questions are
>> relevant: Is there another distributed database with a similar mixed
>> distribution scheme available? How (and how widely) is that feature
>> used in the field? What are the pain points in using such a feature?
>
> Good point here, indeed. Thanks for pointing that out.

Indeed, all excellent points. In terms of how difficult it is to integrate
into core versus using other middleware to achieve HA properties, I don't
think it's easy to come up with an answer (at least not one that isn't
highly opinionated). I spent a few days building an XC cluster with
streaming replication for each datanode, plus scripting failover events,
recovery and so on. The main issues I found were, effectively, a lack of
integration. Configuring each datanode with a different WAL archive store
and recovery command is very painful, and it is difficult to understand the
implications.
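To make the pain concrete, this is the kind of per-datanode configuration I
mean (paths and hostnames here are made up for illustration; this is a rough
sketch, not a working setup):

```
# datanode1 primary, postgresql.conf
archive_mode    = on
archive_command = 'cp %p /wal_archive/datanode1/%f'

# datanode1 standby, recovery.conf
standby_mode     = 'on'
primary_conninfo = 'host=dn1-primary port=5432'
restore_command  = 'cp /wal_archive/datanode1/%f %p'

# ...and the same again, with different paths and hosts,
# for every other datanode in the cluster
```

Multiply that by the number of datanodes, then add the failover scripting
that has to keep all of it consistent, and none of it is visible to XC
itself.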
I did make an attempt at fixing this with even more middleware
(pgpool + repmgr), but gave up after deciding that it's far too many moving
parts for a DBMS for me to consider using it. I just can't see how so many
pieces of completely disparate software can possibly know enough about the
state of the system to make reasonable decisions with my data, which leaves
me developing my own manager to control them all. Streaming replication is
also quite limited, as it only allows you to replicate entire nodes.

But enough opinion. Some facts from current DBMSs that use similar
replication strategies -- I say similar because none of them has quite the
same architecture as XC.

Cassandra[1] uses consistent hashing plus a replica count to achieve both
horizontal partitioning and replication for read/write scalability. This
poses some interesting challenges for them, mostly stemming from the
cluster size changing dynamically: maintaining the consistent hashing ring
and resilvering it. In my opinion this is made harder by the fact that
Cassandra uses cluster gossip without any node coordinator, along with its
eventual consistency guarantees.

Riak[2] also uses consistent hashing, but on a per-'bucket' basis, where
you can set a replication count. There are a bunch more too, like
LightCloud, Voldemort, DynamoDB, BigTable, HBase etc.

I appreciate these aren't RDBMS systems, but I don't believe that is a big
deal; it's perfectly viable to have a fully horizontally scaling RDBMS too,
it just doesn't exist yet. In fact, I think proper global transaction
management makes this considerably easier and more reliable. Eventual
consistency and the absence of an actual master node are not, I think, good
concessions to make. For the most part, having a global picture of the
state of all data is probably the biggest advantage of implementing this in
XC versus other solutions.
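For concreteness, here is a toy sketch of the consistent hashing + replica
count idea that Cassandra and Riak build on (Python; all names are
illustrative, and neither system actually implements it this way -- both
use virtual-node rings far more sophisticated than this):

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a key to a point on the ring (MD5, for illustration only)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with a per-request replica count."""

    def __init__(self, nodes, vnodes=8):
        # Each physical node owns several virtual points on the ring,
        # which smooths the key distribution between nodes.
        self._ring = sorted(
            (_hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]
        self._nodes = set(nodes)

    def replicas(self, key, n=2):
        """Return the n distinct nodes responsible for key, walking
        clockwise around the ring from the key's hash position."""
        if not self._ring:
            return []
        idx = bisect.bisect(self._points, _hash(key))
        out = []
        while len(out) < min(n, len(self._nodes)):
            node = self._ring[idx % len(self._ring)][1]
            if node not in out:
                out.append(node)
            idx += 1
        return out
```

The point of the scheme is that adding or removing one node only remaps the
keys adjacent to that node's ring positions, rather than reshuffling
everything -- which is exactly where the "resilvering" pain above comes
from when the cluster size changes.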
Other major advantages are:

a) Service impact from the loss of datanodes is minimized (non-existent in
   the case of losing only replicas); using middleware requires an
   orchestrated failover.
b) Time to recovery (in terms of read performance) is reduced considerably,
   because XC is able to implement a distributed recovery of out-of-date
   nodes.
c) Per-table replication management (XC already has this, but it would be
   even more valuable with composite partitioning).
d) Increased read performance, where replicas can be used to speed up
   read-heavy workloads and lessen the impact of read hotspots.
e) In-band heartbeats can be used to determine failover requirements; no
   scripting or other points of failure.
f) The components required to facilitate recovery could also be used to do
   online repartitioning (i.e. increasing the size of the cluster).
g) Probably the world's first real distributed RDBMS.

Obvious disadvantages are:

a) A lot of work; difficult, hard, etc. (This is actually the biggest
   barrier; there are lots of very difficult challenges in partitioning
   data.)
b) Making use of most of the features of such composite table partitioning
   is quite difficult; it would take a long time to optimize the query
   planner to make good use of them.

There are probably more, but revealing them would most probably require a
proper devil's advocate (I am human and set in my opinions, unfortunately).

> --
> Michael Paquier
> http://michael.otacoo.com

Joseph.

[1] - http://www.slideshare.net/benjaminblack/introduction-to-cassandra-replication-and-consistency
[2] - http://wiki.basho.com/Replication.html

--
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846