From: Nikhil S. <ni...@st...> - 2012-10-24 14:11:05
Hi Vladimir,

Thanks for your very frank perspectives on PGXC.

> > Sure, XC provides thanks to its architecture naturally transparency
> > and scalability.
>
> What does XC provides? My two rhetorical questions above imply answers
> "NO". Necessity to adapt application means cluster is not transparent.
>

Well, even the MySQL Cluster documentation states the following:

"While many standard MySQL schemas and applications can work using MySQL
Cluster, it is also true that unmodified applications and database schemas
may be slightly incompatible or have suboptimal performance when run using
MySQL Cluster"

So transparency might come at a cost in the case of MySQL Cluster as well.
In general, Postgres has always assumed that the user is intelligent and
will take the pains to understand the nuances of their use case and
configure the database accordingly. That is perhaps why even the stock
postgresql.conf configuration file is pretty conservative, and users tweak
it as per their requirements.

> Impossibility to extend cluster online means it is not scalable.
>

As you rightly mention below, this is indeed a "young" project, and IMHO
it's maturing along proper lines.

> More over, this two issues are interrelated, because You should rewrite
> "CREATE TABLE" statement every time you expand (read: recreate) Your
> cluster. But this issue looks much worse if node fails containing tables
> with different distributed schemas. This is uncontrollable model.
>
> > Load balancing can be provided between Coordinator and Datanodes
> > depending on applications, or at Coordinator level
>
> It should not depend on application, it should be an cluster's global
> function.
>
> > For HA, Koichi is currently working on some tools to provide that,
>
> Again: it should not be external tool, it should be internal, integral,
> essential feature.
>

Some people will say exactly the opposite. Why add something internal when
off-the-shelf technologies can do the job with minimal internal support?
For example, the Corosync/Pacemaker Linux-HA stack, perhaps along with some
of the tools that Suzuki-san has provided, can be combined with the
replication capabilities that Postgres (and PGXC) provide. In fact, with
synchronous replication in place, a properly architected solution using the
above packages can provide very good HA capabilities. There's already
redundancy for the coordinator nodes.

> > I am not sure you can that easily compare XC and mysql cluster,
> > both share the same architectures, but once of the main
>
> I don't know what there is "the same", but in functionality it is
> totally different. Mysql cluster has the precise and clear clustering
> model:
>
> 1. If some nodes fail cluster continues to work as soon as there remains
> at least one healthy node in every group.
>

As long as one coordinator node is up and reachable by applications, the XC
cluster continues to function. And as long as the datanodes are equipped
with replication, and an HA strategy is in place to handle a datanode going
down by failing over to a promoted standby, the cluster again continues to
function.

> 2. No "CREATE TABLE ... DISTRIBUTE BY ..." statement. You just define
> the number of replicas at configuration level. Yes, now there are only
> one option is available that make sense with two replicas, but it is
> enough.
>

This seems to be the fundamental difference between MySQL Cluster and PGXC.
Everything appears to be "replicated" in MySQL Cluster, and all nodes are
mirror images of each other. In PGXC, data can be partitioned across nodes
as well. That is why we give the user this flexibility via the DISTRIBUTE BY
clause.

> 3. Read and write scalability (i.e. LB) at the same time for all tables
> (i.e. on the cluster level).
>

AIUI, all MySQL nodes are images of each other. While that's good for
reads, it is not so good for writes, no?

> 4. You can add data node online, i.e. without restarting (not to mention
> "recreating" as for XC) cluster.
> Yes, only new data will go to the new
> node in this case. But You can totally redistribute it with restart.
>
> So it is full flagged cluster, that's not true for XC and it's a pity.
>

Online data node addition is currently a work in progress in XC.

> > differences coming to my mind is that XC is far more flexible in
> > terms of license (BSD and not GPL), and like PostgreSQL, no company
> > has the control of its code like mysql products which Oracle relies
>
> Yes, and this is why I am persuading all developers migrate to
> Postgresql. But it is off topic here where we are discussing
> functionality, but not an licence issues.
>
> Be tolerant to my criticism, I wouldn't say You made bad thing, I was
> amazing when first read "write-scalable, synchronous multi-master,
> transparent PostgreSQL cluster" in Your description that I completely
> and exactly copied into description of my debian package, but I was
> notably disappointed after my first test showing me that it is odd with
> reality. It would not be so bad itself, as soon as it is young project,
> but much worse that this discussion shows there are something wrong with
> Your priorities and fundamental approach.
>

Criticism is welcome. It helps us improve the product as we go along :)

Regards,
Nikhils
--
StormDB - http://www.stormdb.com
The Database Cloud
Postgres-XC Support and Service
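
P.S. To make the DISTRIBUTE BY point above concrete, here is a toy Python
sketch of why a hash-distributed table spreads writes across datanodes while
a replicated table sends every write to every datanode. This is not XC's
actual routing code: the node names, the CRC32 hash, and both helper
functions are assumptions made up for illustration. In XC itself the choice
is made per table, e.g. with DISTRIBUTE BY HASH (col) or DISTRIBUTE BY
REPLICATION.

```python
import zlib


def node_for_key(key, nodes):
    """Pick one datanode for a row by hashing its distribution key.

    CRC32 is only a stand-in hash for this sketch (assumption).
    """
    digest = zlib.crc32(str(key).encode("utf-8"))
    return nodes[digest % len(nodes)]


def write_targets(key, nodes, strategy):
    """Return the datanodes that a single-row write has to touch."""
    if strategy == "hash":
        # Partitioned table: exactly one datanode owns the row.
        return [node_for_key(key, nodes)]
    if strategy == "replication":
        # Replicated table: every datanode keeps a full copy.
        return list(nodes)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical three-datanode cluster and 1000 single-row writes.
nodes = ["dn1", "dn2", "dn3"]
hash_writes = [write_targets(k, nodes, "hash") for k in range(1000)]
repl_writes = [write_targets(k, nodes, "replication") for k in range(1000)]

# Count how many of the writes each datanode has to apply.
per_node_hash = {n: sum(n in t for t in hash_writes) for n in nodes}
per_node_repl = {n: sum(n in t for t in repl_writes) for n in nodes}

print("hash-distributed:", per_node_hash)
print("replicated:      ", per_node_repl)
```

In this toy model each hash-distributed write lands on a single datanode, so
the 1000 writes are divided among the nodes, whereas each replicated write
lands on all three. That is the trade-off the user picks per table: a
replicated table favours reads, a partitioned one favours writes.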