|
From: Vladimir S. <vst...@gm...> - 2012-07-20 08:32:55
|
I have a fresh install of XC consisting of two data nodes with all default settings. Nothing special was configured. I created a database and one table with only a single text field. Then I inserted a text string and tried a SELECT. At this point everything was OK. But after shutting down one data node, SELECT fails, returning the message in the subject. This is not what I expected. DROP DATANODE doesn't help. If it is not a bug, then I have questions:

1. What should the failover and recovery procedures be after one data node fails?

2. Does this mean that XC is scalable in one direction only, i.e. it can be expanded but not shrunk? In other words, we cannot remove a data node.

3. Does this mean that without external infrastructure (like drbd + corosync + pacemaker), with the default setup (CREATE TABLE ... DISTRIBUTE BY REPLICATION), XC itself has neither HA nor LB (at least for writes) capabilities?

If all this is true, it is not the cluster that I imagined after reading the XC description, and not what is expected from a cluster at all.

--

***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
|
From: Michael P. <mic...@gm...> - 2012-07-20 09:18:35
|
On Fri, Jul 20, 2012 at 5:32 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> I have a fresh install of XC consisting of two data nodes with all default
> settings. Nothing special was configured. I created a database and one
> table with only a single text field. Then I inserted a text string and
> tried a SELECT. At this point everything was OK. But after shutting down
> one data node, SELECT fails, returning the message in the subject. This is
> not what I expected. DROP DATANODE doesn't help. If it is not a bug, then
> I have questions:

This is not a bug. What you did here was remove a component from the cluster. An incomplete cluster will not work.

> 1. What should the failover and recovery procedures be after one data node
> fails?

Like PostgreSQL, you can attach a slave node to a Datanode and then perform a failover on it. After the master node fails for one reason or another, you will need to promote the slave waiting behind it. Something like pg_ctl promote -D $DN_FOLDER is enough. This is for the Datanode side.

Then what you need to do is update the node catalogs on each Coordinator so that they redirect to the newly promoted node. Let's suppose that the node that failed was called datanodeN (you need the same node name for master and slave). In order to do that, issue "ALTER NODE datanodeN WITH (HOST = '$new_ip', PORT = $NEW_PORT); SELECT pgxc_pool_reload();" Do that on each Coordinator, and the promoted slave will be visible to each Coordinator and will be part of the cluster.

> 2. Does this mean that XC is scalable in one direction only, i.e. it can
> be expanded but not shrunk? In other words, we cannot remove a data node.

You can remove a Datanode; just be sure that before doing so you redirect the data of distributed tables to the remaining nodes. In 1.0 you can do that with something like this (say you want to remove the data from datanodeN):

CREATE TABLE new_table TO NODE (datanode1, ... datanode(N-1), datanode(N+1), ... datanodeP) AS SELECT * FROM old_table;
DROP TABLE old_table;
ALTER TABLE new_table RENAME TO old_table;

Once you are sure that the Datanode you want to remove holds no unique data (don't care about replicated tables...), perform a DROP NODE on each Coordinator, then pgxc_pool_reload(), and the node will be removed correctly.

Please note that I am working on a patch able to do such things automatically... It will be committed soon.

> 3. Does this mean that without external infrastructure (like drbd +
> corosync + pacemaker), with the default setup (CREATE TABLE ... DISTRIBUTE
> BY REPLICATION), XC itself has neither HA nor LB (at least for writes)
> capabilities?

Basically it has both; I know some people who are already building an HA/LB solution based on that...
--
Michael Paquier
http://michael.otacoo.com
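[Editorial note: to make the sequence above concrete, here is a minimal sketch of the commands Michael describes, assuming a failed Datanode named datanode2 whose promoted slave listens on 192.168.0.12 port 15432, surviving nodes datanode1 and datanode3, and a distributed table t; all of these names, hosts, and ports are hypothetical.]

    # On the server hosting the slave of the failed Datanode (shell):
    pg_ctl promote -D /path/to/datanode2_slave_data

    -- On EACH Coordinator (SQL): point the node entry at the promoted
    -- slave and reload the connection pool.
    ALTER NODE datanode2 WITH (HOST = '192.168.0.12', PORT = 15432);
    SELECT pgxc_pool_reload();

    -- Later, to retire datanode2 entirely in 1.0: first move the
    -- distributed data off it, then drop the node on each Coordinator.
    CREATE TABLE t_new TO NODE (datanode1, datanode3) AS SELECT * FROM t;
    DROP TABLE t;
    ALTER TABLE t_new RENAME TO t;
    DROP NODE datanode2;
    SELECT pgxc_pool_reload();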
|
From: Michael P. <mic...@gm...> - 2012-07-31 08:35:57
|
On Tue, Jul 31, 2012 at 5:19 PM, Vladimir Stavrinov <vst...@gm...> wrote:
>> All the nodes in rac are replicated.
>
> Is the same true for mysql cluster? Would you like to say that only XC
> is write scalable?

Sorry, I didn't use a correct formulation. MySQL can provide write scalability; it uses the same basics as XC. I was thinking about RAC and PostgreSQL-based clusters only.

>> There are several cons against that:
>> - it is not possible to define a distribution key based on a column
>
> I believe some other methods to decide where to store new incoming data
> exist or may be created. At least round-robin. Another is based on LB
> criteria: you choose the node under the least load.

For round-robin, this is definitely true, and a feature like this would be welcome in XC. This is not really dependent on whether the distribution strategy is decided at node level or table level.

>> - it is not possible to define range partitioning, column partitioning
>
> Is it so necessary for cluster solutions with distributed databases?

A user might be able to push the data of one column or another onto a given node. That might be helpful for security reasons. But there is no rush in implementing such things; I am just mentioning that your solution definitely closes the door on possible extensions.

>> It is written in XC definition that it is a synchronous multi-master.
>> Doing that in asynchronous way would break that, and also this way you
>
> No! You didn't read carefully what I wrote. We have classic distributed
> XC as the core of our system. It contains all complete data at every
> moment and it is a write-scalable synchronous multi-master as usual. But
> then we can supplement it with extra replicated nodes that will be
> updated asynchronously in a low-priority background process, in order to
> keep the cluster write scalable. When a read request comes in, it should
> go to a replicated node if and only if the requested data exists there;
> otherwise such a request should go to the distributed node where the data
> in question exists in any case.

The main problem that I see here is that replicating data asynchronously breaks MVCC. So you are never sure whether the data will be there or not on your background nodes.

>>> Such an architecture allows creating a totally automated and complete
>>> LB & HA cluster without any third-party helpers. If one of the
>>> distributed (shard) nodes fails, it should be automatically replaced
>>> (failover) with one of the up-to-date replicated nodes.
>>
>> This can also be achieved with postgres streaming replication naturally
>> available in XC.
>
> Certainly you mean a postgres standby server as a method of duplicating a
> distributed node. We have already discussed this topic: it is one of a
> number of external HA solutions. But I wrote about something else above.
> I mean here that an existing replicated node, which currently serves read
> requests from the application, can take over the role of any distributed
> node in case it fails. And I suppose this failover procedure should be
> automated, started on the event of failure and executed in real time.
>
> OK, I see that all I wrote here in this thread is far from the current XC
> state as well as from your thoughts at all. So you may consider all this
> as my unreachable dreams.

Dreams come true. They just need to be worked on. I would advise you to propose feature designs, then patches, and not only general ideas. Ideas are of course always welcome, but if you want to add some new features you will need to be more specific.

--
Michael Paquier
http://michael.otacoo.com
|
From: Vladimir S. <vst...@gm...> - 2012-07-31 09:19:34
|
On Tue, Jul 31, 2012 at 05:35:45PM +0900, Michael Paquier wrote:

> The main problem that I see here is that replicating data
> asynchronously breaks MVCC.

May I cite myself?

When a read request comes in, it should go to a replicated node if and only if the requested data exists there; otherwise such a request should go to the distributed node where the data in question exists in any case.

> So you are never sure that the data will be here or not on your
> background nodes.

If we control where the data is stored on the distributed nodes, why not control the state of the replicated nodes too? In both cases we should know what data is where.

> Ideas are of course always welcome, but if you want to add some new
> features you will need to be more specific.

I don't think what we are discussing here is simply a feature that may be added with a patch. The idea of moving storage control to the cluster level touches the basics and concept of XC.

--

***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
|
From: Nikhil S. <ni...@st...> - 2012-10-24 14:11:05
|
Hi Vladimir,

Thanks for your very frank perspectives on PGXC.

>> Sure, XC provides, thanks to its architecture, natural transparency
>> and scalability.
>
> What does XC provide? My two rhetorical questions above imply the answer
> "NO". The necessity to adapt the application means the cluster is not
> transparent.

Well, even the MySQL Cluster documentation states the below:

"While many standard MySQL schemas and applications can work using MySQL Cluster, it is also true that unmodified applications and database schemas may be slightly incompatible or have suboptimal performance when run using MySQL Cluster"

So transparency might come at a cost in the case of MySQL Cluster as well.

In general Postgres has all along believed that the user is intelligent and will take the pains to understand the nuances of their use case and configure the database accordingly. That's perhaps why even the stock postgresql.conf configuration file is pretty conservative and users tweak it as per their requirements.

> The impossibility to extend the cluster online means it is not scalable.

As you rightly mention below, this is indeed a "young" project, and IMHO it's maturing along proper lines.

> Moreover, these two issues are interrelated, because you have to rewrite
> the "CREATE TABLE" statement every time you expand (read: recreate) your
> cluster. But this issue looks much worse if a node containing tables with
> different distribution schemas fails. This is an uncontrollable model.
>
>> Load balancing can be provided between Coordinator and Datanodes
>> depending on applications, or at Coordinator level
>
> It should not depend on the application; it should be a global function
> of the cluster.
>
>> For HA, Koichi is currently working on some tools to provide that,
>
> Again: it should not be an external tool, it should be an internal,
> integral, essential feature.

Some people will say exactly the opposite. Why add something internal when off-the-shelf technologies can help with minimal internal support? For example, the Corosync/Pacemaker LinuxHA product, maybe along with some of the tools that Suzuki san has provided, can be combined with the replication capabilities that Postgres (and PGXC) provides. In fact, with synchronous replication in place, a properly architected solution using the above packages can provide very good HA capabilities. There's already redundancy for the coordinator nodes.

>> I am not sure you can that easily compare XC and mysql cluster,
>> both share the same architectures, but one of the main
>
> I don't know what is "the same" there, but in functionality it is
> totally different. MySQL Cluster has a precise and clear clustering
> model:
>
> 1. If some nodes fail, the cluster continues to work as long as there
> remains at least one healthy node in every group.

As long as one coordinator node is around and reachable to applications, the XC cluster continues to function. As long as datanodes are equipped with replication and an HA strategy is in place to handle datanodes going down and failing over to a promoted standby, then again the cluster continues to function.

> 2. No "CREATE TABLE ... DISTRIBUTE BY ..." statement. You just define
> the number of replicas at the configuration level. Yes, right now only
> one option makes sense, with two replicas, but it is enough.

Here seems to be the fundamental difference between MySQL Cluster and PGXC. Everything appears to be "replicated" in MySQL Cluster, and all nodes are mirror images of each other. In PGXC, data can be partitioned across nodes as well. It is for this that we provide the flexibility to the user via the DISTRIBUTE BY clause.

> 3. Read and write scalability (i.e. LB) at the same time for all tables
> (i.e. at the cluster level).

AIUI, all MySQL nodes are images of each other. While that's good for reads, that is not so good for writes, no?

> 4. You can add a data node online, i.e. without restarting (not to
> mention "recreating", as with XC) the cluster. Yes, only new data will go
> to the new node in this case. But you can totally redistribute it with a
> restart.
>
> So it is a fully fledged cluster, which is not true for XC, and that's a
> pity.

Data node addition is a work in progress in XC currently.

>> differences coming to my mind is that XC is far more flexible in
>> terms of license (BSD and not GPL), and like PostgreSQL, no company
>> has the control of its code like mysql products which Oracle relies
>
> Yes, and this is why I am persuading all developers to migrate to
> PostgreSQL. But it is off topic here, where we are discussing
> functionality, not licence issues.
>
> Be tolerant of my criticism. I wouldn't say you made a bad thing; I was
> amazed when I first read "write-scalable, synchronous multi-master,
> transparent PostgreSQL cluster" in your description, which I completely
> and exactly copied into the description of my debian package, but I was
> notably disappointed after my first test showed that it is at odds with
> reality. That would not be so bad in itself, since it is a young project,
> but it is much worse that this discussion shows there is something wrong
> with your priorities and fundamental approach.

Criticism is welcome. It helps us improve the product as we go along :)

Regards,
Nikhils
--
StormDB - http://www.stormdb.com
The Database Cloud
Postgres-XC Support and Service
|
From: Vladimir S. <vst...@gm...> - 2012-10-24 15:14:13
|
On Wed, Oct 24, 2012 at 07:40:33PM +0530, Nikhil Sontakke wrote:

> "While many standard MySQL schemas and applications can work using
> MySQL Cluster, it is also true that unmodified applications and
> database schemas may be slightly incompatible or have suboptimal
> performance when run using MySQL Cluster"

I was aware of this when I wrote the previous message.

> So transparency might come at a cost in the case of MySQL cluster as well.

Those are rare and specific cases, and an absolutely different thing from what we have with XC. In XC we must take care about "CREATE TABLE ... DISTRIBUTE BY ..." EVERYWHERE and ALWAYS.

> In general Postgres has all along believed that the user is more
> intelligent and will take the pains to understand the nuances of
> their use case and configure the database accordingly. That's why

Again, these are different things. It is not configuration of the database. It is rewriting installation SQL scripts. Imagine you need to install a third-party application. What about upgrades? And what about a lot of such applications? No, it is not acceptable for production.

This is an example of the core of my claims here: you don't think about real life and production environments.

> perhaps even the stock postgresql.conf configuration file is pretty
> conservative and users tweak it as per their requirements.

Editing the postgresql.conf configuration file is a good idea, but rewriting installation SQL scripts every time is a very bad idea.

>> The impossibility to extend the cluster online means it is not scalable.
>
> As you rightly mention below, this is indeed a "young" project, and IMHO
> it's maturing along proper lines.

Good news. The news is that you agree with me on something.

>> Again: it should not be an external tool, it should be an internal,
>> integral, essential feature.
>
> Some people will say exactly the opposite. Why add something

I haven't heard that.

> minimal internal support. Like for example the Corosync/Pacemaker
> LinuxHA product maybe along with some of the tools that Suzuki san

That is exactly what I am using. But it is not an alternative to an internal solution.

> applications, the XC cluster continues to function. As long as
> datanodes are equipped with replication and an HA strategy is in
> place to handle datanodes going down and failing over to a promoted
> standby, then again the cluster continues to function.

Good. But the bad thing is that with any external solution you have to double your hardware park for data nodes, because only half of them will be under workload. This is the essential and main reason why the solution should be internal. The next one is the manageability and complexity of the whole system.

> Here seems to be the fundamental difference between mysql cluster
> and PGXC. Everything appears to be "replicated" in MySQL cluster
> and all nodes are mirror images of each other. In PGXC, data can be
> partitioned across nodes as well. It is for this that we provide
> the flexibility to the user via the DISTRIBUTE BY clause.

It only seems so, but it is not true. All data are distributed between groups of data nodes. Replicas exist inside a group only.

> AIUI, all Mysql nodes are images of each other. While that's good
> for reads, that is not so good for writes, no?

No, see above.

> Data node addition is a work in progress in XC currently.

I saw that already:

http://postgres-xc.sourceforge.net/roadmap.html

But it is an issue of priority.

--

***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
|
From: Vladimir S. <vst...@gm...> - 2012-10-24 16:50:43
|
On Wed, Oct 24, 2012 at 06:25:56PM +0300, Andrei Martsinchyk wrote:

> I guess you got familiar with other solutions out there and are trying
> to find something similar in XC. But XC is different. The main goal
> of XC is scalability, not HA.

Despite its name or goal, XC is a distributed database only.

> But it looks like we understand "scalability" differently too.

The difference is that you narrow its meaning.

> What would a classic database owner do if he is not satisfied with
> the performance of his database? He would move to better hardware!
> That is basically what we mean by "scalability".

If you purchase more powerful hardware to replace the old one, no matter whether it is a database server or your desktop machine, it is not scalability; it is rather an upgrade, or stepping up to a happy future.

> However, in the case of a classic single-server DBMS you would notice
> that hardware cost grows exponentially. With XC you may scale linearly -
> if you run XC on, for example, an 8-node cluster, you may add 8 more
> nodes and get 2 times more TPS. That is because XC is able to
> intelligently split your data over your nodes. If you have one huge
> table on N nodes you can write data N times faster, since each
> particular row goes to one node and each node processes 1/Nth of the
> total requests. Reads scale too - if you search by key, each node will
> search only the local part of the data, which is N times smaller than
> the entire table, and all nodes search in parallel. Moreover, if the
> search key is the same as the distribution key, only one node will
> search - the one where the rows may be located - which is perfect if
> there are multiple concurrent searchers.

Thank you for the long explanation, but it is unnecessary. I was aware of this when I wrote... But it changes nothing.

> You mentioned adding nodes online. That feature is not *yet*
> implemented in XC. I would not call it "scalability" though. I
> would call it flexibility.

That is a very polite definition if we remember that the alternative is recreating the entire cluster from scratch.

> That approach is not good for HA: redundancy is needed for HA, and XC
> is not redundant - if you lose one node you lose part of the data. XC
> will still live in that case and it would even be able to serve
> some queries. But a query that needs the lost

No, it stops working at all. (To be sure: this was tested against 1.0.0, not 1.0.1.)

> node would fail. However, XC supports Postgres replication; you may
> configure replicas of your datanodes and switch to a slave if the master
> fails. Currently an external solution is required to build such a
> system. I do not think this is a problem. Nobody needs a pure DBMS
> anyway; at least a frontend is needed. XC is a good brick to build a
> system that perfectly fulfills customer requirements.

I already wrote: any external solution doubles the hardware park and adds complexity to the system.

> And about transparency. An application sees XC as a generic DBMS and
> can access it using generic SQL. Even CREATE TABLE without a
> DISTRIBUTE BY clause is supported. But like with any other DBMS

In this case by default it will be "BY REPLICATION", and as a result it loses the main XC feature: write scalability.

> the database architect must know DBMS internals well and use provided

But he cannot know how many nodes you have or will have, what other databases are running there, and how the existing data is already distributed. DBMS internals are not a transparency-related issue at all, because there is always a difference depending on what you are writing your application for: MySQL, PostgreSQL, Oracle, or all of them.

> tools, like SQL extensions, to tune up a specific database for an
> application. XC is capable of achieving much better than linear
> performance when it is optimized.

That is acceptable in specific cases and should be considered customization. But in most cases we need a common solution.

--

***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
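[Editorial note: as a concrete illustration of the distribution behaviour Andrei describes in the text quoted above, here is a minimal sketch; the table, its columns, and the routing noted in the comments are hypothetical examples restating that description, not output from a real cluster.]

    -- One large table hash-distributed across the datanodes:
    CREATE TABLE measurements (
        sensor_id integer,
        taken_at  timestamp,
        value     double precision
    ) DISTRIBUTE BY HASH (sensor_id);

    -- A lookup on the distribution key can be sent to the single
    -- datanode that owns that hash bucket:
    SELECT * FROM measurements WHERE sensor_id = 42;

    -- A filter on another column is evaluated on all datanodes in
    -- parallel, each scanning only its local 1/Nth of the table:
    SELECT * FROM measurements WHERE value > 100.0;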
|
From: Andrei M. <and...@gm...> - 2012-10-24 20:19:06
|
2012/10/24 Vladimir Stavrinov <vst...@gm...>

> On Wed, Oct 24, 2012 at 06:25:56PM +0300, Andrei Martsinchyk wrote:
>
>> I guess you got familiar with other solutions out there and are trying
>> to find something similar in XC. But XC is different. The main goal
>> of XC is scalability, not HA.
>
> Despite its name or goal, XC is a distributed database only.
>
>> But it looks like we understand "scalability" differently too.
>
> The difference is that you narrow its meaning.
>
>> What would a classic database owner do if he is not satisfied with
>> the performance of his database? He would move to better hardware!
>> That is basically what we mean by "scalability".
>
> If you purchase more powerful hardware to replace the old one, no matter
> whether it is a database server or your desktop machine, it is not
> scalability; it is rather an upgrade, or stepping up to a happy future.

That is the reason to buy the latest iPhone. Some servers run for years without even a reboot. Usually people replace servers only if they really need to.

>> However, in the case of a classic single-server DBMS you would notice
>> that hardware cost grows exponentially. With XC you may scale linearly -
>> if you run XC on, for example, an 8-node cluster, you may add 8 more
>> nodes and get 2 times more TPS. That is because XC is able to
>> intelligently split your data over your nodes. If you have one huge
>> table on N nodes you can write data N times faster, since each
>> particular row goes to one node and each node processes 1/Nth of the
>> total requests. Reads scale too - if you search by key, each node will
>> search only the local part of the data, which is N times smaller than
>> the entire table, and all nodes search in parallel. Moreover, if the
>> search key is the same as the distribution key, only one node will
>> search - the one where the rows may be located - which is perfect if
>> there are multiple concurrent searchers.
>
> Thank you for the long explanation, but it is unnecessary. I was aware of
> this when I wrote... But it changes nothing.
>
>> You mentioned adding nodes online. That feature is not *yet*
>> implemented in XC. I would not call it "scalability" though. I
>> would call it flexibility.
>
> That is a very polite definition if we remember that the alternative is
> recreating the entire cluster from scratch.

Nobody upgrades daily. I think it is not a lot of trouble to recreate the cluster once every few years.

>> That approach is not good for HA: redundancy is needed for HA, and XC
>> is not redundant - if you lose one node you lose part of the data. XC
>> will still live in that case and it would even be able to serve
>> some queries. But a query that needs the lost
>
> No, it stops working at all. (To be sure: this was tested against 1.0.0,
> not 1.0.1.)

I think your test was incorrect. It works.

>> node would fail. However, XC supports Postgres replication; you may
>> configure replicas of your datanodes and switch to a slave if the master
>> fails. Currently an external solution is required to build such a
>> system. I do not think this is a problem. Nobody needs a pure DBMS
>> anyway; at least a frontend is needed. XC is a good brick to build a
>> system that perfectly fulfills customer requirements.
>
> I already wrote: any external solution doubles the hardware park and adds
> complexity to the system.

Why does it double the hardware park? Multiple components may share the same hardware. An HA solution means extra complexity whether it is external or internal. There are people out there who do not want that complexity; they are happy with just performance scalability. They could use XC as is. If there is demand for HA on the market, other developers may create XC-based solutions, more or less integrated. Consumers may choose one of those solutions. Everybody wins. If XC integrates one approach, it will lose flexibility in this area.

>> And about transparency. An application sees XC as a generic DBMS and
>> can access it using generic SQL. Even CREATE TABLE without a
>> DISTRIBUTE BY clause is supported. But like with any other DBMS
>
> In this case by default it will be "BY REPLICATION", and as a result it
> loses the main XC feature: write scalability.

The criteria are pretty complex. However, HASH distribution takes priority.

>> the database architect must know DBMS internals well and use provided
>
> But he cannot know how many nodes you have or will have, what other
> databases are running there, and how the existing data is already
> distributed. DBMS internals are not a transparency-related issue at all,
> because there is always a difference depending on what you are writing
> your application for: MySQL, PostgreSQL, Oracle, or all of them.

I did not quite understand what you mean here. There are a lot of things important for system design along the whole hardware and software stack. The more that is known to the developers, the better the result will be. One may design a database on XC without knowing anything about it at all, with pure SQL, and the database will work. But a much better result can be achieved if the database is designed consciously. The number of nodes does not matter for distribution planning, by the way.

>> tools, like SQL extensions, to tune up a specific database for an
>> application. XC is capable of achieving much better than linear
>> performance when it is optimized.
>
> That is acceptable in specific cases and should be considered
> customization. But in most cases we need a common solution.
>
> --
>
> ***************************
> ## Vladimir Stavrinov
> ## vst...@gm...
> ***************************

--
Andrei Martsinchyk

StormDB - http://www.stormdb.com
The Database Cloud
|
From: Vladimir S. <vst...@gm...> - 2012-10-24 22:14:34
|
On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:

> I think your test was incorrect. It works.

It is so simple that it is hard to get anything wrong. You can easily reproduce it on 1.0.0 with a simple SELECT request. I will repeat it on 1.0.1 meanwhile.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
|
From: Vladimir S. <vst...@gm...> - 2012-10-25 07:01:15
|
On Thu, Oct 25, 2012 at 12:18 AM, Andrei Martsinchyk <and...@gm...> wrote:

> I think your test was incorrect. It works.

No, it is exactly what this thread started from and what is indicated in its subject. See the very first answer from a developer: it is not even a bug, it is by design. It sounds like an anecdote, but it is true.

> performance scalability. They could use XC as is. If there is demand for HA
> on the market, other developers may create XC-based solutions, more or less

Do you really have a question about this? I think High Availability is priority number one, because we are not very happy sitting in a Rolls-Royce that cannot move.
|
From: Mason S. <ma...@st...> - 2012-07-31 12:18:27
|
On Tue, Jul 31, 2012 at 5:19 AM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Tue, Jul 31, 2012 at 05:35:45PM +0900, Michael Paquier wrote:
>
>> The main problem that I see here is that replicating data
>> asynchronously breaks MVCC.
>
> May I cite myself?
>
> When a read request comes in, it should go to a replicated node if and
> only if the requested data exists there; otherwise such a request should
> go to the distributed node where the data in question exists in any case.
>
>> So you are never sure that the data will be here or not on your
>> background nodes.
>
> If we control where the data is stored on the distributed nodes, why not
> control the state of the replicated nodes too? In both cases we should
> know what data is where.

If a data node has one or more sync rep standbys, it should be theoretically possible to read balance those if that intelligence is added to the coordinator. It would not matter if that data is in a "distributed" or "replicated" table.

If it were asynchronous, there would be more tracking that would have to be done to know if it is safe to load balance.

In some tests we have done at StormDB, the extra overhead for sync rep is small, so you might as well use sync rep. With multiple replicas, in sync rep mode it will continue working even if there is a failure with one of the replicas.

>> Ideas are of course always welcome, but if you want to add some new
>> features you will need to be more specific.
>
> I don't think what we are discussing here is simply a feature that may be
> added with a patch. The idea of moving storage control to the cluster
> level touches the basics and concept of XC.

Yeah, that is sounding different from what is done here. I am not sure I understand what your requirements are and what exactly it is you need. If it is about HA, there are a lot of basics in place to build out HA that will probably meet your needs.

--
Mason Sharp

StormDB - http://www.stormdb.com
The Database Cloud
|
From: Koichi S. <koi...@gm...> - 2012-08-01 00:48:12
|
Yes, reading from datanode slaves would enhance read scalability. In terms of reading from a datanode slave, I think we still need a couple of improvements:

1. If you want to connect directly to a slave: the current datanode expects all connections to come from coordinators, which supply the GXID and snapshot from GTM - something psql or the current libpq does not do. If the datanode is in recovery mode and standby_mode is on, it should instead use the XID and snapshot from WAL, which is currently overridden by the GXID and snapshot from the coordinators/GTM.

2. If you want to connect via a coordinator: this is not supported yet, and we need a coordinator extension.

3. If you visit multiple datanodes, you may get different visibility from datanode to datanode, because synchronous replication implies a time lag between "receiving" WAL records and "replaying" them. The time lag may differ from datanode to datanode, and the query result could be incorrect. I guess "BARRIER" may work to synchronize the visibility among the datanodes, but we may need another visibility-control infrastructure for hot standby.

Any more inputs are welcome.

Regards;
----------
Koichi Suzuki

2012/7/31 Mason Sharp <ma...@st...>:
> If a data node has one or more sync rep standbys, it should be
> theoretically possible to read balance those if that intelligence is
> added to the coordinator. It would not matter if that data is in a
> "distributed" or "replicated" table.
>
> If it were asynchronous, there would be more tracking that would have
> to be done to know if it is safe to load balance.
>
> In some tests we have done at StormDB, the extra overhead for sync rep
> is small, so you might as well use sync rep. With multiple replicas,
> in sync rep mode it will continue working even if there is a failure
> with one of the replicas.
>
> Yeah, that is sounding different from what is done here. I am not sure
> I understand what your requirements are and what exactly it is you
> need. If it is about HA, there are a lot of basics in place to build
> out HA that will probably meet your needs.
>
> --
> Mason Sharp
>
> StormDB - http://www.stormdb.com
> The Database Cloud
|
From: Vladimir S. <vst...@gm...> - 2012-07-30 10:03:39
|
On Fri, Jul 20, 2012 at 06:18:22PM +0900, Michael Paquier wrote:

> Like postgreSQL, you can attach a slave node to a Datanode and then
> perform a failover on it. After the master node fails for one reason or
> another, you will need to promote the slave waiting behind it. Something
> like pg_ctl promote -D $DN_FOLDER is enough. This is for the Datanode
> side. Then what you need to do is update the node catalogs on each
> Coordinator so that they redirect to the newly promoted node. Let's
> suppose that the node that failed was called datanodeN (you need the
> same node name for master and slave). In order to do that, issue
> "ALTER NODE datanodeN WITH (HOST = '$new_ip', PORT = $NEW_PORT);
> SELECT pgxc_pool_reload();" Do that on each Coordinator, and the
> promoted slave will be visible to each Coordinator and will be part of
> the cluster.

If you don't do this every day, there is a good chance you will make an error. How much time does it take in that case? As I wrote above, it is not XC's own HA feature, but rather external cluster infrastructure. As such, it is better to use the above-mentioned tandem of drbd + corosync + pacemaker - at least it gets failover automated.

> In 1.0 you can do that with something like this (say you want to remove
> the data from datanodeN):
> CREATE TABLE new_table TO NODE (datanode1, ... datanode(N-1),
> datanode(N+1), ... datanodeP) AS SELECT * FROM old_table;
> DROP TABLE old_table;
> ALTER TABLE new_table RENAME TO old_table;
> Once you are sure that the Datanode you want to remove holds no unique
> data (don't care about replicated tables...), perform a DROP NODE on
> each Coordinator, then pgxc_pool_reload(), and the node will be removed
> correctly.

Looks fine! What if there are thousands of such tables to be relocated (it is a real case)? And as I see it, to do the opposite operation, i.e. adding a data node, we need to use this CREATE/DROP/RENAME TABLE technique again? It doesn't look like HA.

> Please note that I am working on a patch able to do such things
> automatically... It will be committed soon.

That is hopeful news.

>> DISTRIBUTE BY REPLICATION) XC itself has neither HA nor LB (at least
>> for writes) capabilities?
>
> Basically it has both; I know some people who are already building an
> HA/LB solution based on that...

What do you mean? As we saw above, HA is external, and LB is a question of either read or write. Yes, we have only one variant of such a solution: when all tables are replicated, we have "internal" HA and "read" LB. But such a solution is implemented in many other technologies apart from XC, whereas, as far as I understand, the main feature of XC is what is named "write-scalable, synchronous multi-master".

OK, I still hope I made the right decision in choosing XC as a cluster solution. But now, summarizing the problems discussed, I have a further question: why did you implement distribution types at the table level? It is very complex to use and is not transparent. For example, when you need to install a third-party application, you need to revise all its SQL scripts to add a DISTRIBUTE BY statement if you don't want the defaults. What do you think about implementing different data node types (instead of table-level distribution), i.e. "distributed" and "replicated" nodes?

--

***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
|
From: Michael P. <mic...@gm...> - 2012-07-30 13:21:16
|
On Mon, Jul 30, 2012 at 7:03 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Fri, Jul 20, 2012 at 06:18:22PM +0900, Michael Paquier wrote:
>
>> Like postgreSQL, you can attach a slave node to a Datanode and then
>> perform a failover on it. After the master node fails for one reason or
>> another, you will need to promote the slave waiting behind it. Something
>> like pg_ctl promote -D $DN_FOLDER is enough. This is for the Datanode
>> side. Then what you need to do is update the node catalogs on each
>> Coordinator so that they redirect to the newly promoted node. Let's
>> suppose that the node that failed was called datanodeN (you need the
>> same node name for master and slave). In order to do that, issue
>> "ALTER NODE datanodeN WITH (HOST = '$new_ip', PORT = $NEW_PORT);
>> SELECT pgxc_pool_reload();" Do that on each Coordinator, and the
>> promoted slave will be visible to each Coordinator and will be part of
>> the cluster.
>
> If you don't do this every day, there is a good chance you will make an
> error. How much time does it take in that case? As I wrote above, it is
> not XC's own HA feature, but rather external cluster infrastructure. As
> such, it is better to use the above-mentioned tandem of drbd + corosync +
> pacemaker - at least it gets failover automated.

I do not mean that such operations should be performed manually. It was just to illustrate how to do it. Like PostgreSQL, XC provides the user with the necessary interface to perform failover and HA operations easily and externally. The architect is then free to use whatever HA utilities he wishes for any HA operation. In your case, a layer based on pacemaker would work. However, XC needs to be able to adapt to a maximum number of HA applications and monitoring utilities. The current interface fills this goal.

>> In 1.0 you can do that with something like this (say you want to remove
>> the data from datanodeN):
>> CREATE TABLE new_table TO NODE (datanode1, ... datanode(N-1),
>> datanode(N+1), ... datanodeP) AS SELECT * FROM old_table;
>> DROP TABLE old_table;
>> ALTER TABLE new_table RENAME TO old_table;
>> Once you are sure that the Datanode you want to remove holds no unique
>> data (don't care about replicated tables...), perform a DROP NODE on
>> each Coordinator, then pgxc_pool_reload(), and the node will be removed
>> correctly.
>
> Looks fine! What if there are thousands of such tables to be relocated
> (it is a real case)? And as I see it, to do the opposite operation, i.e.
> adding a data node, we need to use this CREATE/DROP/RENAME TABLE
> technique again? It doesn't look like HA.

In 1.0, yes. And this is only necessary for hash/modulo/round-robin tables.

>> Please note that I am working on a patch able to do such things
>> automatically... It will be committed soon.
>
> That is hopeful news.

The patch is already committed in the master branch. So you can do it with a simple command.

>>> DISTRIBUTE BY REPLICATION) XC itself has neither HA nor LB (at least
>>> for writes) capabilities?
>>
>> Basically it has both; I know some people who are already building an
>> HA/LB solution based on that...
>
> What do you mean?

I mean:
- HA: XC provides the necessary interface to allow other external tools to perform operations as for postgres.
- LB: There is automatic load balancing between Datanodes and Coordinators by design. Load balancing at Coordinator level has to be managed by an external tool.

> As we saw above, HA is external, and LB is a question of either read or
> write. Yes, we have only one variant of such a solution: when all tables
> are replicated, we have "internal" HA and "read" LB. But such a solution
> is implemented in many other technologies apart from XC, whereas, as far
> as I understand, the main feature of XC is what is named "write-scalable,
> synchronous multi-master"

symmetric.

> OK, I still hope I made the right decision in choosing XC as a cluster
> solution. But now, summarizing the problems discussed, I have a further
> question: why did you implement distribution types at the table level?

In a cluster, what is important is to limit the amount of data exchanged between nodes in order to reach good performance. To accomplish that, you need control of table joins. In XC, maximizing performance simply means sending as many joins as possible to the remote nodes, reducing the amount of data exchanged between nodes by that much. There are multiple ways to control data joins, like caching the data at Coordinator level for reuse, which is what pgpool-II does. But in that case, how do you manage prepared plans or write operations? This is hardly compatible with multi-master. Hence the control is given to the tables, which explains why distribution is controlled like this.

> It is very complex to use and is not transparent. For example, when you
> need to install a third-party application, you need to revise all its
> SQL scripts to add a DISTRIBUTE BY statement if you don't want the
> defaults. What do you think about implementing different data node types
> (instead of table-level distribution), i.e. "distributed" and
> "replicated" nodes?

Well, the only extension that XC adds is that, and it allows you to get read and/or write scalability in a multi-master symmetric cluster, so that's a good deal!
--
Michael Paquier
http://michael.otacoo.com
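[Editorial note: to make the join-pushdown argument concrete, here is a minimal hedged sketch; the tables and columns are hypothetical, and the comments restate Michael's explanation rather than an actual query plan.]

    -- Two tables distributed by hash on the same key, so matching rows
    -- live on the same datanode:
    CREATE TABLE customers (
        customer_id integer PRIMARY KEY,
        name        text
    ) DISTRIBUTE BY HASH (customer_id);

    CREATE TABLE orders (
        order_id    integer,
        customer_id integer,
        amount      numeric
    ) DISTRIBUTE BY HASH (customer_id);

    -- An equi-join on the shared distribution key can be shipped to the
    -- datanodes and executed locally there; only the join result is sent
    -- back to the Coordinator:
    SELECT c.name, o.amount
    FROM customers c JOIN orders o ON o.customer_id = c.customer_id
    WHERE o.amount > 100;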
|
From: Vladimir S. <vst...@gm...> - 2012-07-31 10:33:23
|
On Mon, Jul 30, 2012 at 10:21:06PM +0900, Michael Paquier wrote:

>> Looks fine! What if there are thousands of such tables to be relocated
>> (it is a real case)? And as I see it, to do the opposite operation, i.e.
>> adding a data node, we need to use this CREATE/DROP/RENAME TABLE
>> technique again? It doesn't look like HA.
>
> In 1.0, yes. And this is only necessary for hash/modulo/round-robin tables.

No, it is necessary for replicated tables too. Moreover, in summary, it is necessary in any case of adding or removing nodes, with any type of table, replicated or distributed. This is an experimental fact. The reason is that you cannot use an ALTER TABLE statement to change the node list. The result is that if a single node fails, a cluster of any configuration stops working. In the case of a replicated node you can still do SELECT, but not INSERT or UPDATE. And you can't simply remove the failed node in this case either. Eventually, any change in the number of nodes leads to recreating the whole cluster from scratch: you have to drop the databases and restore them from backup. That is the reality now. Sorry.

--

***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
|
From: Vladimir S. <vst...@gm...> - 2012-07-31 12:25:48
|
On Tue, Jul 31, 2012 at 08:11:21AM -0400, Mason Sharp wrote:

> If a data node has one or more sync rep standbys, it should be
> theoretically possible to read balance those if that intelligence is

"sync rep" means no "write balance"

--

***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
|
From: Jim M. <ji...@gm...> - 2012-10-24 15:42:53
|
On Wed, Oct 24, 2012 at 11:13 AM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Wed, Oct 24, 2012 at 07:40:33PM +0530, Nikhil Sontakke wrote:
>
>> "While many standard MySQL schemas and applications can work using
>> MySQL Cluster, it is also true that unmodified applications and
>> database schemas may be slightly incompatible or have suboptimal
>> performance when run using MySQL Cluster"
>
> I was aware of this when I wrote the previous message.
>
>> So transparency might come at a cost in the case of MySQL cluster as well.
>
> Those are rare and specific cases, and an absolutely different thing
> from what we have with XC. In XC we must take care about "CREATE TABLE
> ... DISTRIBUTE BY ..." EVERYWHERE and ALWAYS.

That's not actually the case. XC will automatically distribute the table even if the DISTRIBUTE BY clause is not in the CREATE TABLE statement. It uses the primary key and foreign key information to determine a distribution key if one is not provided. In many cases this is perfectly acceptable and completely transparent to the application. I've moved several websites over to XC without ever needing to touch the DDL.
|
From: Vladimir S. <vst...@gm...> - 2012-10-24 16:53:45
|
On Wed, Oct 24, 2012 at 11:42:43AM -0400, Jim Mlodgenski wrote:

> That's not actually the case. XC will automatically distribute the
> table even if the DISTRIBUTE BY clause is not in the CREATE TABLE

In this case by default it will be "BY REPLICATION", and as a result it loses the main XC feature: write scalability.

--

***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
|
From: Vladimir S. <vst...@gm...> - 2012-10-24 22:05:28
|
On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:

> That is the reason to buy the latest iPhone. Some servers run for years
> without even a reboot. Usually people replace servers only if they
> really need to.

What about security patches for the kernel? For years without a reboot? And that is not the only reason to upgrade a kernel. As for replacing: yes, it is true, but that moment inevitably comes when new software eats more resources while the number of users increases - yet I have never heard anybody call that a scaling process.

> Nobody upgrades daily. I think it is not a lot of trouble to
> recreate the cluster once every few years.

Once every few years you can build a totally new system on brand-new technology. Cluster scalability implies the possibility of scaling at any moment, for example (but not only) when new customers or partners come with new demands to a fast-paced company with increasing load. It is by design. That is exactly what a scalable cluster exists for: you can scale (expand) the existing system instead of building a new one.

> Why does it double the hardware park? Multiple components may share the
> same hardware.

As usual, this is far from reality. It is not a common approach acceptable to most companies. What you are talking about looks like an approach for clouds or other service providers where hardware may be shared by their customers.

> An HA solution means extra complexity whether it is external or internal.

But it makes a difference. External has to be built and managed by users, while internal is a complete and transparent solution provided by the authors. With MySQL Cluster there is nothing for users to do about HA at all; it just already "exists".

> There are people out there who do not want that complexity, they
> are happy with just performance scalability. They could use XC as

Will they be happy with data loss and downtime? Who are they?

> one of those solutions. Everybody wins. If XC integrates one
> approach it will lose flexibility in this area.

and gain many more users.

> I did not quite understand what you mean here. There are a lot of things
> important for system design along the whole hardware and software stack.
> The more that is known to the developers, the better the result will be.
> One may design a database on XC without knowing anything about it at all,
> with pure SQL, and the database will work. But a much better result can
> be achieved if the database is designed consciously. The number of nodes
> does not matter for distribution planning, by the way.

Again: none of this is about transparency. You are perhaps talking about installing a single application on a fresh XC. But what if you install a third-party application on an existing XC already running multiple applications? What if those databases are distributed in different ways? What if, because of this, you cannot use all nodes for the new application? In that case you must rewrite all the "CREATE TABLE" statements to distribute the tables to concrete nodes in a concrete way. In that case the developer doesn't help, and it is not what is called "transparency."

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
|
From: Vladimir S. <vst...@gm...> - 2012-10-25 07:38:04
|
On Thu, Oct 25, 2012 at 2:05 AM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
>> one of those solutions. Everybody wins. If XC integrates one
>> approach it will lose flexibility in this area.
>
> and gain many more users.

OK. Paulo doesn't want more users, because he doesn't like easy ways and simple things. But we all want flexibility. Flexibility is a good thing, and here is an example.

We have a cluster consisting of 4 nodes. The nodes are organized in groups. All data is distributed between the groups, and every group contains identical data, i.e. replicas. With such a model we have 3 options:

1. Read scalability only, with 4 replicas in one group.
2. Read and write scalability, with 2 replicas per group.
3. Write scalability only, with 1 replica per group.

It is obvious: with more nodes we have more options, i.e. more flexibility. This represents the trade-off between read and write scalability. And for this we don't need "CREATE TABLE ... DISTRIBUTE BY ..." I think it is enough for most cases.
|
From: Paulo P. <pj...@ub...> - 2012-10-25 07:43:52
|
On 25/10/12 08:37, Vladimir Stavrinov wrote:
> On Thu, Oct 25, 2012 at 2:05 AM, Vladimir Stavrinov
> <vst...@gm...> wrote:
>> On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
>>> one of those solutions. Everybody wins. If XC integrates one
>>> approach it will lose flexibility in this area.
>> and gain many more users.
> OK. Paulo doesn't want more users, because he doesn't like easy ways and
> simple things. But we all want flexibility. Flexibility is a good thing,
> and here is an example.

I didn't say "I don't want more users". I just believe, based on my experience, that subjects as advanced as the ones we're discussing don't come easy. And they shouldn't, in the sense that people should really learn/know about what they're doing regarding clustering, HA, etc.!

> We have a cluster consisting of 4 nodes. The nodes are organized in
> groups. All data is distributed between the groups, and every group
> contains identical data, i.e. replicas. With such a model we have 3
> options:
>
> 1. Read scalability only, with 4 replicas in one group.
> 2. Read and write scalability, with 2 replicas per group.
> 3. Write scalability only, with 1 replica per group.
>
> It is obvious: with more nodes we have more options, i.e. more
> flexibility. This represents the trade-off between read and write
> scalability. And for this we don't need "CREATE TABLE ... DISTRIBUTE
> BY ..." I think it is enough for most cases.

--
Paulo Pires
|
From: Mason S. <ma...@st...> - 2012-07-31 16:02:45
|
On Tue, Jul 31, 2012 at 8:25 AM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Tue, Jul 31, 2012 at 08:11:21AM -0400, Mason Sharp wrote:
>
>> If a data node has one or more sync rep standbys, it should be
>> theoretically possible to read balance those if that intelligence is
>
> "sync rep" means no "write balance"

I think you misunderstood. Tables can be either distributed or replicated across the database segments. Each segment in turn can have multiple synchronous replicas, similar to PostgreSQL's synchronous replication.

So, for your large write-heavy tables, you should distribute them amongst multiple nodes, gaining write scalability. The overhead and added latency of having replicas of each database segment is relatively small, so you need not think of that as preventing "write balance", as you say.

For tables where you want read scalability, you would want to replicate those. This is at the table level, not the database segment level. The coordinator will read balance those today.

If you also want read balancing for distributed (non-replicated) tables across the database segment replicas, that has not yet been implemented, but it is definitely doable (if your company would like to sponsor such a change, we are happy to implement it). Supporting such a change would involve changing the coordinator code to be able to know about database segment replicas. Up until now the project has focused on the challenges of the core database and not so much on dealing with things outside of it, like HA.

I hope that helps.

Regards,

Mason

--
Mason Sharp

StormDB - http://www.stormdb.com
The Database Cloud

Also Offering Postgres-XC Support and Services
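[Editorial note: for readers who have not set one up, here is a minimal sketch of attaching a synchronous standby to a Datanode using the standard PostgreSQL streaming-replication settings that XC Datanodes inherit; the port, paths, role, and standby name are hypothetical, and a matching replication entry in pg_hba.conf on the master is assumed.]

    # postgresql.conf on the Datanode master:
    wal_level = hot_standby
    max_wal_senders = 3
    synchronous_standby_names = 'dn1_standby'   # commit waits for this standby

    # recovery.conf on the standby (initialized from a base backup of the master):
    standby_mode = 'on'
    primary_conninfo = 'host=dn1-master port=15432 user=repluser application_name=dn1_standby'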
|
From: Vladimir S. <vst...@gm...> - 2012-08-01 09:21:17
|
On Tue, Jul 31, 2012 at 8:02 PM, Mason Sharp <ma...@st...> wrote:

> I think you misunderstood. Tables can be either distributed or
> replicated across the database segments. Each segment in turn can
> have multiple synchronous replicas, similar to PostgreSQL's
> synchronous replication.

Thank you very much for the clarification! It is the same as what is written on the XC home page. If I didn't understand that, I couldn't have written all of the above in this thread, nor could I have run thorough tests of all of those features before writing here. Did you read this thread completely?

> multiple nodes, gaining write scalability. The overhead and added
> latency of having replicas of each database segment is relatively
> small, so you need not think of that as preventing "write balance", as
> you say.

Write scalability (I prefer the term you are using here, "write balance", because scalability refers to changing the number of data nodes) means that you can write to all N nodes faster than to a single one. This is possible only for distributed data. If you write all 100% of the data to every node, it is not possible. If you don't want to consider a standby server as a node, that is wrong, because for load balancing every hardware node is meaningful. Meanwhile, I don't like the idea of using standbys at all, because they should be considered an external solution. When I wrote above about "asynchronous replication", I meant improving the existing XC replication technology, but at the node level instead of the table level.

> know about database segment replicas. Up until now the project has
> focused on the challenges of the core database and not so much on
> dealing with things outside of it, like HA.

I thought HA & LB were the main features of any cluster.
|
From: Jim M. <ji...@gm...> - 2012-10-24 17:00:57
|
On Wed, Oct 24, 2012 at 12:53 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Wed, Oct 24, 2012 at 11:42:43AM -0400, Jim Mlodgenski wrote:
>
>> That's not actually the case. XC will automatically distribute the
>> table even if the DISTRIBUTE BY clause is not in the CREATE TABLE
>
> In this case by default it will be "BY REPLICATION", and as a result it
> loses the main XC feature: write scalability.

The default is to distribute by HASH if there is some sort of valid column to use. If there is no way to determine which column to use, it falls back to round-robin distribution. It never uses "BY REPLICATION" by default.
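[Editorial note: a minimal sketch of the defaults Jim describes, with hypothetical table names; the comments restate his description and are not verified against a particular XC release.]

    -- Has a primary key: without DISTRIBUTE BY, XC derives a distribution
    -- column from it and distributes by HASH.
    CREATE TABLE accounts (
        account_id integer PRIMARY KEY,
        owner      text
    );

    -- No primary or foreign key to derive a column from: XC falls back
    -- to round-robin distribution.
    CREATE TABLE event_log (message text);

    -- Replication only happens when asked for explicitly:
    CREATE TABLE countries (
        code text PRIMARY KEY,
        name text
    ) DISTRIBUTE BY REPLICATION;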
|
From: Vladimir S. <vst...@gm...> - 2012-10-24 20:49:28
|
On Wed, Oct 24, 2012 at 01:00:51PM -0400, Jim Mlodgenski wrote:

> The default is to distribute by HASH if there is some sort of valid

My congratulations! I thought so too... before I tested it. But to my surprise I found the same data on every node. Moreover, despite the redundancy, XC stops working if one node fails. But it doesn't matter, because the more important thing is that, in any case, for every table you have to choose either read or write scalability, rewriting the "CREATE TABLE" accordingly, while MySQL Cluster provides both at the same time for all tables, without any headache about distribution schemas, i.e. all data are replicated and distributed at the same time. The only essential difference that prevents considering MySQL Cluster as an alternative to XC is that, as I mentioned earlier, it is an in-memory database and as such is limited in size, while XC has no such limit.

Though be aware this is all about 1.0.0. I haven't tested all of these features against 1.0.1 yet.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************