From: WilliamKray Q. <wil...@gm...> - 2015-01-08 23:59:13
Hello,

Apologies if these things have been discussed thousands of times before, but I'm feeling a bit frustrated.

Background: I'm running pgxc 1.2.1 and attempting to automate deploying a cluster in AWS. AWS, as we know, likes to drop servers without much warning, and all kinds of bad things happen. My main goal is to keep the data in the cluster consistent in the event of a node failure, and to be able to easily add a new node to replace the dropped one or to expand the cluster.

QUESTION 1: I'm under the impression that, if I want all data to be replicated, every table in the database must be created with the DISTRIBUTE BY REPLICATION clause, and that there is no option to make replication the default distribution method. Is this correct?

QUESTION 2: I've also (in testing) created a database using commands like the following:

    CREATE TABLE test (id int, data char(100));
    ALTER TABLE test DISTRIBUTE BY REPLICATION;
    misc. insert query here;

I then go to another node, and I can *access* all the data, but it is *not replicated to that node*. If I run:

    ALTER TABLE test TO NODE (nodename);

the data then appears to be replicated to that node (as in, if I run EXECUTE DIRECT queries on that datanode, I can now see the content living on that node).

BUT, when I remove any node from the cluster by running DROP NODE, suddenly none of the nodes have any data and the table is completely empty! HELP! Are there more steps required to remove a node from a cluster? And if so, how do I configure my cluster to be resilient to node failure or unexpected server termination?

QUESTION 3: I've been looking into Postgres-XL. It appears to have a couple more features than PGXC and more recent activity on its website, but it also looks like the same codebase. Is XC still being maintained, or has development shifted to XL? Should I switch?

Thank you so much for your input. If this email is too long, please let me know and I'll break it up into more manageable chunks.