Re: [Postgres-xc-developers] Proposed steps to remove a node

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

In this case, after ALTER TABLE REDISTRIBUTE is issued against tab1, yes, DML to tab2 may target to the datanode being removed.   However, in this case, because tab1 has been redistributed, no DML to tab1 will target to tab1.

After then when ALTER TABLE REDISTRIBUTE is issued against tab2, all the data are redistributed and after then no DML to tab2 will be targetted to the datanode to remove.

We have discussed this issue about a year ago if we should exclude and redistribute all the tables before removing datanode and concluded that this should be DBA's responsibility to exclude the datanode to be removed from all the distribution, manually or automatic.

As much as DBA issues ALTER TABLE correctly to all the tables, there's no chance to bring XC status into inconsistent status so I don't think we need DML blocking.

Any other inputs?
---
Koichi Suzuki

On 2013/04/22, at 16:42, Abbas Butt <abb...@en...> wrote:

> Consider this case:
> Assume two tables in a database: tab1 and tab2, assume both are distributed by round robin.
> Assume a client C1 is connected to the cluster and is running a loop to insert rows to tab1.
> Assume administrator connects to the cluster and issues ALTER TABLE REDISTRIBUTE for table tab2. The moment this alter finishes, assume C1 starts inserting rows to tab2, while administrator issues ALTER TABLE REDISTRIBUTE for table tab1, thinking tab2 is clear.
> Since the administrator has to issue ALTER TABLE REDISTRIBUTE table by table for all tables in all databases, client C1 would always have a chance to insert more rows in the table for which administrator has already issued ALTER TABLE REDISTRIBUTE.
> 
> For this reason we need DML blocking.
> 
> Comments/Suggestions are welcome.
> 
> On Mon, Apr 22, 2013 at 11:15 AM, 鈴木 幸市 <ko...@in...> wrote:
> Sorry Abbas,  I have a question/comment on removing a datanode.
> 
> Before DBA would like to remove a datanode, he/she must run ALTER TABLE to save all the data in the node to others.   Therefore, I'm not sure if we need another means to lock DML.
> 
> I understand TMP table will be an issue.   If a TMP table is created over multiple nodes, it is under 2PC control which is not allowed so far.   So what we can do is to create TMP table on a particular node only.   If it is the removing datanode, operation will fail and I think it can keep whole cluster in a consistent status.
> 
> Yes, it is all DBA's responsibility to make sure that no data are left in the datanode.   We have a means to make it sure as you submitted.
> 
> So again, I'm not yet sure if DML blocking is still needed.
> 
> Regards;
> ---
> Koichi Suzuki
> 
> 
> 
> On 2013/04/20, at 5:22, Abbas Butt <abb...@en...> wrote:
> 
>> Hi,
>> 
>> Here are the proposed steps to remove a node from the cluster.
>> 
>> Removing an existing coordinator
>> ==========================
>> 
>> Assume a two coordinator cluster, COORD_1 & COORD_2
>> Suppose we want to remove COORD2 for any reason.
>> 
>> 1. Stop the coordinator to be removed.
>>    In our example we need to stop COORD_2.
>> 
>> 2. Connect to any of the coordinators except the one to be removed.
>>    In our example assuming COORD_1 is running on port 5432,
>>    the following command would connect to COORD_1
>> 
>>    psql postgres -p 5432
>> 
>> 3. Drop the coordinator to be removed.
>>    For example to drop coordinator COORD_2
>> 
>>    DROP NODE COORD_2;
>> 
>> 4. Update the connection information cached in pool.
>> 
>>    SELECT pgxc_pool_reload();
>> 
>> COORD_2 is now removed from the cluster & COORD_1 would work as if COORD_2 never existed.
>> 
>> CAUTION : If COORD_2 is still running and clients are connected to it,  any queries issued would create inconsistencies in the cluster.
>> 
>> Please note that there is no need to block DDLs because either way DDLs will fail after step 1 and before step 4.
>> 
>> 
>> 
>> 
>> Removing an existing datanode
>> =========================
>> 
>> Assume a two coordinator cluster, COORD_1 & COORD_2
>> with three datanodes DATA_NODE_1, DATA_NODE_2 & DATA_NODE_3
>> 
>> Suppose we want to remove DATA_NODE_3 for any reason.
>> 
>> Further assume there is a table named rr_abc distributed in round robin fashion
>> and has rows on all the three datanodes.
>> 
>> 1. Block DMLs so that during step 2, while we are shifting data from
>>    the datanode to be removed some one could have an insert process
>>    inserting data in the same.
>> 
>>    Here we will need to add a system function similar to pgxc_lock_for_backup.
>>    This is a to do item.
>> 
>> 2. Transfer the data from the datanode to be removed to the rest of the datanodes for all the tables in all the databases.
>>    For example to shift data of the table rr_abc to the
>>    rest of the nodes we can use command
>> 
>>    ALTER TABLE rr_abc DELETE NODE (DATA_NODE_3);
>> 
>> 3. Confirm that there is no data left on the datanode to be removed.
>>    For example to confirm that there is no data left on DATA_NODE_3
>> 
>>    select c.pcrelid from pgxc_class c, pgxc_node n where
>>    n.node_name = 'DATA_NODE_3' and n.oid = ANY (c.nodeoids);
>> 
>> 4. Stop the datanode server to be removed.
>>    Now any SELECTs that involve the datanode to be removed would start failing
>>    and DMLs have already been blocked, so essentially the cluster would work
>>    only partially.
>> 
>> 5. Connect to any of the coordinators.
>>    In our example assuming COORD_1 is running on port 5432,
>>    the following command would connect to COORD_1
>> 
>>    psql postgres -p 5432
>> 
>> 6. Drop the datanode to be removed.
>>    For example to drop datanode DATA_NODE_3 use command
>> 
>>    DROP NODE DATA_NODE_3;
>> 
>> 7. Update the connection information cached in pool.
>> 
>>    SELECT pgxc_pool_reload();
>> 
>> 8. Repeat steps 5,6 & 7 for all the coordinators in the cluster.
>> 
>> 9. UN-Block DMLs
>> 
>> DATA_NODE_3 is now removed from the cluster.
>> 
>> 
>> Comments are welcome.
>> 
>> -- 
>> Abbas
>> Architect
>> 
>> Ph: 92.334.5100153
>> Skype ID: gabbasb
>> www.enterprisedb.com
>> 
>> Follow us on Twitter
>> @EnterpriseDB 
>> 
>> Visit EnterpriseDB for tutorials, webinars, whitepapers and more
>> ------------------------------------------------------------------------------
>> Precog is a next-generation analytics platform capable of advanced
>> analytics on semi-structured data. The platform includes APIs for building
>> apps and a phenomenal toolset for data science. Developers can use
>> our toolset for easy data analysis & visualization. Get a free account!
>> http://www2.precog.com/precogplatform/slashdotnewsletter_______________________________________________
>> Postgres-xc-developers mailing list
>> Pos...@li...
>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
> 
> 
> 
> 
> -- 
> -- 
> Abbas
> Architect
> 
> Ph: 92.334.5100153
> Skype ID: gabbasb
> www.enterprisedb.com
> 
> Follow us on Twitter
> @EnterpriseDB 
> 
> Visit EnterpriseDB for tutorials, webinars, whitepapers and more