Re: [Postgres-xc-developers] Add a new node to the cluster [Feature ID 3608379]

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

On 31 March 2013 14:07, Abbas Butt <abb...@en...> wrote:

> Hi,
> Attached please find the revised patch for restore mode. This patch has to
> be applied on top of the patches I sent earlier for
>  3608377,
>  3608376 &
>  3608375.
>
> I have also attached some scripts and a C file useful for testing the
> whole procedure. It is a database that has many objects in it.
>
> Here are the revised instructions for adding new nodes to the cluster.
>
> ======================================
>
> Here are the steps to add a new coordinator
>
>  1) Initdb new coordinator
>   /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3 --nodename
> coord_3
>
>  2) Make necessary changes in its postgresql.conf, in particular specify
> new coordinator name and pooler port
>
>  3) Connect to any of the existing coordinators & lock the cluster for
> backup, do not close this session
>     ./psql postgres -p 5432
>     select pgxc_lock_for_backup();
>
>  4) Connect to any of the existing coordinators and take backup of the
> database
>     ./pg_dumpall -p 5432 -s --include-nodes --dump-nodes
> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
>
>  5) Start the new coordinator specify --restoremode while starting the
> coordinator
>     ./postgres --restoremode -D ../data_cord3 -p 5455
>
>  6) Create the new database on the new coordinator - optional
>     ./createdb test -p 5455
>
>  7) Restore the backup that was taken from an existing coordinator by
> connecting to the new coordinator directly
>     ./psql -d test -f
> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
> -p 5455
>
>  8) Quit the new coordinator
>
>  9) Start the new coordinator as a by specifying --coordinator
>     ./postgres --coordinator -D ../data_cord3 -p 5455
>
> 10) Create the new coordinator on rest of the coordinators and reload
> configuration
>      CREATE NODE COORD_3 WITH (HOST = 'localhost', type = 'coordinator',
> PORT = 5455);
>     SELECT pgxc_pool_reload();
>
> 11) Quit the session of step 3, this will unlock the cluster
>
> 12) The new coordinator is now ready
>     ./psql test -p 5455
>     create table test_new_coord(a int, b int);
>     \q
>     ./psql test -p 5432
>     select * from test_new_coord;
>
> *======================================*
> *======================================*
>
> Here are the steps to add a new datanode
>
>
>  1) Initdb new datanode
>     /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3 --nodename
> data_node_3
>
>  2) Make necessary changes in its postgresql.conf, in particular specify
> new datanode name
>
>  3) Connect to any of the existing coordinators & lock the cluster for
> backup, do not close this session
>     ./psql postgres -p 5432
>     select pgxc_lock_for_backup();
>
>  4) Connect to any of the existing datanodes and take backup of the
> database
>     ./pg_dumpall -p 15432 -s --include-nodes
> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
>

Why do we need --include-nodes on datanode ?

----

+     *    The dump taken from a datanode does NOT contain any DISTRIBUTE BY
+     *    clause. This fact is used here to make sure that when the
+     *    DISTRIBUTE BY clause is missing in the statemnet the system
+     *    should not try to find out the node list itself.
+     */
+    if ((IS_PGXC_COORDINATOR || (isRestoreMode && stmt->distributeby !=
NULL))
+        && relkind == RELKIND_RELATION)

How do we enforce not having DISTRIBUTE BY clause in the pg_dump output if
it's a datanode ? Also, can we just error out in restore mode if the
DISTRIBUTE BY clause is present ?

-----

>  5) Start the new datanode specify --restoremode while starting the it
>     ./postgres --restoremode -D ../data3 -p 35432
>

It seems you have disabled use of GTM in restore mode. For e.g. in
GetNewTransactionId(), we get a global tansaction id only if it's a
coordinator or if IsPGXCNodeXactDatanodeDirect() is true. But
IsPGXCNodeXactDatanodeDirect() will now return false in restore mode. Is
there any specific reason for disabling use of GTM in restore mode ? I
don't see any harm in using GTM. In fact, it is better to start using
global xids as soon as possible.

>
>  6) Restore the backup that was taken from an existing datanode by
> connecting to the new datanode directly
>      ./psql -d postgres -f
> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
> -p 35432
>
>  7) Quit the new datanode
>
>  8) Start the new datanode as a datanode by specifying --datanode
>     ./postgres --datanode -D ../data3 -p 35432
>
>  9) Create the new datanode on all the coordinators and reload
> configuration
>     CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type = 'datanode',
> PORT = 35432);
>     SELECT pgxc_pool_reload();
>
> 10) Quit the session of step 3, this will unlock the cluster
>
> 11) Redistribute data by using ALTER TABLE REDISTRIBUTE
>
> 12) The new daatnode is now ready
>     ./psql test
>     create table test_new_dn(a int, b int) distribute by replication;
>     insert into test_new_dn values(1,2);
>     EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>     EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>     EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>
> ======================================
>
> On Wed, Mar 27, 2013 at 5:02 PM, Abbas Butt <abb...@en...>wrote:
>
>> Feature ID 3608379
>>
>> On Fri, Mar 1, 2013 at 5:48 PM, Amit Khandekar <
>> ami...@en...> wrote:
>>
>>> On 1 March 2013 01:30, Abbas Butt <abb...@en...> wrote:
>>> >
>>> >
>>> > On Thu, Feb 28, 2013 at 12:44 PM, Amit Khandekar
>>> > <ami...@en...> wrote:
>>> >>
>>> >>
>>> >>
>>> >> On 28 February 2013 10:23, Abbas Butt <abb...@en...>
>>> wrote:
>>> >>>
>>> >>> Hi All,
>>> >>>
>>> >>> Attached please find a patch that provides a new command line
>>> argument
>>> >>> for postgres called --restoremode.
>>> >>>
>>> >>> While adding a new node to the cluster we need to restore the schema
>>> of
>>> >>> existing database to the new node.
>>> >>> If the new node is a datanode and we connect directly to it, it does
>>> not
>>> >>> allow DDL, because it is in read only mode &
>>> >>> If the new node is a coordinator, it will send DDLs to all the other
>>> >>> coordinators which we do not want it to do.
>>> >>
>>> >>
>>> >> What if we allow writes in standalone mode, so that we would
>>> initialize
>>> >> the new node using standalone mode instead of --restoremode ?
>>> >
>>> >
>>> > Please take a look at the patch, I am using --restoremode in place of
>>> > --coordinator & --datanode. I am not sure how would stand alone mode
>>> fit in
>>> > here.
>>>
>>> I was trying to see if we can avoid adding a new mode, instead, use
>>> standalone mode for all the purposes for which restoremode is used.
>>> Actually I checked the documentation, it says this mode is used only
>>> for debugging or recovery purposes, so now I myself am a bit hesitent
>>> about this mode for the purpose of restoring.
>>>
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>> To provide ability to restore on the new node a new command line
>>> argument
>>> >>> is provided.
>>> >>> It is to be provided in place of --coordinator OR --datanode.
>>> >>> In restore mode both coordinator and datanode are internally treated
>>> as a
>>> >>> datanode.
>>> >>> For more details see patch comments.
>>> >>>
>>> >>> After this patch one can add a new node to the cluster.
>>> >>>
>>> >>> Here are the steps to add a new coordinator
>>> >>>
>>> >>>
>>> >>>  1) Initdb new coordinator
>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3
>>> >>> --nodename coord_3
>>> >>>
>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>> specify
>>> >>> new coordinator name and pooler port
>>> >>>
>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>> for
>>> >>> backup
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=yes;
>>> >>>         \q
>>> >>
>>> >>
>>> >> I haven't given a thought on the earlier patch you sent for cluster
>>> lock
>>> >> implementation; may be we can discuss this on that thread, but just a
>>> quick
>>> >> question:
>>> >>
>>> >> Does the cluster-lock command wait for the ongoing DDL commands to
>>> finish
>>> >> ? If not, we have problems. The subsequent pg_dump would not contain
>>> objects
>>> >> created by these particular DDLs.
>>> >
>>> >
>>> > Suppose you have a two coordinator cluster. Assume one client
>>> connected to
>>> > each. Suppose one client issues a lock cluster command and the other
>>> issues
>>> > a DDL. Is this what you mean by an ongoing DDL? If true then answer to
>>> your
>>> > question is Yes.
>>> >
>>> > Suppose you have a prepared transaction that has a DDL in it, again if
>>> this
>>> > can be considered an on going DDL, then again answer to your question
>>> is
>>> > Yes.
>>> >
>>> > Suppose you have a two coordinator cluster. Assume one client
>>> connected to
>>> > each. One client starts a transaction and issues a DDL, the second
>>> client
>>> > issues a lock cluster command, the first commits the transaction. If
>>> this is
>>> > an ongoing DDL, then the answer to your question is No. But its a
>>> matter of
>>> > deciding which camp are we going to put COMMIT in, the allow camp, or
>>> the
>>> > deny camp. I decided to put it in allow camp, because I have not yet
>>> written
>>> > any code to detect whether a transaction being committed has a DDL in
>>> it or
>>> > not, and stopping all transactions from committing looks too
>>> restrictive to
>>> > me.
>>> >
>>> > Do you have some other meaning of an ongoing DDL?
>>> >
>>> > I agree that we should have discussed this on the right thread. Lets
>>> > continue this discussion on that thread.
>>>
>>> Continued on the other thread.
>>>
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>>
>>> >>>  4) Connect to any of the existing coordinators and take backup of
>>> the
>>> >>> database
>>> >>>         ./pg_dump -p 5432 -C -s
>>> >>>
>>> --file=/home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql test
>>> >>>
>>> >>>  5) Start the new coordinator specify --restoremode while starting
>>> the
>>> >>> coordinator
>>> >>>         ./postgres --restoremode -D ../data_cord3 -p 5455
>>> >>>
>>> >>>  6) connect to the new coordinator directly
>>> >>>         ./psql postgres -p 5455
>>> >>>
>>> >>>  7) create all the datanodes and the rest of the coordinators on the
>>> new
>>> >>> coordiantor & reload configuration
>>> >>>         CREATE NODE DATA_NODE_1 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 15432, PRIMARY);
>>> >>>         CREATE NODE DATA_NODE_2 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 25432);
>>> >>>
>>> >>>         CREATE NODE COORD_1 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5432);
>>> >>>         CREATE NODE COORD_2 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5433);
>>> >>>
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>>  8) quit psql
>>> >>>
>>> >>>  9) Create the new database on the new coordinator
>>> >>>         ./createdb test -p 5455
>>> >>>
>>> >>> 10) create the roles and table spaces manually, the dump does not
>>> contain
>>> >>> roles or table spaces
>>> >>>         ./psql test -p 5455
>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>> >>>         CREATE TABLESPACE my_space LOCATION
>>> >>> '/usr/local/pgsql/my_space_location';
>>> >>>         \q
>>> >>>
>>> >>
>>> >> Will pg_dumpall help ? It dumps roles also.
>>> >
>>> >
>>> > Yah , but I am giving example of pg_dump so this step has to be there.
>>> >
>>> >>
>>> >>
>>> >>
>>> >>>
>>> >>> 11) Restore the backup that was taken from an existing coordinator by
>>> >>> connecting to the new coordinator directly
>>> >>>          ./psql -d test -f
>>> >>> /home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql -p
>>> 5455
>>> >>>
>>> >>> 11) Quit the new coordinator
>>> >>>
>>> >>> 12) Connect to any of the existing coordinators & unlock the cluster
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=no;
>>> >>>         \q
>>> >>>
>>> >>
>>> >> Unlocking the cluster has to be done *after* the node is added into
>>> the
>>> >> cluster.
>>> >
>>> >
>>> > Very true. I stand corrected. This means CREATE NODE has to be allowed
>>> when
>>> > xc_lock_for_backup is set.
>>> >
>>> >>
>>> >>
>>> >>
>>> >>>
>>> >>> 13) Start the new coordinator as a by specifying --coordinator
>>> >>>         ./postgres --coordinator -D ../data_cord3 -p 5455
>>> >>>
>>> >>> 14) Create the new coordinator on rest of the coordinators and reload
>>> >>> configuration
>>> >>>         CREATE NODE COORD_3 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5455);
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>> 15) The new coordinator is now ready
>>> >>>         ./psql test -p 5455
>>> >>>         create table test_new_coord(a int, b int);
>>> >>>         \q
>>> >>>         ./psql test -p 5432
>>> >>>         select * from test_new_coord;
>>> >>>
>>> >>>
>>> >>> Here are the steps to add a new datanode
>>> >>>
>>> >>>
>>> >>>  1) Initdb new datanode
>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3
>>> --nodename
>>> >>> data_node_3
>>> >>>
>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>> specify
>>> >>> new datanode name
>>> >>>
>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>> for
>>> >>> backup
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=yes;
>>> >>>         \q
>>> >>>
>>> >>>  4) Connect to any of the existing datanodes and take backup of the
>>> >>> database
>>> >>>         ./pg_dump -p 15432 -C -s
>>> >>> --file=/home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql
>>> test
>>> >>>
>>> >>>  5) Start the new datanode specify --restoremode while starting the
>>> it
>>> >>>         ./postgres --restoremode -D ../data3 -p 35432
>>> >>>
>>> >>>  6) Create the new database on the new datanode
>>> >>>         ./createdb test -p 35432
>>> >>>
>>> >>>  7) create the roles and table spaces manually, the dump does not
>>> contain
>>> >>> roles or table spaces
>>> >>>         ./psql test -p 35432
>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>> >>>         CREATE TABLESPACE my_space LOCATION
>>> >>> '/usr/local/pgsql/my_space_location';
>>> >>>         \q
>>> >>>
>>> >>>  8) Restore the backup that was taken from an existing datanode by
>>> >>> connecting to the new datanode directly
>>> >>>          ./psql -d test -f
>>> >>> /home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql -p 35432
>>> >>>
>>> >>>  9) Quit the new datanode
>>> >>>
>>> >>> 10) Connect to any of the existing coordinators & unlock the cluster
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=no;
>>> >>>         \q
>>> >>>
>>> >>> 11) Start the new datanode as a datanode by specifying --datanode
>>> >>>         ./postgres --datanode -D ../data3 -p 35432
>>> >>>
>>> >>> 12) Create the new datanode on all the coordinators and reload
>>> >>> configuration
>>> >>>         CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 35432);
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>> 13) Redistribute data by using ALTER TABLE REDISTRIBUTE
>>> >>>
>>> >>> 14) The new daatnode is now ready
>>> >>>         ./psql test
>>> >>>         create table test_new_dn(a int, b int) distribute by
>>> replication;
>>> >>>         insert into test_new_dn values(1,2);
>>> >>>         EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>>> >>>         EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>>> >>>         EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>>> >>>
>>> >>> Please note that the steps assume that the patch sent earlier
>>> >>> 1_lock_cluster.patch in mail subject [Patch to lock cluster] is
>>> applied.
>>> >>>
>>> >>> I have also attached test database scripts, that would help in patch
>>> >>> review.
>>> >>>
>>> >>> Comments are welcome.
>>> >>>
>>> >>> --
>>> >>> Abbas
>>> >>> Architect
>>> >>> EnterpriseDB Corporation
>>> >>> The Enterprise PostgreSQL Company
>>> >>>
>>> >>> Phone: 92-334-5100153
>>> >>>
>>> >>> Website: www.enterprisedb.com
>>> >>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>> >>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>> >>>
>>> >>> This e-mail message (and any attachment) is intended for the use of
>>> >>> the individual or entity to whom it is addressed. This message
>>> >>> contains information from EnterpriseDB Corporation that may be
>>> >>> privileged, confidential, or exempt from disclosure under applicable
>>> >>> law. If you are not the intended recipient or authorized to receive
>>> >>> this for the intended recipient, any use, dissemination,
>>> distribution,
>>> >>> retention, archiving, or copying of this communication is strictly
>>> >>> prohibited. If you have received this e-mail in error, please notify
>>> >>> the sender immediately by reply e-mail and delete this message.
>>> >>>
>>> >>>
>>> ------------------------------------------------------------------------------
>>> >>> Everyone hates slow websites. So do we.
>>> >>> Make your web apps faster with AppDynamics
>>> >>> Download AppDynamics Lite for free today:
>>> >>> http://p.sf.net/sfu/appdyn_d2d_feb
>>> >>> _______________________________________________
>>> >>> Postgres-xc-developers mailing list
>>> >>> Pos...@li...
>>> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>> >>>
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > --
>>> > Abbas
>>> > Architect
>>> > EnterpriseDB Corporation
>>> > The Enterprise PostgreSQL Company
>>> >
>>> > Phone: 92-334-5100153
>>> >
>>> > Website: www.enterprisedb.com
>>> > EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>> > Follow us on Twitter: http://www.twitter.com/enterprisedb
>>> >
>>> > This e-mail message (and any attachment) is intended for the use of
>>> > the individual or entity to whom it is addressed. This message
>>> > contains information from EnterpriseDB Corporation that may be
>>> > privileged, confidential, or exempt from disclosure under applicable
>>> > law. If you are not the intended recipient or authorized to receive
>>> > this for the intended recipient, any use, dissemination, distribution,
>>> > retention, archiving, or copying of this communication is strictly
>>> > prohibited. If you have received this e-mail in error, please notify
>>> > the sender immediately by reply e-mail and delete this message.
>>>
>>
>>
>>
>> --
>> --
>> Abbas
>> Architect
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>>
>> Phone: 92-334-5100153
>>
>> Website: www.enterprisedb.com
>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>
>> This e-mail message (and any attachment) is intended for the use of
>> the individual or entity to whom it is addressed. This message
>> contains information from EnterpriseDB Corporation that may be
>> privileged, confidential, or exempt from disclosure under applicable
>> law. If you are not the intended recipient or authorized to receive
>> this for the intended recipient, any use, dissemination, distribution,
>> retention, archiving, or copying of this communication is strictly
>> prohibited. If you have received this e-mail in error, please notify
>> the sender immediately by reply e-mail and delete this message.
>
>
>
>
> --
> --
> Abbas
> Architect
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
> Phone: 92-334-5100153
>
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
>
> This e-mail message (and any attachment) is intended for the use of
> the individual or entity to whom it is addressed. This message
> contains information from EnterpriseDB Corporation that may be
> privileged, confidential, or exempt from disclosure under applicable
> law. If you are not the intended recipient or authorized to receive
> this for the intended recipient, any use, dissemination, distribution,
> retention, archiving, or copying of this communication is strictly
> prohibited. If you have received this e-mail in error, please notify
> the sender immediately by reply e-mail and delete this message.
>