|
From: Yehezkel H. <hor...@ch...> - 2013-10-06 11:53:09
|
First, I want to thank you all for this project. It seems very interesting, involving highly sophisticated technology and answering a real need of the industry.

Second, allow me to suggest that you consider some conventions for your mailing list (for example, curl's Etiquette: http://curl.haxx.se/mail/etiquette.html), as it is quite hard to follow threads in the archive.

My goal: I have an application that needs an SQL DB and must always be up (I have a backup machine for this purpose). I plan to deploy as follows:

Machine A: 1 Datanode, 1 Coordinator, 1 GTM proxy, 1 GTM
Machine B: 1 Datanode, 1 Coordinator, 1 GTM proxy, 1 GTM-slave

Both machines have my application installed, and the clients of my application will connect to the working machine (in the normal case they can connect to either one of them through a simple load-balancer, hence I need multi-master replication).

If I understand correctly, in case of failure of Machine A, I need to promote the GTM-slave to become GTM master and reconnect the GTM proxy - all of this could be done on Machine B. Right?

My questions:

1. In your docs, you always put the GTM on a dedicated machine.
   a. Is this a requirement, just an easy-to-understand topology, or best practice?
   b. If best practice, what is the expected penalty when the GTM is deployed on the same machine as a coordinator and datanode?
   c. In such a deployment, is there a need for a GTM proxy on this machine?
2. What should I do after Machine A is back to life if I want to:
   a. Make it act as a new slave?
   b. Make it become the master again?

I saw this question in the archive (http://sourceforge.net/p/postgres-xc/mailman/message/31302978/), but didn't find any answer:

> I suppose my question is: what do I need to do to make the former masters
> into new slaves? To me it would make sense to be able to fail over node1
> once and then again, and be left with more or less the same configuration
> as in the beginning. It would be okay if there is some magic command I
> can run to reconfigure a former master as the new slave.

Hope I don't ask silly questions, but I couldn't find answers in the docs/archive.

Thanks in advance

Yehezkel Horowitz
Check Point Software Technologies Ltd.
|
From: Michael P. <mic...@gm...> - 2013-10-06 13:40:02
|
On Sun, Oct 6, 2013 at 7:45 PM, Yehezkel Horowitz <hor...@ch...> wrote:

> Second, allow me to suggest that you consider some conventions for your
> mailing list (for example, curl's Etiquette:
> http://curl.haxx.se/mail/etiquette.html), as it is quite hard to follow
> threads in the archive.

This is rather interesting. Thanks for pointing to that!

> My goal: I have an application that needs an SQL DB and must always be up
> (I have a backup machine for this purpose).

Have you thought about PostgreSQL itself for your solution? Is there any reason you'd need XC? Do you have an amount of data that forces you to use a multi-master architecture, or could PG itself perhaps handle it?

> I plan to deploy as follows:
> Machine A: 1 Datanode, 1 Coordinator, 1 GTM proxy, 1 GTM
> Machine B: 1 Datanode, 1 Coordinator, 1 GTM proxy, 1 GTM-slave
>
> Both machines have my application installed, and the clients of my
> application will connect to the working machine (in the normal case they
> can connect to either one of them through a simple load-balancer, hence I
> need multi-master replication).

So all your tables will be replicated.

> If I understand correctly, in case of failure of Machine A, I need to
> promote the GTM-slave to become GTM master and reconnect the GTM proxy -
> all of this could be done on Machine B. Right?

Yep, this is doable. If all your data is replicated you would be able to do that. However, you need to keep in mind that you will not be able to write new data to node B if node A is not accessible. If your data is replicated and you need to update a table, both nodes need to work.

Or, if you want B to still be writable, you could update the node information inside it, make it workable alone, and when server A is up again recreate a new XC node from scratch and add it to the cluster again.

> My questions:
>
> 1. In your docs, you always put the GTM on a dedicated machine.
>    a. Is this a requirement, just an easy-to-understand topology, or best
>       practice?

GTM consumes a certain amount of CPU and does not need much RAM, while for your nodes you might prioritize the opposite.

>    b. If best practice, what is the expected penalty when the GTM is
>       deployed on the same machine as a coordinator and datanode?

CPU resource consumption, and reduced performance if your queries need CPU for things such as internal sort operations, among others.

>    c. In such a deployment, is there a need for a GTM proxy on this
>       machine?

This is actually a good question. GTM proxy is there to reduce the amount of data exchanged between GTM and the nodes. So yes, if you have a lot of concurrent sessions in the whole cluster.

> 2. What should I do after Machine A is back to life if I want to:
>    a. Make it act as a new slave?
>    b. Make it become the master again?

There is no principle of master/slave in XC like in Postgres (well, you could create a slave node for an individual Coordinator/Datanode). But basically, in your configuration, machines A and B have the same state. Only GTM is a slave.

Regards,
--
Michael
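To make Michael's "make B workable alone" suggestion more concrete, here is a rough sketch of what it could look like. All node names, hosts, and the table name below are hypothetical, and the exact DDL syntax may differ between XC versions - treat this as an illustration of the idea, not a recipe:

```shell
# Hypothetical names throughout (t, dn_a, machine-b); syntax is
# approximate for Postgres-XC 1.x.

# A replicated table (CREATE TABLE t (...) DISTRIBUTE BY REPLICATION)
# must accept every write on every node it lives on, which is why
# writes fail on B while A is down.

# To make machine B writable alone after A fails, remove the failed
# node from B's view of the cluster (run against B's coordinator):
psql -h machine-b -c "ALTER TABLE t DELETE NODE (dn_a);"
psql -h machine-b -c "DROP NODE dn_a;"
psql -h machine-b -c "SELECT pgxc_pool_reload();"
```

When server A comes back, the approach Michael describes is to recreate its node from scratch and add it to the cluster again (CREATE NODE, then re-replicate the data), rather than re-attaching its stale state.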
|
From: Yehezkel H. <hor...@ch...> - 2013-10-08 11:29:55
|
>> My goal: I have an application that needs an SQL DB and must always be
>> up (I have a backup machine for this purpose).
>
> Have you thought about PostgreSQL itself for your solution? Is there any
> reason you'd need XC? Do you have an amount of data that forces you to
> use a multi-master architecture, or could PG itself perhaps handle it?

I need multi-master capability, as clients might connect to both machines at the same time. Yes - my tables will be replicated.

> Yep, this is doable. If all your data is replicated you would be able to
> do that. However, you need to keep in mind that you will not be able to
> write new data to node B if node A is not accessible. If your data is
> replicated and you need to update a table, both nodes need to work.

This is a surprise for me; it wasn't clear in the documentation I read, nor in some PG-XC presentations I found on the internet. Isn't this point one of the conditions for high availability of a DB - allowing work to continue even if one of the machines fails?

> Or, if you want B to still be writable, you could update the node
> information inside it, make it workable alone, and when server A is up
> again recreate a new XC node from scratch and add it to the cluster
> again.

What is the correct procedure for doing that? Are there pgxc_ctl commands for doing that?

>> 1. In your docs, you always put the GTM on a dedicated machine.
>>    a. Is this a requirement, just an easy-to-understand topology, or
>>       best practice?
>
> GTM consumes a certain amount of CPU and does not need much RAM, while
> for your nodes you might prioritize the opposite.
>
>>    b. If best practice, what is the expected penalty when the GTM is
>>       deployed on the same machine as a coordinator and datanode?
>
> CPU resource consumption, and reduced performance if your queries need
> CPU for things such as internal sort operations, among others.

O.K., got it. For now I'm trying to make it work; afterwards I'll take care of making it work faster.

>> 2. What should I do after Machine A is back to life if I want to:
>>    a. Make it act as a new slave?
>>    b. Make it become the master again?
>
> There is no principle of master/slave in XC like in Postgres (well, you
> could create a slave node for an individual Coordinator/Datanode).
> But basically, in your configuration, machines A and B have the same
> state. Only GTM is a slave.

Sorry, I meant in the context of GTM - how should I make Machine A a new GTM-slave, or make it the GTM-master again?
|
From: Koichi S. <koi...@gm...> - 2013-10-15 02:58:13
|
Sorry I did not respond for a while. Please take a look at my comments inline.

Regards;
---
Koichi Suzuki

2013/10/8 Yehezkel Horowitz <hor...@ch...>

> I need multi-master capability, as clients might connect to both machines
> at the same time. Yes - my tables will be replicated.
>
> This is a surprise for me; it wasn't clear in the documentation I read,
> nor in some PG-XC presentations I found on the internet.
> Isn't this point one of the conditions for high availability of a DB -
> allowing work to continue even if one of the machines fails?

Postgres-XC assumes any table may be replicated or distributed, so XC does not have an operation interface that assumes all the tables are replicated. It always assumes some tables could be distributed and some replicated. On the other hand, Postgres-XC's most important feature is maintaining cluster-wide data integrity. XC's replication is not only for HA, but also provides scalability by routing as many statements as possible to a local datanode, increasing parallelism. So, when you issue a DML statement against a replicated table, Postgres-XC tries to propagate it to all the nodes the table is defined on. If any node is not available, Postgres-XC determines it cannot maintain cluster-wide data integrity.

We provide a couple of means to deal with this:

1. ALTER TABLE to change the table's replication. You can delete any node. Because this change must go to all the other nodes for cluster-wide data integrity, you need all the datanodes working.

2. Configure slaves for each master. When one of them fails, it can be failed over to its slave. Typically, you can configure the slaves on each other's datanode servers. After failover occurs (you may want to integrate with an automatic failover system such as Pacemaker and Corosync/Heartbeat) and you feel the failed node is no longer needed, you can issue ALTER TABLE to delete the failed node from your cluster, issue DROP NODE as well, and then stop the slave and release its resources.

> What is the correct procedure for doing that? Are there pgxc_ctl
> commands for doing that?

Hope the above helps.

> Sorry, I meant in the context of GTM - how should I make Machine A a new
> GTM-slave, or make it the GTM-master again?

You need to configure gtm_proxy for this purpose. Gtm_ctl provides a failover option to promote the gtm slave to be the new gtm master. It also provides a reconnect option for gtm_proxy to connect to the new gtm master. Pgxc_ctl provides these as corresponding commands. Please take a look at http://postgres-xc.sourceforge.net/docs/1_1/pgxc-ctl.html
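As a rough illustration of the sequence Koichi describes, the gtm_ctl steps might look like the following. The data directories, host name, and port are made up, and the exact flags should be checked against the gtm_ctl and pgxc_ctl documentation linked above:

```shell
# Illustrative paths/hosts only. Run on machine B after A fails.

# 1. Promote B's GTM slave to be the new GTM master.
gtm_ctl promote -Z gtm -D /data/gtm_standby

# 2. Tell B's gtm_proxy to reconnect to the new master.
gtm_ctl reconnect -Z gtm_proxy -D /data/gtm_proxy \
    -o "-s machine-b -t 6666"

# Later, when machine A is back: initialize a GTM there and start it
# as a standby of B, so that A becomes the new GTM slave.
initgtm -Z gtm -D /data/gtm_a
gtm_ctl start -Z gtm_standby -D /data/gtm_a
```

pgxc_ctl wraps the same operations (e.g. its "failover gtm" command), so in a pgxc_ctl-managed cluster you would normally drive this from there.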
|
From: Yehezkel H. <hor...@ch...> - 2013-10-14 07:17:56
|
2nd try. Can you please answer my questions below? TIA

Yehezkel Horowitz

-----Original Message-----
From: Yehezkel Horowitz
Sent: Tuesday, October 08, 2013 2:30 PM
To: 'Michael Paquier'; <pos...@li...>
Subject: RE: [Postgres-xc-general] Some questions about postgres-XC

[quoted in full; identical to the 2013-10-08 message above]