From: 鈴木 幸市 <ko...@in...> - 2013-07-26 01:11:28
Please configure gtm_proxy at DB01, DB02, dbback01 and dbback02. The reasons are:

1. gtm_proxy groups traffic between GTM and the coordinator/datanode backends. Placing gtm_proxy at GTM2 does not help with this. Pgxc_ctl looks for a gtm_proxy local to each coordinator/datanode, and your configuration does not make one available to them.

2. When you failover a datanode in pgxc_ctl, the promoted slave will look for a local gtm_proxy; otherwise the slave connects directly to GTM. So you need a gtm_proxy local to dbback01 and dbback02 as well.
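As a rough sketch, the gtm_proxy part of pgxc_ctl.conf could then look like the following. The hostnames are from your setup; the proxy names, port 20001 and the directories are placeholders you will want to adapt:

    gtmProxy=y                              # deploy gtm_proxy on the node hosts
    gtmProxyNames=(gtm_pxy1 gtm_pxy2 gtm_pxy3 gtm_pxy4)
    gtmProxyServers=(DB01 DB02 dbback01 dbback02)
    gtmProxyPorts=(20001 20001 20001 20001)
    gtmProxyDirs=($HOME/pgxc/gtm_pxy $HOME/pgxc/gtm_pxy $HOME/pgxc/gtm_pxy $HOME/pgxc/gtm_pxy)
    gtmPxyExtraConfig=none                  # no gtm_proxy.conf lines shared by all proxies
    gtmPxySpecificExtraConfig=(none none none none)

With a proxy on each of the four node hosts, both the running masters and any promoted slaves will find a local gtm_proxy.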
Hope this helps.

Regards;
---
Koichi Suzuki

On 2013/07/25, at 2:15, "Himpich, Stefan" <Ste...@se...> wrote:

> Hi there!
>
> My current setup consists of 6 servers:
>
> GTM1
> - with gtm master
>
> GTM2
> - with gtm standby
> - with gtm_proxy
>
> DB01
> - with coordinator1 (master)
> - with datanode1 (master)
>
> DB02
> - with coordinator2 (master)
> - with datanode2 (master)
>
> dbback01
> - with datanode2_slave
>
> dbback02
> - with datanode1_slave
>
> Setup using pgxc_ctl init all works fine.
>
> I simulate a server crash by shutting down:
> GTM2
> DB02
> dbback02
>
> Then (via pgxc_ctl) I remove:
> gtm slave
> gtm_proxy
> datanode2
> datanode1_slave
>
> Unfortunately, this does not work if a gtm slave is involved. The GTM still thinks there should be a gtm standby and times out trying to contact it.
>
> 1:140702615140096:2013-07-24 19:00:01.841 UTC -LOG: Failed to establish a connection with GTM standby. - 0x1dfd918
> LOCATION: gtm_standby_connect_to_standby_int, gtm_standby.c:396
> 1:140702615140096:2013-07-24 19:00:55.893 UTC -LOG: Failed to establish a connection with GTM standby. - 0x1dfd918
> LOCATION: gtm_standby_connect_to_standby_int, gtm_standby.c:396
> 1:140702615140096:2013-07-24 19:00:58.893 UTC -LOG: Failed to establish a connection with GTM standby. - 0x1dfd918
> LOCATION: gtm_standby_connect_to_standby_int, gtm_standby.c:396
> 1:140702615140096:2013-07-24 19:01:01.893 UTC -LOG: Failed to establish a connection with GTM standby. - 0x1dfd918
> LOCATION: gtm_standby_connect_to_standby_int, gtm_standby.c:396
>
> monitor all output when everything was fine:
> pgxc_ctl(32409):1307241419_00 PGXC monitor all
> pgxc_ctl(32409):1307241419_00 Running: gtm master
> pgxc_ctl(32409):1307241419_00 Running: gtm slave
> pgxc_ctl(32409):1307241419_00 Running: gtm proxy gtm97-proxy
> pgxc_ctl(32409):1307241419_00 Running: coordinator master coorddbms181
> pgxc_ctl(32409):1307241419_00 Running: coordinator master coorddbms197
> pgxc_ctl(32409):1307241419_00 Running: datanode master datadbms181
> pgxc_ctl(32409):1307241419_00 Running: datanode slave datadbms181
> pgxc_ctl(32409):1307241419_00 Running: datanode master datadbms197
> pgxc_ctl(32409):1307241419_00 Running: datanode slave datadbms197
>
> monitor all output after (intentional) crash of server*2:
> pgxc_ctl(32409):1307241436_36 PGXC monitor all
> pgxc_ctl(32409):1307241437_35 Running: gtm master
> pgxc_ctl(32409):1307241437_38 Not running: gtm slave
> pgxc_ctl(32409):1307241437_41 Not running: gtm proxy gtm97-proxy
> pgxc_ctl(32409):1307241437_41 Running: coordinator master coorddbms181
> pgxc_ctl(32409):1307241437_59 Not running: coordinator master coorddbms197
> pgxc_ctl(32409):1307241437_59 Running: datanode master datadbms181
> pgxc_ctl(32409):1307241438_17 Not running: datanode slave datadbms181
> pgxc_ctl(32409):1307241438_20 Not running: datanode master datadbms197
> pgxc_ctl(32409):1307241438_20 Running: datanode slave datadbms197
>
> removal of gtm slave:
> pgxc_ctl(32409):1307241500_38 PGXC remove gtm slave
> pgxc_ctl(32409):1307241500_41 Removing gtm slave.
> pgxc_ctl(32409):1307241500_41 Done.
>
> [removal of remaining "not running" parts, failover of slave datanode, etc]
>
> pgxc_ctl(32409):1307241540_40 PGXC monitor all
> pgxc_ctl(32409):1307241540_42 Running: gtm master
> pgxc_ctl(32409):1307241540_42 Running: coordinator master coorddbms181
> pgxc_ctl(32409):1307241540_42 Running: datanode master datadbms181
> pgxc_ctl(32409):1307241540_42 Running: datanode master datadbms197
>
> But - as seen above - the gtm master still thinks it has a slave and tries (forever) to contact it.
> Restarting the whole (remaining) cluster doesn't help either.
>
> The same setup without a gtm slave works fine - but I need it in case 'server*1' crashes.
>
> Any thoughts on that topic, any logging I might supply?
>
> Regards,
> Stefan
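For reference, the elided recovery steps quoted above would map to roughly this sequence at the pgxc_ctl prompt. This is a sketch only: the node names come from the monitor output, the ordering (failover first, then removals) is an assumption, and if I remember the reference correctly, remove also accepts an optional clean keyword to delete the removed node's work directory:

    PGXC monitor all
    PGXC failover datanode datadbms197
    PGXC remove datanode slave datadbms181
    PGXC remove coordinator master coorddbms197
    PGXC remove gtm_proxy gtm97-proxy
    PGXC remove gtm slave
    PGXC monitor all

Here failover datanode datadbms197 promotes the surviving slave on dbback01 to master, and the remove commands drop the components whose hosts went down.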