From: Himpich, S. <Ste...@se...> - 2013-07-24 17:16:12
|
Hi there!

My current setup consists of 6 servers:

GTM1
- with gtm master

GTM2
- with gtm standby
- with gtm_proxy

DB01
- with coordinator1 (master)
- with datanode1 (master)

DB02
- with coordinator2 (master)
- with datanode2 (master)

dbback01
- with datanode2_slave

dbback02
- with datanode1_slave

Setup using pgxc_ctl init all works fine.

I simulate a server crash by shutting down:
GTM2
DB02
dbback02

Then (via pgxc_ctl) I remove:
gtm slave
gtm_proxy
datanode2
datanode1_slave

Unfortunately, this does not work if a gtm slave is involved. The GTM still thinks there should be a gtm standby and times out trying to contact it.

1:140702615140096:2013-07-24 19:00:01.841 UTC -LOG: Failed to establish a connection with GTM standby. - 0x1dfd918
LOCATION: gtm_standby_connect_to_standby_int, gtm_standby.c:396
1:140702615140096:2013-07-24 19:00:55.893 UTC -LOG: Failed to establish a connection with GTM standby. - 0x1dfd918
LOCATION: gtm_standby_connect_to_standby_int, gtm_standby.c:396
1:140702615140096:2013-07-24 19:00:58.893 UTC -LOG: Failed to establish a connection with GTM standby. - 0x1dfd918
LOCATION: gtm_standby_connect_to_standby_int, gtm_standby.c:396
1:140702615140096:2013-07-24 19:01:01.893 UTC -LOG: Failed to establish a connection with GTM standby. - 0x1dfd918
LOCATION: gtm_standby_connect_to_standby_int, gtm_standby.c:396

monitor all output when everything was fine:
pgxc_ctl(32409):1307241419_00 PGXC monitor all
pgxc_ctl(32409):1307241419_00 Running: gtm master
pgxc_ctl(32409):1307241419_00 Running: gtm slave
pgxc_ctl(32409):1307241419_00 Running: gtm proxy gtm97-proxy
pgxc_ctl(32409):1307241419_00 Running: coordinator master coorddbms181
pgxc_ctl(32409):1307241419_00 Running: coordinator master coorddbms197
pgxc_ctl(32409):1307241419_00 Running: datanode master datadbms181
pgxc_ctl(32409):1307241419_00 Running: datanode slave datadbms181
pgxc_ctl(32409):1307241419_00 Running: datanode master datadbms197
pgxc_ctl(32409):1307241419_00 Running: datanode slave datadbms197

monitor all output after (intentional) crash of server*2:
pgxc_ctl(32409):1307241436_36 PGXC monitor all
pgxc_ctl(32409):1307241437_35 Running: gtm master
pgxc_ctl(32409):1307241437_38 Not running: gtm slave
pgxc_ctl(32409):1307241437_41 Not running: gtm proxy gtm97-proxy
pgxc_ctl(32409):1307241437_41 Running: coordinator master coorddbms181
pgxc_ctl(32409):1307241437_59 Not running: coordinator master coorddbms197
pgxc_ctl(32409):1307241437_59 Running: datanode master datadbms181
pgxc_ctl(32409):1307241438_17 Not running: datanode slave datadbms181
pgxc_ctl(32409):1307241438_20 Not running: datanode master datadbms197
pgxc_ctl(32409):1307241438_20 Running: datanode slave datadbms197

removal of gtm slave:
pgxc_ctl(32409):1307241500_38 PGXC remove gtm slave
pgxc_ctl(32409):1307241500_41 Removing gtm slave.
pgxc_ctl(32409):1307241500_41 Done.

[removal of remaining "not running" parts, failover of slave datanode, etc]

pgxc_ctl(32409):1307241540_40 PGXC monitor all
pgxc_ctl(32409):1307241540_42 Running: gtm master
pgxc_ctl(32409):1307241540_42 Running: coordinator master coorddbms181
pgxc_ctl(32409):1307241540_42 Running: datanode master datadbms181
pgxc_ctl(32409):1307241540_42 Running: datanode master datadbms197

But - as seen in the log above - the gtm master still thinks it has a slave and tries (forever) to contact it.
Restarting the whole (remaining) cluster doesn't help either.

The same setup without a gtm slave works fine - but I need the slave in case 'server*1' crashes.

Any thoughts on that topic? Is there any logging I could supply?

Regards,
Stefan
|
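
A minimal sketch for picking the failed components out of `monitor all` output such as the listings in the message above. It assumes only the line format visible in those listings (`Running: ...` / `Not running: ...`); the function name `not_running` and the `SAMPLE` string are hypothetical helpers for illustration, not part of pgxc_ctl.

```python
import re

# Matches the status part of a pgxc_ctl "monitor all" line, e.g.
#   pgxc_ctl(32409):1307241437_38 Not running: gtm slave
# The format is assumed from the output quoted in this thread.
STATUS_RE = re.compile(r"\b(Not running|Running):\s+(.+?)\s*$")


def not_running(monitor_output):
    """Return the components reported as 'Not running', in order."""
    down = []
    for line in monitor_output.splitlines():
        match = STATUS_RE.search(line)
        if match and match.group(1) == "Not running":
            down.append(match.group(2))
    return down


# A few lines copied from the post-crash listing in the message.
SAMPLE = """\
pgxc_ctl(32409):1307241437_35 Running: gtm master
pgxc_ctl(32409):1307241437_38 Not running: gtm slave
pgxc_ctl(32409):1307241437_41 Not running: gtm proxy gtm97-proxy
pgxc_ctl(32409):1307241437_41 Running: coordinator master coorddbms181
"""

if __name__ == "__main__":
    for component in not_running(SAMPLE):
        print(component)
```

The resulting list is what would then be removed or failed over by hand in pgxc_ctl; the sketch only reads the output and changes nothing in the cluster.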
From: 鈴木 幸市 <ko...@in...> - 2013-07-26 01:11:28
|
Please configure gtm_proxy at DB01, DB02, dbback01 and dbback02. The reasons are:

1. gtm_proxy groups up traffic between GTM and coordinator/datanode backends. Placing gtm_proxy at GTM2 will not benefit this. Pgxc_ctl looks for a gtm_proxy local to coordinator/datanode. Your configuration will make gtm_proxy available.

2. When you fail over a datanode in pgxc_ctl, the slave looks for a local gtm_proxy; otherwise the slave is connected directly to gtm. You need to configure a gtm_proxy local to dbback01 and dbback02.

Hope this helps.

Regards,
---
Koichi Suzuki

_______________________________________________
Postgres-xc-developers mailing list
Pos...@li...
https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
|
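
The placement rule in the reply (every host that runs a coordinator or datanode should have its own gtm_proxy) can be checked against a planned layout with a small sketch like the one below. The `LAYOUT` dictionary restates the six-server setup from the original question; the function name and the component labels are hypothetical and do not correspond to any pgxc_ctl API or configuration syntax.

```python
# Host -> components, restating the layout described in the question.
LAYOUT = {
    "GTM1": ["gtm master"],
    "GTM2": ["gtm standby", "gtm_proxy"],
    "DB01": ["coordinator1", "datanode1"],
    "DB02": ["coordinator2", "datanode2"],
    "dbback01": ["datanode2_slave"],
    "dbback02": ["datanode1_slave"],
}


def hosts_missing_local_proxy(layout):
    """List hosts that run a coordinator or datanode but no gtm_proxy."""
    missing = []
    for host, components in layout.items():
        runs_node = any(
            name.startswith(("coordinator", "datanode")) for name in components
        )
        if runs_node and "gtm_proxy" not in components:
            missing.append(host)
    return missing


if __name__ == "__main__":
    print(hosts_missing_local_proxy(LAYOUT))
```

For the layout in the question this reports DB01, DB02, dbback01 and dbback02, which are the four hosts named in the reply.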