
High_Availability

Koichi Suzuki

DRAFT

Postgres-XC and High-Availability

To configure the Postgres-XC high availability feature, you should configure each component separately, namely gtm, gtm_proxy, coordinator and datanode. Each provides its own backup and failover mechanism. Before digging into these features in detail, let's see what we should do when a specific XC component crashes.

This page assumes you are familiar with Postgres-XC configuration. You may want to visit the [Configuration] and [Real_Server_Configuration] pages.

HA configuration for each component

The following gives an overview of how to configure the HA feature of each component. Integrating them into Pacemaker/Heartbeat is being done by another team; that information will be provided elsewhere.

GTM crash

Because GTM is the central component that provides key MVCC (Multi-Version Concurrency Control) and sequence information to all the other components, we must keep a backup that maintains the current transaction and sequence status, and fail over to it when GTM crashes. This is called GTM-Standby and is implemented as a part of GTM.
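
As a minimal sketch of what a GTM-Standby setup can look like (the host names, port, and data directories below are assumptions, not values from this page), the standby is a second gtm instance pointed at the active one and promoted when the active GTM crashes:

    # gtm.conf on the standby server (illustrative values)
    nodename = 'gtm_standby'
    listen_addresses = '*'
    port = 6666
    startup = STANDBY            # run as a standby, not as the active GTM
    active_host = 'gtm_master'   # host of the active GTM
    active_port = 6666

    # start the standby; promote it when the active GTM crashes
    gtm_ctl start -Z gtm -D /var/lib/pgxc/gtm_standby
    gtm_ctl promote -Z gtm -D /var/lib/pgxc/gtm_standby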

When GTM fails over to the standby, its connection information (host and port) may change. Components that connect directly to GTM may have to be informed of this change.

Visit [GTM_Standby_Configuration] for more details.

GTM Proxy

GTM proxy is just a proxy that groups communication between GTM and Coordinators/Datanodes for performance. It does not maintain any persistent data, so you can simply restart a GTM proxy when it crashes. When GTM itself crashes, GTM proxy can accept a command to reconnect to the new GTM without any transaction loss. This is another reason why you should use GTM Proxy.
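
As a hedged sketch (the data directory, host, and port are assumptions), restarting a crashed proxy is an ordinary start, and reconnection after a GTM failover is a gtm_ctl command issued against the running proxy:

    # GTM proxy keeps no persistent data, so a crash is handled by restarting it
    gtm_ctl start -Z gtm_proxy -D /var/lib/pgxc/gtm_proxy

    # after GTM fails over, point the running proxy at the new GTM
    gtm_ctl reconnect -Z gtm_proxy -D /var/lib/pgxc/gtm_proxy -o "-s gtm_standby -t 6666"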

Visit [GTM-Proxy_HA_configuration] for more details.

Coordinator

A coordinator stores only the catalog data needed to handle SQL statements; it does not store table rows. Therefore, you don't need a specific coordinator backup, but applications should not connect to the crashed coordinator. Because DDL is propagated to every coordinator, including the crashed one, you must drop the crashed coordinator from the cluster configuration. You can restore it later by copying another coordinator's database offline.

Of course, you can also use a shared disk system and fail the coordinator over to another server using the same data. Another option is a synchronous replication slave.

When a coordinator fails, you should tell the other coordinators to remove it from your Postgres-XC cluster with the "DROP NODE" statement, as sketched below. Also, to prevent the failed node from accepting any incoming transactions, be sure to kill it.
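
For illustration (the node name coord2 and host coord1 are assumptions), the cleanup run on each surviving coordinator might look like the following; pgxc_pool_reload() refreshes the connection pooler so sessions pick up the new node list:

    # run against every remaining coordinator
    psql -h coord1 -p 5432 -d postgres -c "DROP NODE coord2"
    psql -h coord1 -p 5432 -d postgres -c "SELECT pgxc_pool_reload()"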

Datanode

A datanode stores the actual rows of tables. Note that transactions which do not need data on the crashed datanode can continue to run without error. However, distributed tables have no replicas of their rows, so you need a backup of each datanode to handle a datanode crash. A synchronous replication slave can be used for this purpose. Of course, you can use a traditional shared-disk configuration instead.
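
As a sketch of the synchronous replication approach (directories, host, port, and application_name are assumptions), a datanode slave is set up with ordinary PostgreSQL streaming replication and promoted when the master fails:

    # postgresql.conf on the datanode master
    wal_level = hot_standby
    max_wal_senders = 2
    synchronous_standby_names = 'dn1_slave'

    # recovery.conf on the slave
    standby_mode = 'on'
    primary_conninfo = 'host=datanode1 port=15432 application_name=dn1_slave'

    # when the master fails, promote the slave
    pg_ctl promote -D /var/lib/pgxc/dn1_slave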

Visit [Datanode_HA_configuration] for more details.


Related

Postgres-XC: Configuration
Postgres-XC: Datanode_HA_configuration
Postgres-XC: Developers_Page
Postgres-XC: GTM-Proxy_HA_configuration
Postgres-XC: GTM_Standby_Configuration
Postgres-XC: Real_Server_Configuration
