From: Mason S. <ms...@tr...> - 2014-02-07 15:55:13
On Fri, Feb 7, 2014 at 2:39 AM, ZhangJulian <jul...@ou...> wrote:

> Hi Hisada,
>
> It is great to know the resource agent will be released within a few
> months. Thank you for your work, and I am glad to be one of your first
> batch of users.
>
> About the feature of PGXC internal HA: I think it is an attractive
> feature from a user's perspective; you had mentioned it has some
> advantages, such as no need to install other external tools. I just
> read the Admin Guide of Greenplum, and it seems that GP has internal
> HA support via a process named *ftsprobe*.
>
> I was thinking each Coordinator could fork one more process at startup,
> alongside the autovacuum/bgwriter processes, and the new process would
> do all the work that Pacemaker does.
>
> When the GTM is down, each Coordinator will recognize it when it
> fetches snapshots from the GTM; it can then talk with the other
> Coordinators and negotiate to restart the GTM master or promote the
> GTM slave to master. But I am not sure how to send the RESTART GTM or
> PROMOTE GTM SLAVE command from a Coordinator process. Maybe the
> PROMOTE command could be replaced by an API invocation to the GTM
> component.
>
> When one coordinator is down, the other coordinators will find the
> failed coordinator when they execute a DDL (or each coordinator could
> send SELECT 1+1 to the other coordinators periodically to verify that
> they are all alive); the surviving coordinators can then decide to
> remove the failed coordinator from pgxc_node.
>
> When one datanode is down, a coordinator will know it when it sends a
> REMOTE QUERY to the datanode, or it can also send SELECT 1+1 to each
> datanode periodically. Then all the coordinators will negotiate to
> promote the datanode slave to master.
>
> But maybe it is not a better solution if Pacemaker is easier to use?
> For example, we could develop a PGXC-Pacemaker glue layer which could
> fetch all the cluster configuration from PGXC and then configure
> Pacemaker automatically....

These are all good thoughts, and somewhat along the lines of what I have been thinking as well. We have been using Corosync/Pacemaker for quite some time. It works, but in hindsight I wish we had put the effort into an internal solution. While the current solution works, we have spent a lot of time tweaking and maintaining it. In the past we have seen unnecessarily aggressive failovers, for example. It also takes some resources, and it does not like to manage too many components at once. In our case, we like to have two replicas of each data node on the other servers that hold masters.

Making node membership more flexible and getting the components to agree on when to fail over is likely the better long-term solution. There would be more upfront effort, but easier installation and less management and maintenance long term. Let me know if you have the time to collaborate on such a development effort if we undertake it at some point.

Our other product, TED (unrelated to Postgres-XC), manages failover internally and works well, including automatic recovery of downed nodes. We can perhaps draw on lessons there, too.

--
Mason Sharp
TransLattice - http://www.translattice.com
Distributed and Clustered Database Solutions
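[Editor's note] The probe-and-evict scheme discussed in the thread (each coordinator periodically sends SELECT 1+1 to its peers and, on repeated failure, removes the dead peer from pgxc_node) can be sketched roughly as below. The probe is injected as a callable so the failure-counting logic can be shown without a live cluster; the `Peer`/`probe_peers` names and the three-strikes threshold are illustrative assumptions, not part of any proposed Postgres-XC design.

```python
# Illustrative sketch only: one probe round of the peer health check
# proposed in the thread. A real implementation would issue
# "SELECT 1+1" over libpq and evict via DROP NODE; here the probe
# is a plain callable so the logic is self-contained.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Peer:
    name: str
    failures: int = 0  # consecutive missed probes

def probe_peers(peers: List[Peer],
                probe: Callable[[str], bool],
                max_failures: int = 3) -> List[str]:
    """Run one probe round; return names of peers that crossed the
    failure threshold and should be evicted from pgxc_node."""
    evicted = []
    for peer in peers:
        if probe(peer.name):
            peer.failures = 0        # healthy: reset the counter
        else:
            peer.failures += 1       # unhealthy: count the miss
            if peer.failures >= max_failures:
                evicted.append(peer.name)
    return evicted
```

Requiring several consecutive misses before eviction is one simple way to avoid the unnecessarily aggressive failovers mentioned later in the thread; a real design would also need the surviving coordinators to agree on the verdict before acting.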
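[Editor's note] The "glue layer" idea quoted above (read the cluster topology from PGXC, then configure Pacemaker automatically) might look something like the sketch below: take rows as they might come from `SELECT node_name, node_type, node_host, node_port FROM pgxc_node` and emit `pcs` commands. The command templates use the real `ocf:heartbeat:pgsql` resource agent, but the exact options are illustrative, not a tested Pacemaker configuration.

```python
# Illustrative sketch of a PGXC -> Pacemaker glue step: turn pgxc_node
# rows into "pcs resource create" commands. Option strings are
# examples only; a real glue layer would tune monitor intervals,
# constraints, and master/slave sets per component.
def pacemaker_commands(rows):
    """rows: iterable of (node_name, node_type, node_host, node_port),
    where node_type is 'C' (coordinator) or 'D' (datanode) as in the
    pgxc_node catalog."""
    kind = {"C": "coordinator", "D": "datanode"}
    cmds = []
    for name, ntype, host, port in rows:
        cmds.append(
            f"pcs resource create {name} ocf:heartbeat:pgsql "
            f"pgport={port} op monitor interval=10s"
            f"  # {kind.get(ntype, ntype)} on {host}"
        )
    return cmds
```

Generating the Pacemaker configuration from the catalog, rather than hand-writing it, is what would make the external-tool route less painful to install and keep in sync with the cluster.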