XC_Watchdog

Koichi Suzuki

Why XC_Watchdog?

psql alone is not sufficient to monitor whether each XC component is running, because it does not check gtm, gtm_proxy, or the datanodes. Detecting a failure through psql may also take time. As discussed at the cluster summit (http://wiki.postgresql.org/wiki/PgCon2012CanadaClusterSummit), a watchdog timer would suit this purpose.

Here is a design for the watchdog timer:

  • Each component has its own separate shared memory segment.
  • The postmaster and the gtm/gtm_proxy server main loops each increment their own watchdog timer.
  • A separate command reads the timers and reports a fault when a timer has stopped advancing.

For this purpose, some GUC and GTM/GTM-Proxy configuration parameters are needed to specify:

  • Whether the watchdog timer is enabled
  • Timer increment interval (maybe in milliseconds)

The shmid of each component will be kept in the pg_control, gtm.control, and gtm_proxy.control files.

An API to attach the watchdog shared memory and read the timer value will be provided too.
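
One possible shape for such an API, sketched in C on top of System V shared memory (shmat and shmdt). The function names wd_attach_readonly, wd_read, and wd_detach and the WatchdogArea layout are hypothetical, not the actual XC interface. The sketch assumes the caller has already obtained the shmid, for example from the relevant control file, and attaches read-only so that a monitoring tool cannot disturb the timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical layout of a component's watchdog segment. */
typedef struct WatchdogArea {
    volatile uint64_t counter;   /* advanced by the component's main loop */
} WatchdogArea;

/* Attach the watchdog segment identified by shmid for reading only.
 * Returns NULL if the segment cannot be attached. */
const WatchdogArea *wd_attach_readonly(int shmid)
{
    void *p = shmat(shmid, NULL, SHM_RDONLY);
    return (p == (void *) -1) ? NULL : (const WatchdogArea *) p;
}

/* Read the current timer value from an attached segment. */
uint64_t wd_read(const WatchdogArea *wd)
{
    assert(wd != NULL);
    return wd->counter;
}

/* Detach the segment; returns 0 on success, -1 on failure. */
int wd_detach(const WatchdogArea *wd)
{
    return shmdt(wd);
}
```

A monitoring command could attach once, call wd_read periodically, and detach on exit. A NULL result from wd_attach_readonly is itself useful information, since it suggests the component's segment no longer exists.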