Pavan Deolasee reported the following issue:
I wonder how we currently handle node failures in an XC setup. I understand that we recommend setting up replicated slaves for all the datanodes and adjusting the coordinator node information in case of failover, say by using ALTER NODE. What I am more curious about is working at reduced availability. Let me give an example:
Say I have a table distributed by HASH on 4 nodes. If one of those nodes goes down, I would still like to use the table by somehow adjusting the metadata at the coordinator. The data stored on the failed node will not be accessible, but I should be able to query the data from the remaining nodes. So SELECTs should not fail, or at the least a SELECT with a qualification that filters out data from the failed node should not fail. Also, INSERT/UPDATE/DELETE should succeed as long as the target node is up and running.
I tried to run a few queries to see if this is possible, but could not get the desired result. I only had limited success with replicated tables, in the sense that I could query the table after dropping the failed node at the coordinator. But even that looks fragile.
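
To make the requested behaviour concrete, here is a toy Python model of the four-node hash example. It is only a sketch of the semantics being asked for, not of how Postgres-XC actually routes statements: the ToyCluster class, the NodeDownError exception, the node names dn1..dn4, and the plain modulo placement rule are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


class NodeDownError(Exception):
    """Raised when a statement must touch a datanode that is marked down."""


@dataclass
class ToyCluster:
    node_names: list                            # hypothetical datanode names
    down: set = field(default_factory=set)      # nodes currently unavailable
    rows: dict = field(default_factory=dict)    # node name -> {key: value}

    def __post_init__(self):
        for name in self.node_names:
            self.rows.setdefault(name, {})

    def node_for(self, key):
        # Assumption: plain modulo stands in for the real hash function.
        return self.node_names[key % len(self.node_names)]

    def _live_target(self, key):
        node = self.node_for(key)
        if node in self.down:
            raise NodeDownError(node)
        return node

    def insert(self, key, value):
        # A write succeeds as long as its single target node is up.
        self.rows[self._live_target(key)][key] = value

    def select_by_key(self, key):
        # A qualified read only needs the node that owns the key.
        return self.rows[self._live_target(key)].get(key)

    def select_all(self):
        # An unqualified read returns what the live nodes hold and
        # reports which nodes were skipped, instead of failing outright.
        result, skipped = {}, []
        for name in self.node_names:
            if name in self.down:
                skipped.append(name)
            else:
                result.update(self.rows[name])
        return result, skipped


cluster = ToyCluster(["dn1", "dn2", "dn3", "dn4"])
for k in range(8):
    cluster.insert(k, f"row{k}")
cluster.down.add("dn3")  # simulate the failure of one datanode
```

In this model, after dn3 is marked down, reads and writes whose key maps to dn1, dn2, or dn4 still succeed, a key that maps to dn3 raises NodeDownError, and select_all returns a partial result together with the list of skipped nodes. Whether an unqualified SELECT should return partial data or fail is a design choice the model makes visible rather than settles.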