There isn't any split-brain handling at the moment. JGroups can detect network partitions, but only when they heal, not when they actually split (see http://www.jgroups.org/javadoc/org/jgroups/MergeView.html). However, this is usually too late: in the case of HA-JDBC, a network partition could cause some nodes to think db2 crashed and db1 is still active, while other nodes think db1 crashed and db2 is still active. If both partitions continued to write data to their respective active databases, the changes would be irreconcilable upon merge.
The most common strategy for guarding against network partitions is to require a quorum, whereby the group size must reach/maintain a threshold number of members before being deemed active. This is generally set to: int(max group size / 2) + 1. This would allow the majority partition to remain active while the smaller partition halts all database access.
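To make the threshold concrete, here is a minimal sketch of the quorum check in plain Java. The class `QuorumPolicy` and its methods are hypothetical names for illustration only; they are not part of HA-JDBC or JGroups, and the sketch assumes the maximum group size is known in advance.

```java
/** Hypothetical helper for illustration; not an HA-JDBC or JGroups class. */
final class QuorumPolicy {
    private QuorumPolicy() {}

    /** Smallest member count that is a strict majority of the full group. */
    static int quorum(int maxGroupSize) {
        if (maxGroupSize < 1) {
            throw new IllegalArgumentException("maxGroupSize must be positive");
        }
        // Integer division truncates, which matches int(max group size / 2) + 1.
        return maxGroupSize / 2 + 1;
    }

    /** True if a partition that can see this many members may stay active. */
    static boolean hasQuorum(int visibleMembers, int maxGroupSize) {
        return visibleMembers >= quorum(maxGroupSize);
    }

    public static void main(String[] args) {
        int maxGroupSize = 5;
        // A 5-node group splits into partitions of 3 and 2 members.
        System.out.println("quorum = " + quorum(maxGroupSize));
        System.out.println("partition of 3 active: " + hasQuorum(3, maxGroupSize));
        System.out.println("partition of 2 active: " + hasQuorum(2, maxGroupSize));
    }
}
```

Because the threshold is a strict majority, two disjoint partitions can never both satisfy it. With an even group size split exactly in half (for example 4 nodes split 2/2, where the quorum is 3), neither side has a quorum and both would halt.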
I was hoping to get around to implementing split-brain handling in the next major/minor release.
Is there any third-party arbitration that helps the nodes vote for a main node, after which the second node shuts itself down? That is common behaviour in cluster implementations.
Hi Paul, can you help me with this topic?