From: <be...@jb...> - 2006-07-03 12:46:24
Yes, I don't think using an XAResource would change anything here. In the original scenario, you *have* to be able to complete the COMMIT successfully once you have voted for PREPARE. So, if a communication problem occurs while sending the COMMIT decision to cluster member P, then P should be excluded from the cluster.

Example: if P crashed between acking the PREPARE and receiving the COMMIT, we are fine, because when P is restarted, it will get the entire in-memory state from the cluster coordinator.

The difficult case is the following: what happens if there is a *temporary* situation which prevents us from sending the COMMIT message to P, but which does *not* exclude P from the cluster?

Retry COMMIT(P) until
- COMMIT(P) is successful or
- there is a view excluding P?

I think the above mechanism would work quite nicely, because the fail-stop model of JGroups says that a member either successfully receives all messages, or it fails and is excluded from the group. So, COMMIT(P) would either be delivered successfully if there is only a temporary communication problem, or P would be excluded from the cluster group and therefore COMMIT(P) would not *have* to be delivered, assuming that we deliver messages only to non-faulty cluster members.

The issue here though is that we need to tackle the problem of a network partition and subsequent merge...
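To make the retry rule concrete, here is a rough sketch in plain Java. It is only an illustration under assumptions: `CommitRetry`, `retryCommit`, `Outcome`, the `sendCommit` callback and the `currentView` supplier are all hypothetical names, not JGroups API, and the view is modelled as a simple set of member names. It does not attempt to handle the partition/merge problem.

```java
import java.util.Set;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch: retry COMMIT(P) until it succeeds or a view excludes P.
final class CommitRetry {

    enum Outcome { DELIVERED, MEMBER_EXCLUDED }

    // sendCommit: returns true if COMMIT reached the member (assumed callback).
    // currentView: returns the members of the latest installed view (assumed).
    static Outcome retryCommit(String member,
                               Predicate<String> sendCommit,
                               Supplier<Set<String>> currentView,
                               long backoffMillis) {
        while (true) {
            // If P is no longer in the view, COMMIT(P) does not have to be delivered.
            if (!currentView.get().contains(member)) {
                return Outcome.MEMBER_EXCLUDED;
            }
            // P is still a member: try to deliver the COMMIT decision.
            if (sendCommit.test(member)) {
                return Outcome.DELIVERED;
            }
            // Temporary problem: wait a little, then try again.
            try {
                Thread.sleep(backoffMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while retrying COMMIT", e);
            }
        }
    }

    public static void main(String[] args) {
        // Temporary failure: the first two sends fail, the third succeeds.
        int[] attempts = {0};
        Outcome a = retryCommit("P", m -> ++attempts[0] >= 3,
                                () -> Set.of("A", "P"), 10L);
        System.out.println(a + " after " + attempts[0] + " attempts");

        // Lasting failure: sends never succeed, and a later view drops P.
        int[] lookups = {0};
        Outcome b = retryCommit("P", m -> false,
                                () -> ++lookups[0] > 2 ? Set.of("A") : Set.of("A", "P"),
                                10L);
        System.out.println(b);
    }
}
```

The loop deliberately has no attempt limit: under the fail-stop assumption, one of the two exit conditions is expected to become true eventually. A real implementation would more likely react to view-change callbacks than poll the view on each iteration.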