|
From: <man...@jb...> - 2006-07-03 11:50:22
|
I presume this would then create a dependency on XA-compliant TMs (at least for synchronous replication, which uses 2PC; async replication uses 1PC).
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3955018#3955018
|
From: <man...@jb...> - 2006-07-03 11:55:10
|
And regarding using an XAResource rather than a Synchronization: I presume I'd still be wiring the replication code myself in XAResource.prepare(), which brings us back to the same problem, namely an exception that may occur when the commit call has to be replicated in XAResource.commit(). Or have I misunderstood this? :-)
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3955020#3955020
|
From: <Kev...@jb...> - 2006-07-03 12:36:12
|
anonymous wrote : I presume I'd still be wiring the replication code myself in XAResource.prepare(), bringing us back to the same problem of an exception that may occur in having to replicate a commit call in XAResource.commit()

The resource is still responsible for persisting the changes related to the transaction, however that is achieved. So yes, you will still have to manage the replication across the cluster. How are you communicating with the other endpoints? If it is unicast then I would suggest creating a resource per endpoint.

This solution is different for two reasons though:
- The commit occurs within the context of the transaction and can therefore notify the transaction manager of failure.
- The commit will be guaranteed to happen in a recovery situation.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3955033#3955033
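To make the "resource per endpoint" suggestion concrete, here is a minimal sketch, not the actual cache code. `EndpointChannel` and `EndpointResource` are hypothetical names invented for illustration; only the `javax.transaction.xa` types are real. The point it tries to show is that a failed COMMIT surfaces to the transaction manager as an `XAException` instead of being swallowed, as it would be in a Synchronization callback. Whether and how often a TM reissues the commit after `XA_RETRY` depends on the TM.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical transport to a single cluster endpoint (assumption, not a JGroups API). */
interface EndpointChannel {
    void send(String phase, Xid xid) throws Exception;
}

/** Sketch: one XAResource per endpoint, so the TM sees each endpoint's outcome. */
class EndpointResource implements XAResource {
    private final EndpointChannel channel;
    private int timeoutSeconds = 0;

    EndpointResource(EndpointChannel channel) {
        this.channel = channel;
    }

    @Override public void start(Xid xid, int flags) { /* nothing to associate in this sketch */ }
    @Override public void end(Xid xid, int flags) { /* nothing to dissociate in this sketch */ }

    @Override
    public int prepare(Xid xid) throws XAException {
        try {
            channel.send("PREPARE", xid);
            return XA_OK; // vote to commit
        } catch (Exception e) {
            // Voting "no": the branch is rolled back because of a communication failure.
            throw new XAException(XAException.XA_RBCOMMFAIL);
        }
    }

    @Override
    public void commit(Xid xid, boolean onePhase) throws XAException {
        try {
            channel.send("COMMIT", xid);
        } catch (Exception e) {
            // The failure is reported inside the transaction; XA_RETRY means
            // "no effect, the call may be reissued later".
            throw new XAException(XAException.XA_RETRY);
        }
    }

    @Override
    public void rollback(Xid xid) throws XAException {
        try {
            channel.send("ROLLBACK", xid);
        } catch (Exception e) {
            throw new XAException(XAException.XAER_RMFAIL);
        }
    }

    @Override public void forget(Xid xid) { /* no heuristic outcomes are tracked here */ }

    @Override
    public Xid[] recover(int flag) {
        // A real resource would return prepared-but-unresolved Xids here;
        // that is what lets the TM drive the commit during recovery.
        return new Xid[0];
    }

    @Override public boolean isSameRM(XAResource other) { return other == this; }
    @Override public int getTransactionTimeout() { return timeoutSeconds; }

    @Override
    public boolean setTransactionTimeout(int seconds) {
        timeoutSeconds = seconds;
        return true;
    }
}
```

Each endpoint's resource would be enlisted with `Transaction.enlistResource(...)`; the sketch leaves out durable logging of prepared transactions, which a real implementation needs for the recovery guarantee mentioned above.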
|
From: <Kev...@jb...> - 2006-07-03 12:40:14
|
anonymous wrote : what happens if there is a *temporary* situation which prevents us from sending the COMMIT message to P, but which does *not* exclude P

What kind of situation are you thinking about?
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3955034#3955034
|
From: <Kev...@jb...> - 2006-07-03 12:46:23
|
anonymous wrote : presume this would then create a dependency on XA-compliant TMs

Any JTA-compliant transaction manager will do.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3955030#3955030
|
From: <be...@jb...> - 2006-07-03 12:46:24
|
Yes, I don't think using an XAResource would change anything here.

In the original scenario, you *have* to be able to complete the COMMIT successfully once you voted for PREPARE. So, if a communication problem occurs sending the COMMIT decision to cluster member P, then P should be excluded from the cluster.

Example: if P crashed between acking the PREPARE and receiving the COMMIT, we are fine, because when P is restarted, it will get the entire in-memory state from the cluster coordinator.

The difficult case is the following: what happens if there is a *temporary* situation which prevents us from sending the COMMIT message to P, but which does *not* exclude P from the cluster?

Retry COMMIT(P) until
- COMMIT(P) is successful or
- there is a view excluding P?

I think the above mechanism would work quite nicely, because the fail-stop model of JGroups says that a member either successfully receives all messages, or it fails and is excluded from the group. So, COMMIT(P) would be either successfully delivered if there is only a temporary communication problem, or P would be excluded from the cluster group and therefore COMMIT(P) would not *have* to be delivered, assuming that we deliver messages only to non-faulty cluster members.

The issue here though is that we need to tackle the problem of a network partition and subsequent merge...
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3955031#3955031
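The "retry COMMIT(P) until it succeeds or a view excludes P" idea could look roughly like the sketch below. Everything here is an assumption made for illustration: `CommitRetrier`, `CommitSender`, the `currentView` supplier and the fixed backoff are hypothetical, and members are plain strings rather than JGroups addresses. It relies on the fail-stop assumption from the post and does nothing about partitions and merges.

```java
import java.util.Set;
import java.util.function.Supplier;

/** Sketch of "retry COMMIT(P) until it succeeds or a view excludes P". */
final class CommitRetrier {

    /** Hypothetical sender; a real one would wrap a synchronous call to the member. */
    interface CommitSender {
        void sendCommit(String member) throws Exception;
    }

    enum Outcome { DELIVERED, MEMBER_EXCLUDED }

    static Outcome commitWithRetry(String member,
                                   CommitSender sender,
                                   Supplier<Set<String>> currentView,
                                   long backoffMillis) throws InterruptedException {
        while (true) {
            // If the latest view no longer contains the member, the COMMIT
            // does not have to be delivered to it.
            if (!currentView.get().contains(member)) {
                return Outcome.MEMBER_EXCLUDED;
            }
            try {
                sender.sendCommit(member);
                return Outcome.DELIVERED;
            } catch (InterruptedException e) {
                throw e; // do not swallow interruption of the calling thread
            } catch (Exception e) {
                // Treated as a temporary problem: wait, then re-check the view and retry.
                Thread.sleep(backoffMillis);
            }
        }
    }
}
```

Note that the loop has no upper bound on attempts: by design it ends only on delivery or exclusion, so it depends on failure detection eventually removing a member that stays unreachable.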
|
From: <gal...@jb...> - 2006-07-03 12:59:13
|
I guess Bela might be referring to two nodes temporarily losing connectivity, where connectivity is restored within the failure detection timeframe, so P is not excluded. Failure detection in JGroups is based around FD/FD_SOCK plus VERIFY_SUSPECT. I guess the transaction timeout would have to be larger than the overall failure detection time.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3955041#3955041
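As a rough illustration of that last point, the check below compares a transaction timeout against an estimated failure detection time. It assumes heartbeat-based FD only, with a member suspected after roughly timeout * max_tries, followed by the VERIFY_SUSPECT timeout; FD_SOCK usually reacts faster to a crashed process, and the time to install the new view is ignored. `DetectionBound` and its method names are hypothetical.

```java
/** Sketch: compare a transaction timeout with a rough failure detection bound. */
final class DetectionBound {

    /**
     * Rough estimate in milliseconds: FD suspects a member after about
     * timeout * max_tries, then VERIFY_SUSPECT waits for its own timeout.
     */
    static long worstCaseDetectionMillis(long fdTimeoutMillis,
                                         int fdMaxTries,
                                         long verifySuspectTimeoutMillis) {
        return fdTimeoutMillis * fdMaxTries + verifySuspectTimeoutMillis;
    }

    /** True if the transaction timeout exceeds the estimated detection time. */
    static boolean transactionTimeoutIsSafe(long txTimeoutMillis, long detectionMillis) {
        return txTimeoutMillis > detectionMillis;
    }
}
```

For example, with an FD timeout of 3000 ms, 5 tries and a 1500 ms VERIFY_SUSPECT timeout (illustrative values, not recommendations), the estimate is 16500 ms, so a 10 s transaction timeout would be too short under this model.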
|
From: <be...@jb...> - 2006-07-03 13:02:48
|
Here's one:
- A switch temporarily discards packets received on a given port (e.g. the IGMP snooping problem in certain routers, where routing tables are sometimes lost and need to be reconstructed; during this time, all multicasts are simply discarded!).
- The COMMIT(P) call times out, throwing an exception (only if the COMMIT phase is synchronous).
- The switch resumes accepting packets on the given port.
- P is not excluded, but the COMMIT(P) call failed.

I think a retry mechanism could solve this problem.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3955042#3955042