From: Francisco R. <rev...@im...> - 2005-06-18 02:00:31
On Fri, Jun 17, 2005 at 08:15:28PM -0400, Bill Burke wrote:
>
> Adrian Brock wrote:
> >On Fri, 2005-06-17 at 19:44, Bill Burke wrote:
> >
> >>Francisco Reverbel wrote:
> >>
> >>>I also have some recovery-related additions to TransactionImpl. And
> >>>had to change the way transaction timeouts are handled. Now the timeout
> >>>clock stops ticking when a "tx committed" record (for a locally-started
> >>>transaction) or a "tx prepared" record (for an imported transaction) is
> >>>written out to the transaction log.
> >>>
> >>
> >>Are you sure this is wise? What if a commit just hangs?
> >>
> >
> >
> >You break off the commit request and keep trying a periodic
> >recovery until the hanging resource gets its act together.
> >
>
> Sure, and exactly how do you do that? The only way to do it is to
> interrupt the thread at some point and hope that the hanging I/O interrupts.

The transaction timeout stops ticking, but another timeout clock starts
to tick. The action fired by this timeout is completely different from
the one fired by the transaction timeout (a different TimeoutTarget).
What happens exactly depends on who you are.

(1) If you are an imported transaction with an external coordinator,
    you start a "prepared" timeout immediately after writing your
    "tx prepared" record to the transaction log. If this timeout expires
    you call replayCompletion on your coordinator. If you cannot reach
    the coordinator you restart the "prepared" timeout.

(2) If you are a locally-started transaction, after writing a
    "tx committed" record to the transaction log you call commit on your
    XA resources and on your remote (OTS or DTM) resources. If all commit
    calls succeed, you can clear that transaction in the log. If a commit
    call on a remote resource fails (either because the resource is not
    available or due to some communication problem), you do nothing and
    just keep the TransactionImpl instance around. Eventually the
    resource will call replayCompletion on you. If a commit call on an
    XA resource fails with XA_RETRY or with XAER_RMFAIL, you start a
    "retry" timeout. When this timeout expires you try again. If the
    commit still fails, you restart the timeout. You only ditch the
    TransactionImpl instance and clear that transaction in the log when
    all resource commit calls succeed.

(3) If you are an imported transaction that is already prepared and
    receives a commit or rollback call from your coordinator, you can
    start calling commit or rollback on your XA resources and on your
    remote (OTS or DTM) resources. You do something similar to (2)
    above. If the transaction outcome is commit, you only ditch the
    TransactionImpl instance and clear that transaction in the log when
    all resource commit calls succeed. If the outcome is abort, you can
    clear the transaction in the log immediately, but you keep the
    TransactionImpl around until all calls to XAResource.rollback
    succeed. Due to presumed abort, you don't need the TransactionImpl
    to be around to handle a replayCompletion call from some remote
    resource.

The meaning of "clear a transaction in the log" depends on whether the
transaction is distributed or not. If it is a distributed transaction
(either an imported transaction or a locally-started transaction that
uses some remote resource), then it means writing a "tx end" record to
the log. If it is a locally-started transaction that uses no remote
resource (only XA resources), then it means telling the tx logging
module that the log file containing the "tx committed" record can be
erased as far as the transaction is concerned. No "tx end" record is
written out to the log in this case. (I inherited this behavior from
Bill's prototype, which never writes out "tx end" records.)

Regards,

Francisco
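
The commit-phase rules in case (2) and the two meanings of "clear a transaction in the log" can be restated as a small decision table. The Java below is only an illustrative sketch, not code from TransactionImpl: the class, enum, and method names are all hypothetical, and the choice to let a pending XA retry take precedence when a remote resource is also unreachable is an assumption made here (the message treats the two failures separately; in both the instance stays around).

```java
import java.util.List;

// Hypothetical model of the rules described in the message; none of these
// names come from the actual JBoss transaction manager sources.
class CompletionSketch {

    // Outcome of one commit call on one resource.
    enum CommitResult { OK, XA_RETRY, XAER_RMFAIL, REMOTE_UNREACHABLE }

    // What the transaction does after a round of commit calls.
    enum NextStep { CLEAR_LOG_AND_DISCARD, START_RETRY_TIMEOUT, WAIT_FOR_REPLAY_COMPLETION }

    // What "clear a transaction in the log" amounts to.
    enum LogAction { WRITE_TX_END_RECORD, MARK_LOG_FILE_ERASABLE }

    static NextStep afterCommitRound(List<CommitResult> results) {
        boolean xaRetryNeeded = false;
        boolean remotePending = false;
        for (CommitResult r : results) {
            if (r == CommitResult.XA_RETRY || r == CommitResult.XAER_RMFAIL) {
                xaRetryNeeded = true;   // XA resource: try again after a "retry" timeout
            } else if (r == CommitResult.REMOTE_UNREACHABLE) {
                remotePending = true;   // remote resource: it will call replayCompletion
            }
        }
        // Assumption: an XA retry is reported first when both kinds of failure occur.
        if (xaRetryNeeded) return NextStep.START_RETRY_TIMEOUT;
        if (remotePending) return NextStep.WAIT_FOR_REPLAY_COMPLETION;
        // Only when every commit call succeeded is the transaction cleared and dropped.
        return NextStep.CLEAR_LOG_AND_DISCARD;
    }

    static LogAction clearInLog(boolean imported, boolean usesRemoteResource) {
        // Distributed means imported, or locally started with some remote resource.
        boolean distributed = imported || usesRemoteResource;
        return distributed ? LogAction.WRITE_TX_END_RECORD : LogAction.MARK_LOG_FILE_ERASABLE;
    }

    public static void main(String[] args) {
        System.out.println(afterCommitRound(List.of(CommitResult.OK, CommitResult.XA_RETRY)));
        System.out.println(clearInLog(false, false));
    }
}
```

The sketch leaves out the timers themselves; it only captures which action follows a given set of commit results.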