From: <ch...@da...> - 2004-06-02 10:19:43
Hi Grzegorz,

Thanks for your comments. I'll try to clarify my earlier message.

I want to use computation reconstruction for fault tolerance in multi-threaded programs, although it can also be used in other fields such as debugging. Multi-threaded (MT) applications are non-deterministic by nature, so if an MT program fails, it is not simple to reproduce the state prior to the failure, because that state depends on the thread scheduling. My main purpose is to be able to recover a state similar to the one the MT application had before the failure and then continue the computation.

The general idea is that if:

1) an MT program is composed only of threads and shared objects,
2) the threads' states are checkpointed regularly, and
3) the order in which the objects' methods are invoked during execution is logged,

then after a program failure we simply have to create a new process that recovers the threads' states from the latest checkpoints of the original process, replays the logged method invocations in their original order, and continues the execution.

Of course, there are a lot of issues that have to be considered in order to reach a consistent state after a failure and recovery, but I hope I have clarified my original question a little.

By the way, for my purposes, objects' addresses are useless because they are valid only in the original process.

Best regards,
Carlos
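
The checkpoint, log and replay idea described in the message can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not an actual implementation: the names `SharedList`, `InvocationLogger`, `replay` and `run_demo` are hypothetical, and because Python cannot checkpoint a thread's state, the sketch checkpoints the shared object's state instead. That is a simplification of point 2 above, not the scheme from the message itself.

```python
import copy
import threading


class SharedList:
    """Hypothetical shared object whose final state depends on call order."""

    def __init__(self):
        self.items = []

    def append(self, value):
        self.items.append(value)


class InvocationLogger:
    """Wraps a shared object and logs the global order of method invocations."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.Lock()
        self.log = []  # entries: (thread_name, method_name, args)

    def invoke(self, method_name, *args):
        # One lock covers both logging and execution, so the order of the
        # log entries is exactly the order in which the methods ran.
        with self._lock:
            name = threading.current_thread().name
            self.log.append((name, method_name, args))
            return getattr(self._target, method_name)(*args)

    def checkpoint(self):
        # Assumption: we snapshot the shared object (not the threads) and
        # remember how many log entries the snapshot already reflects.
        with self._lock:
            return copy.deepcopy(self._target), len(self.log)


def replay(snapshot, log, start):
    """Rebuild the state by re-applying logged calls made after the checkpoint."""
    recovered = copy.deepcopy(snapshot)
    for _thread_name, method_name, args in log[start:]:
        getattr(recovered, method_name)(*args)
    return recovered


def run_phase(logger, tags):
    """Run one thread per tag; each thread appends three tagged values."""

    def worker(tag):
        for i in range(3):
            logger.invoke("append", f"{tag}{i}")

    threads = [threading.Thread(target=worker, args=(t,), name=t) for t in tags]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_demo():
    shared = SharedList()
    logger = InvocationLogger(shared)

    run_phase(logger, "AB")                   # work done before the checkpoint
    snapshot, log_pos = logger.checkpoint()
    run_phase(logger, "CD")                   # work done after the checkpoint

    # Pretend the process failed here: recover from the snapshot plus the log.
    recovered = replay(snapshot, logger.log, log_pos)
    return list(shared.items), list(recovered.items)


if __name__ == "__main__":
    original, recovered = run_demo()
    print(original == recovered)
```

Whatever interleaving the scheduler picks for the threads, replaying the logged invocations in their recorded order on top of the checkpoint yields the same item order as the original run. Note that the log identifies threads by name and methods by name, never by memory address, which is consistent with the remark that addresses are only valid in the original process.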