#9 New feature: Keep-Alive/Watchdog

open
Nikhil Tayal
main (8)
5
2010-07-27
2010-07-27
Hai Shalom
No

Currently, the PCD monitors all the processes it spawns.

In some cases we want extra control over a process: we want to make sure that it is not stuck in an infinite loop or blocked on a semaphore.
Currently, the PCD can trigger a recovery action only if a process or thread has crashed. If a process hangs because of a logic error, the PCD never finds out.

In some cases such a feature is important for safety reasons, and it is widely used in embedded systems.

This feature should be opt-in: a process has to actively request the service from the PCD.
The implementation should involve a service registration that defines the maximum time interval between keep-alive pulses.
The requesting process should send an "I'm-Alive" message at least once in each interval.
If the process is stuck and unable to send this message, the PCD will trigger a recovery action once the timeout expires.
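
To make the proposed flow concrete, here is a minimal sketch of the bookkeeping such a service might need: registration with a maximum interval, recording "I'm-Alive" pulses, and finding processes whose timeout has passed. This is an illustration only, written in Python for brevity. The class name `KeepAliveRegistry` and its methods are hypothetical and are not part of the PCD's actual API; the recovery action itself is left out.

```python
import time


class KeepAliveRegistry:
    """Hypothetical keep-alive bookkeeping; not the PCD's real interface."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the logic can be exercised without sleeping.
        self._clock = clock
        self._entries = {}  # name -> (max_interval_seconds, last_pulse_time)

    def register(self, name, max_interval):
        """Opt a process in; registration counts as its first pulse."""
        if max_interval <= 0:
            raise ValueError("max_interval must be positive")
        self._entries[name] = (max_interval, self._clock())

    def pulse(self, name):
        """Record an "I'm-Alive" message from a registered process."""
        max_interval, _ = self._entries[name]  # KeyError if never registered
        self._entries[name] = (max_interval, self._clock())

    def expired(self):
        """Return the names whose last pulse is older than their interval."""
        now = self._clock()
        return sorted(
            name
            for name, (max_interval, last) in self._entries.items()
            if now - last > max_interval
        )
```

A monitor loop would call `expired()` periodically and trigger the configured recovery action for each returned name. Processes that never register are not affected, which keeps the service opt-in.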

Discussion