From: Leif M. <le...@ta...> - 2004-03-14 14:30:16
Richard,

> Looking through the code in wrapper.c, the message:
>
> "JVM appears hung: Timed out waiting for signal from JVM."
>
> is printed when the primary process has not responded to a ping.
> Immediately after this the function wrapperKillProcess() is called.
> As far as I can see there were no "exit requests" prior to the SIGKILL.
> Obviously, you are much more familiar than I am with the code, could
> you point out where a previous exit request was sent after
> the ping timeout so I can better understand the flow of control. Thanks.

The JVM was never asked to exit using a signal. But the Wrapper does
maintain a socket which is used for all such communications. In my
experience, when that communication link has failed, the Wrapper can
assume that the JVM is frozen or at least in a very bad state. This is
one of the points at which the Wrapper will attempt to kill the JVM.

> If there was no previous exit request, would there be any harm in
> first sending the process a SIGTERM followed (100 ms later or so)
> by a SIGKILL?

I went ahead and added this. If the JVM is truly frozen it will not
have any effect, as the JVM will not be able to respond to the SIGTERM.
This is why I was just going ahead and sending the SIGKILL when I was
convinced that the JVM was dead.

As I understand it, the kill function does not send the signal to the
child processes, so I am not sure if this change will make any
difference for the problem you are having with the child processes.
It may be necessary to loop over any child processes of the JVM,
sending the SIGTERM then SIGKILL signals to each of them. (I need to
look into whether or not this is even possible.)

The additional debug information associated with this will be useful
in detecting whether the JVM is actually frozen or not. It has the
drawback of adding up to 5 seconds to the time that it takes to kill
and then restart a frozen JVM.

It takes 24 hours for the public CVS archive to be synced with the dev
archives.
But I would appreciate it if you could check out the CVS code, build
it, and then test this fix with your application. I am interested to
find out if it makes any difference for you.

> As an aside, using the "unsupported" classes
> sun.misc.Signal and sun.misc.SignalHandler one can register
> signal handlers. At least on Linux using java 1.4.2_03 registering
> to catch TERM works but registering to catch KILL results in a
> nice core dump at the point of registration :-)

That is interesting to know about. The Wrapper already supports this
when you use integration method #3. You can receive any and all signals
sent to the JVM using the WrapperListener.controlEvent method.

Cheers,
Leif
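
For reference, the SIGTERM-then-SIGKILL escalation discussed above could
look roughly like the following C sketch. This is not the code in
wrapper.c: the function name terminate_child, its return convention,
and the 10 ms polling interval are all assumptions made for
illustration. It only assumes the target is a direct child of the
caller, so that waitpid() can observe its exit.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/*
 * Hypothetical helper (not taken from wrapper.c).
 * Sends SIGTERM to a direct child, polls for up to grace_ms milliseconds
 * for it to exit, then falls back to SIGKILL.
 * Returns SIGTERM if the child went away during the grace period,
 * SIGKILL if it had to be killed, or -1 on error.
 */
int terminate_child(pid_t pid, int grace_ms)
{
    int status;
    int waited_ms;
    struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms poll interval */

    /* Polite request first; a healthy process can still clean up. */
    if (kill(pid, SIGTERM) != 0) {
        return -1;
    }

    for (waited_ms = 0; waited_ms < grace_ms; waited_ms += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) {
            return SIGTERM;   /* child exited within the grace period */
        }
        if (r < 0 && errno != EINTR) {
            return -1;
        }
        nanosleep(&tick, NULL);
    }

    /* Still running: SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) != 0) {
        return -1;
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return SIGKILL;
}
```

A caller might use terminate_child(jvm_pid, 5000) for a grace period of
up to 5 seconds, where jvm_pid is a placeholder name. As noted above,
kill() signals only the one pid it is given, so this sketch does nothing
for any child processes of the JVM.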