From: Richard E. <rem...@ed...> - 2004-03-08 16:31:51
Leif, see below.

Richard

Leif Mortenson wrote:

> Richard,
>
>> Thank you for the explanation; the message:
>>     "JVM did not exit on request, terminated"
>> refers to the fact that a previous request to stop the process
>> failed and now it will be shut down with a SIGKILL.
>>
>> We run load tests every night on multiple machine types. Every
>> couple of days on one of our Linux boxes, after running for a couple
>> of hours, we get the twin messages:
>>
>>     JVM appears hung: Timed out waiting for signal from JVM.
>>     JVM did not exit on request, terminated
>>
>> one right after the other within milliseconds.
>>
>> The problem that arises after sending a SIGKILL to the process
>> controlled by the wrapper is that the primary process has spawned
>> secondary processes (not child processes), so killing the primary
>> with a SIGKILL does not kill the secondary processes. Shutdown hooks
>> are registered, but Java will not execute them when a SIGKILL is
>> received.
>>
>> In the file wrapper_unix.c, in the function wrapperKillProcess(), how
>> about first signaling with a SIGTERM, waiting a while, and then
>> sending a SIGKILL? That way the primary process' shutdown hook might
>> run.
>
> The Wrapper does not send a SIGTERM to the JVM, but it does attempt to
> get the JVM to shut down cleanly. The wrapperKillProcess function is
> only called when a clean shutdown of the JVM has failed. At that
> point, the JVM is most likely not listening for SIGTERM or any other
> signals. The SIGKILL is a last resort to get rid of it.
>
> Most likely your application is frozen at this point, which is why it
> had not responded to the exit requests.

Looking through the code in wrapper.c, the message:
    "JVM appears hung: Timed out waiting for signal from JVM."
is printed when the primary process has not responded to a ping.
Immediately after this, the function wrapperKillProcess() is called.
As far as I can see, there were no "exit requests" prior to the SIGKILL.
Obviously, you are much more familiar with the code than I am. Could
you point out where a previous exit request was sent after the ping
timeout, so I can better understand the flow of control? Thanks.

If there was no previous exit request, would there be any harm in first
sending the process a SIGTERM, followed (100 ms later or so) by a
SIGKILL?

As an aside, using the "unsupported" classes sun.misc.Signal and
sun.misc.SignalHandler, one can register signal handlers. At least on
Linux using Java 1.4.2_03, registering to catch TERM works, but
registering to catch KILL results in a nice core dump at the point of
registration :-)

> If you are able to reproduce this so easily, could you try turning on
> debug output with wrapper.debug=true and then letting that run until
> the JVM is restarted? The debug output will show exactly why the
> Wrapper decides that it is time to kill the JVM process. Most likely
> the JVM has stopped responding to ping requests.
>
> Cheers,
> Leif
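
For reference, the "SIGTERM first, SIGKILL after a short grace period" sequence proposed in this message could look roughly like the sketch below. It is not the Wrapper's actual wrapperKillProcess(); the function name terminate_then_kill and its grace_ms parameter are made up for illustration, and the sketch assumes the target is a direct child of the calling process, so that waitpid() can observe its exit.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the Wrapper sources): ask a child
 * process to exit with SIGTERM, give it up to grace_ms milliseconds,
 * then fall back to SIGKILL.
 *
 * Returns  0 if the child exited without needing SIGKILL,
 *          1 if the child had to be killed with SIGKILL,
 *         -1 on error (errno is set by kill() or waitpid()).
 */
int terminate_then_kill(pid_t pid, int grace_ms)
{
    const struct timespec step = { 0, 10L * 1000L * 1000L }; /* 10 ms */
    int waited_ms = 0;
    int status = 0;

    /* kill() with pid <= 0 targets process groups; refuse that here. */
    assert(pid > 0);

    /* Polite request first: a JVM runs its shutdown hooks on SIGTERM. */
    if (kill(pid, SIGTERM) == -1)
        return -1;

    /* Poll without blocking until the child exits or the grace period ends. */
    while (waited_ms < grace_ms) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return 0;
        if (r == -1 && errno != EINTR)
            return -1;
        nanosleep(&step, NULL);
        waited_ms += 10;
    }

    /* Last resort: SIGKILL cannot be caught, blocked, or ignored. */
    if (kill(pid, SIGKILL) == -1)
        return -1;

    /* Reap the child so it does not linger as a zombie. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }

    /* The child may have exited on its own just before the SIGKILL. */
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 1 : 0;
}
```

With a grace period of 100 ms, as suggested above, a call would be terminate_then_kill(jvm_pid, 100), where jvm_pid is a placeholder for the child's process ID. Note that this only helps if the process is still able to act on SIGTERM; a truly frozen JVM would still end up being killed, and secondary processes that are not children would still have to be cleaned up by the shutdown hook itself.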