From: Leif M. <le...@ta...> - 2004-05-26 21:17:20
Jennifer Kolar wrote:

> Sorry if I wasn't clear.
> What I am seeing is the processes under the wrapper just hang. They
> don't exit, they don't report errors, they just hang.
> The CPU is at 100% and has been for some time. I have on INFO level
> logging from the wrapper which I believe is enough to see messages
> about not getting enough CPU time..

Is the entire JVM hung, or is your application simply not responding?
The debug output should show which is the case. If the JVM were truly
hung, it would not be responding to ping requests, and the Wrapper would
restart it after your 300 second ping timeout expires. As long as the
JVM is responding to pings, the Wrapper will not restart it, even if the
rest of your application is "hung".

> I will try to reproduce w/ debug on so I can give more information. I
> hopefully just rolled out code to fix the CPU load issue, so that may
> be harder :)

You may want to temporarily back out your change and create the debug
output to make sure that there is not a wrapper related problem.

> So on the other note, for 3.1.0 you are saying that if there is a ping
> timeout and the wrapper restarts, the second time there is a restart
> request it will not happen? This is functionality I use as well...

There are two ways for a ping timeout to occur. If the Wrapper sends
the JVM a ping and the JVM does not respond within the specified time,
then the Wrapper will assume that the JVM is hung, kill it, and then
restart the JVM.

The second way is if the JVM does not receive any ping requests for
longer than the ping timeout. In this case, the JVM assumes that the
Wrapper process was killed or that something bad happened to the
communications socket. To be safe, it quits. If the Wrapper is still
alive then it should restart a new JVM instance. This was working
through 3.0.5.
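
The two timeout paths described above can be modeled in a few lines. This is only an illustrative sketch, not the Wrapper's actual code: the function names and the returned labels are hypothetical, and the only value taken from the thread is the 300 second ping timeout.

```python
# Illustrative model of the two ping timeout paths; NOT the Wrapper's real code.
# All function names and return labels below are hypothetical.

PING_TIMEOUT = 300  # seconds, as in wrapper.ping.timeout=300 from the thread


def wrapper_check(seconds_since_jvm_response: float, ping_timeout: float) -> str:
    """Wrapper side: react to how long the JVM has failed to answer pings."""
    if seconds_since_jvm_response > ping_timeout:
        # The JVM is assumed to be hung: kill it, then launch a new one.
        return "kill_and_restart_jvm"
    return "keep_running"


def jvm_check(seconds_since_ping_received: float, ping_timeout: float) -> str:
    """JVM side: react to how long no ping request has arrived."""
    if seconds_since_ping_received > ping_timeout:
        # The Wrapper is assumed dead or the socket broken: quit to be safe.
        return "exit_jvm"
    return "keep_running"


# A JVM that answered 12 seconds ago is left alone, even if the
# application inside it is busy or unresponsive.
print(wrapper_check(12, PING_TIMEOUT))
# A JVM that has been silent for longer than the timeout is restarted.
print(wrapper_check(301, PING_TIMEOUT))
# A JVM that has heard nothing from the Wrapper for too long exits on its own.
print(jvm_check(301, PING_TIMEOUT))
```

The point of the model is that both checks look only at ping traffic. An application that is stuck while its JVM still answers pings never triggers either path.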
When I reorganized the Wrapper's state engine for 3.1.0, though, I
introduced a bug where the Wrapper is now interpreting that JVM exit as
a desire to shut down the Wrapper as well. Rather than launching a new
JVM, the Wrapper is simply stopping. This is what I fixed for 3.1.1.

Normal restart requests and restarts caused by other problems are all
working correctly. The case that is not working is quite rare, but I
was thinking that you may have been encountering it. From this message,
however, it does not sound like you are.

Cheers,
Leif

> On May 26, 2004, at 12:55 AM, Leif Mortenson wrote:
>
>> Jennifer,
>> Sorry. I may still be reacting to jet lag. But I am not clear on
>> your problem.
>> When you say that the "processes just stop running". Are you
>> meaning that the JVM and Wrapper are stopping? If so, what are you
>> seeing in your wrapper.log?
>> Would it be possible for you to reproduce this with
>> wrapper.debug=true set so that I can see what is going on a little
>> better.
>>
>> I did just fix a problem for 3.1.1 where the JVM was not being
>> restarted in the event that the JVM restarted itself due to a ping
>> timeout. This was a bug introduced in 3.1.0 and does not sound like
>> the problem you are seeing.
>>
>> When the CPU(s) is pegged at 100% that does not mean that all
>> processes are not getting any CPU. If the applications are being
>> nice then other processes will get just as much CPU as they need,
>> but not any more. The total CPU ends up pegged.
>>
>> Cheers,
>> Leif
>>
>> Jennifer Kolar wrote:
>>
>>> I have processes running under servicewrapper 3.1.0 that are
>>> regularly becoming CPU bound.. which is a separate problem in and of
>>> itself that I am addressing..
>>>
>>> However, I would expect the servicewrapper to restart those
>>> services.... (I have seen this same problem under 3.0.5 by the way)
>>>
>>> What I see is that the processes just stop running- cpu is at 100%
>>> (4 cpu machine- all pegged)..
>>> no errors, no messages from the wrapper.
>>> Any ideas?
>>>
>>> How is it that wrapper code has the cpu cycles to run and know to
>>> restart the process if the CPU is at 100%?
>>>
>>> here are the settings in my properties files that would have any
>>> relationship to this at all.
>>>
>>> # How long to wait [seconds] between when the JVM says it has stopped
>>> # and seeing if the JVM has actually terminated.
>>> # A value of <0 means no timeout will be enforced.
>>> wrapper.jvm_exit.timeout=30
>>>
>>> # How long to wait for the cpu [seconds] before declaring it timed-out
>>> # a value of <0 means no timeout will be enforced.
>>> wrapper.cpu.timeout=10
>>>
>>> # How long [seconds] to allow between pings before considering the
>>> # VM timed-out
>>> # Max allowed time is 3600 sec or 1 hour.
>>> # Must be atleast 5 seconds longer than the ping interval.
>>> # Must be longer than the CPU timeout.
>>> # A value of <0 means no timeout will be enforced.
>>> wrapper.ping.timeout=300
>>>
>>> # How often [seconds] to send pings to the JVM to see if it is alive
>>> # Must be atleast 1 second.
>>> wrapper.ping.interval=5
>>>
>>> # How long [seconds] to wait before issuing reset of JVM- only applies
>>> # to resets, not to initial start.
>>> wrapper.restart.delay=10
>>>
>>> # Max number of times to restart(or start) invocation of JVM
>>> # must be atleast 1
>>> # Note- this only applies to startup attempts, not anything else
>>> # best to make atleast 2...
>>> wrapper.max_failed_invocations=3
>>>
>>> # How long [seconds] an application has to run to be considered
>>> # successfully invoked.
>>> # don't leave too long- since if we restart between processing and
>>> # have been running less than this time we will only be allowed
>>> # "max_failed_invocations" to restart.. and not the usual unlimited
>>> # number.
>>> wrapper.successful_invocation_time=2
>>>
>>> # How long [seconds] to allow for startup before declaring it timed-out
>>> # and restarting
>>> # Must be longer than the CPU timeout.
>>> # A value of <0 means no timeout will be enforced.
>>> wrapper.startup.timeout=30
>>>
>>> #whether to use system time or the internal tick timer
>>> # set to false to use new experimental tick timer
>>> # version 3.1.0 and later
>>> wrapper.use_system_time=FALSE
>>>
>>> # Whether you want a thread dump if the JVM failed to exit nicely
>>> wrapper.request_thread_dump_on_failed_jvm_exit=FALSE
>>>
>>> #Whether or not system signals should be ignored
>>> # If set to TRUE, CTRL-C will NOT stop the process....
>>> # Only System.exit (if shutdown hooks are not disabled) and internal
>>> # programatic stop commands will stop the service. (and of course
>>> # through the service control panel)
>>> wrapper.ignore_signals=TRUE
>>>
>>> #whether or not shutdown hooks should be ignored
>>> # Setting this to TRUE means System.exit will result in
>>> # a restart. Setting it to FALSE means System.exit will be treated
>>> # as a purposeful shutdown and actually exit.
>>> wrapper.disable_shutdown_hook=TRUE
>>>
>>>
>>> Thanks,
>>> Jennifer
>>>
>>> -------------------------------------------------------
>>> This SF.Net email is sponsored by: Oracle 10g
>>> Get certified on the hottest thing ever to hit the market... Oracle
>>> 10g. Take an Oracle 10g class now, and we'll give you the exam FREE.
>>> http://ads.osdn.com/?ad_id=3149&alloc_id=8166&op=click
>>> _______________________________________________
>>> Wrapper-user mailing list
>>> Wra...@li...
>>> https://lists.sourceforge.net/lists/listinfo/wrapper-user