From: Leif M. <le...@ta...> - 2003-11-05 21:24:37
Bill, Paul,

>I am also working an issue where my app JVM exits for no apparent cause,
>anywhere between 15 minutes and more than 1 month after startup. I have
>not found the cause yet, but with lots of help from Leif, I believe the
>cause is related to a deadlock and I have settled into the following
>configuration:
>
>-Wrapper version 3.0.5 (earlier versions had a bug where the pause was
>short enough that sometimes the requested thread dump would be
>truncated).
>
>
Let me describe this a little more to avoid confusion. The Wrapper has
a feature where it can optionally request a thread dump just before
forcibly killing a JVM process which is not responding. In versions
prior to 3.0.5, the Wrapper only allowed 1 second for this dump to
complete. If the system was heavily loaded, this was not enough time
and only a partial thread dump would be logged. Version 3.0.5 increased
the wait time to 3 seconds, which has always appeared to be long enough.

That said, while doing some more testing related to Bill's problem, I
discovered that, due to the way I was implementing this, there was
still a chance of a truncated dump if the dump size combined with any
additional log output was larger than the buffer size of the pipe
between the Wrapper and the JVM. I was not able to reproduce a case
where this was a problem, but noticed the possibility while looking at
the code. The next release of the Wrapper will contain a fix for this
potential problem.

>-Additions to the wrapper configuration file:
> wrapper.request_thread_dump_on_failed_jvm_exit=true
> wrapper.debug=true
>
>
Try setting the wrapper.request_thread_dump_on_failed_jvm_exit=true
property in addition to what I asked you to set in the last email. It
may be useful if that failure mode is detected, but from what you said,
it sounds to me like the JVM process is crashing, so this would not
apply. It will not hurt to set it, however.

>-Set the loglevel for both the console and logfile to DEBUG.
>-Start the application from a console instead of from a service (wrapper
>-c ...). A thread dump can only be generated if the JVM is attached to a
>console.
>
>
All versions of the Wrapper up to and including version 3.0.5 had a
problem where the Wrapper process was not able to invoke a thread dump
in the JVM process when running as an NT service. This was because the
Windows API requires that a process share a console with any process
being sent a BREAK signal, and an NT service does not have a console by
default. Invoking a thread dump from within the JVM, using any of the
available methods, worked correctly. This was only an issue with the
wrapper.request_thread_dump_on_failed_jvm_exit=true property.

This problem too has been fixed for the next release. The Wrapper now
has the ability to display a text console when running as an NT
service. When the above property is set, the console is created, but
hidden, making it possible for the thread dump to be requested.

>I am waiting for the next failure to generate a complete thread dump, so
>I cannot tell you how helpful this will be, but I am hopeful.
>
>One other thing. You mention garbage collection (GC). There are times
>during GC when the JVM simply stops doing anything else, including
>servicing Wrapper pings. These pauses can vary in length, depending on a
>number of things including the GC strategy you are employing. One
>condition that has a dramatic effect on GC pause time is whether or not
>any of the JVM is in virtual memory. In my app, when any part of the JVM
>goes into virtual memory, GC pause time can increase by two orders of
>magnitude. I have seen pause times of over 90 seconds, which, depending
>on your wrapper settings, is more than enough to cause the wrapper to
>restart.
>
>
Good luck. Let me know when you get log output. Please send the log to
me directly in a zip file, and then post the same reply minus the log
file to the list.

Cheers,
Leif