#270 Erroneous JVM Requests restart

v3.3.1
open
nobody
5
2012-12-11
2009-12-21
Anonymous
No

I am having this weird problem in my production environment where the wrapper thinks that the JVM has requested a restart.
Is there a mechanism by which the JVM requests a restart outside the management interface?
Nowhere in my code do I call or use the WrapperManagerMBean interface.
Any information would be most helpful.

Here is the relevant snippet from my wrapper log:

EBUG | wrapperp | 2009/12/19 12:57:20 | send a packet PING : ping
INFO | jvm 1 | 2009/12/19 12:57:20 | WrapperManager Debug: Received a packet PING : ping
INFO | jvm 1 | 2009/12/19 12:57:20 | WrapperManager Debug: Send a packet PING : ping
DEBUG | wrapperp | 2009/12/19 12:57:20 | read a packet PING : ping
DEBUG | wrapperp | 2009/12/19 12:58:21 | send a packet PING : ping
INFO | jvm 1 | 2009/12/19 12:58:22 | WrapperManager Debug: Received a packet PING : ping
INFO | jvm 1 | 2009/12/19 12:58:22 | WrapperManager Debug: Send a packet PING : ping
DEBUG | wrapperp | 2009/12/19 12:58:22 | read a packet PING : ping
DEBUG | wrapperp | 2009/12/19 12:59:23 | send a packet PING : ping
INFO | jvm 1 | 2009/12/19 12:59:23 | WrapperManager Debug: Received a packet PING : ping
DEBUG | wrapper | 2009/12/19 13:00:01 | Pause reading child output to share cycles.
DEBUG | wrapperp | 2009/12/19 13:00:01 | read a packet PING : ping
DEBUG | wrapperp | 2009/12/19 13:00:01 | read a packet RESTART : restart
STATUS | wrapper | 2009/12/19 13:00:01 | JVM requested a restart.
DEBUG | wrapper | 2009/12/19 13:00:01 | wrapperRestartProcess() called.
DEBUG | wrapperp | 2009/12/19 13:00:01 | read a packet STOPPED : 1
DEBUG | wrapper | 2009/12/19 13:00:01 | JVM signalled that it was stopped.
DEBUG | wrapperp | 2009/12/19 13:00:01 | socket read no code (closed?).
INFO | wrapper | 2009/12/19 13:00:01 | Wrapper Process has not received any CPU time for 37 seconds. Extending timeouts.
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: Send a packet PING : ping
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: Read Timed out. (Last Ping was 30300 milliseconds ago)
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager: The Wrapper code did not ping the JVM for 30 seconds. Quit and let the Wrapper resynch.
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: Send a packet RESTART : restart
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: Thread, Wrapper-Connection, handling the shutdown process.
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: calling listener.stop()
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: Waiting for WrapperListener.stop runner thread to complete.
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: WrapperListener.stop runner thread started.
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperSimpleApp Debug: stop(1)
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: WrapperListener.stop runner thread stopped.
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: returned from listener.stop() -> 1
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: shutdownJVM(1) Thread:Wrapper-Connection
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: Send a packet STOPPED : 1
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: Closing socket.
INFO | jvm 1 | 2009/12/19 13:00:01 | WrapperManager Debug: calling System.exit(1)

Discussion

  • Christian Mueller

    Hello,

    the wrapper seems to behave normally in your logfile. your system seems to be frozen or under heavy load around 2009/12/19 12:59:23 and therefore the JVM does not respond anymore. The default timeout for the wrapper to detect a frozen is 30 sec.
    As a work around, you could extend the wrapper.ping.timeout to a large
    enough value to make sure that this is not encountered again. Try
    something like this to extend the timeout from 30 seconds to 5
    minutes:
    wrapper.ping.timeout=300

    The only drawback in doing this is that the Wrapper will never decide
    that the JVM is frozen and restart it until the ping timeout has
    expired. This means that JVM service could be frozen and unavailable
    for up to 5 minutes. If the JVM process crashes and goes away, then
    the Wrapper will still restart it almost instantly.
    Regardless of the ping timeout setting, you will still see the
    "Wrapper Process has not received any CPU time for " messages in the
    logs whenever the Wrapper process is starved of CPU for more than 5
    seconds.

    Could you please do a quick grep of all of your wrapper.log files ans
    see if the above message has happened at other times as well? It may
    be useful to see when this is happening to try and track down what on
    the system could be causing it.

    Best regards,
    Christian

     
  • Comment has been marked as spam. 
    Undo

    You can see all pending comments posted by this user  here

    Anonymous

    Anonymous - 2009-12-23

    Can someone help me on this?
    I work for RBS Sempra Commodities My company has bought 20+ professional licenses and I believe some level of support should come with these

     
  • Comment has been marked as spam. 
    Undo

    You can see all pending comments posted by this user  here

    Anonymous

    Anonymous - 2009-12-23

    Sorry christian - I didn't see your post before writing that last message.
    However looking at this log the section starting at 13:00:01 is what has me confused.

    First the wrapper reads a ping so the JVM is alive and well. Then in the same second this message comes across:

    DEBUG | wrapperp | 2009/12/19 13:00:01 | read a packet RESTART : restart
    STATUS | wrapper | 2009/12/19 13:00:01 | JVM requested a restart.
    DEBUG | wrapper | 2009/12/19 13:00:01 | wrapperRestartProcess() called.

    It isn't until after it goes into the restart process that we see the timeout and subsequent CPU message.

    If it was just a straight timeout I would expect to see something like this in the logs:
    DEBUG | wrapperp | 2009/12/23 03:46:44 | read a packet PING : ping
    DEBUG | wrapperp | 2009/12/23 03:47:46 | send a packet PING : ping
    ERROR | wrapper | 2009/12/23 03:48:04 | JVM appears hung: Timed out waiting for signal from JVM.
    ERROR | wrapper | 2009/12/23 03:48:04 | JVM did not exit on request, terminated
    STATUS | wrapper | 2009/12/23 03:48:05 | JVM Restarts disabled. Shutting down.
    STATUS | wrapper | 2009/12/23 03:48:05 | <-- Wrapper Stopped

    The wrapper service is wrapping a Gigaspaces Grid Service Container. Unfortunately if the JVM is restarted Gigaspaces will not reprovision the application to take advantage of all the JVMs that are registered with it. You have to redeploy the application for it to "scale" back up to the number of available JVMs (at least in the version we are running currently). Because of this I have turned on disable_restarts so I know when my software is only running on a portion of the allocated hardware.

     
  • Christian Mueller

    Hello,

    Having a closer look at your log file, there are some strange things.

    It says that the Wrapper received a PING and then a RESTART packet at 2009/12/19 13:00:01 but there is no record of the RESTART being sent from the JVM. The last ping from the JVM was sent at 2009/12/19 12:59:23. So it tool 38 seconds for it to be received?

    The "Pause reading child output to share cycles." message just before that is normally sent when very large quanities of log output is sent form the JVM console without any breaks. That was not the case here, so I am not sure why that message was displayed.

    Could you send me the complete log file from that time via mail?
    support@tanukisoftware.com

    Best Regards,
    Christian

     

Log in to post a comment.