Menu

#298 wrapperp stop pinging back to the JVM on Linux

v3.4.0
open
nobody
Linux (1)
4
2013-07-29
2013-07-26
Luke Xu
No

Hi,

We encountered some problem using java service wrapper to run one of our java dispatcher program. Everything used to be running without problems. But last Friday, we deployed the project planning software 2plan on the box we run wrapper, then the problem occurred.

The wrapper can be started without problem. However soon the java program it started was stopped.

We turned on the debug wrapper log. In the beginning everything is well, the wrapper and the JVM was pinging each other for monitoring. Then in less than 5 minutes, the wrapper would receive a ping package from the JVM, but never ping back. Thus the JVM found itself lost the wrapper, stopped itself, signal the wrapper and wait for restart.

In the best situation the JVM would be restart by 5 times for and the wrapper exit. But most of the time, the wrapper just printed out a line saying restart the JVM 5 seconds later and fell into sleep. A typical log piece ended like the following.

DEBUG | wrapperp | 2013/07/26 17:26:41 | read a packet PING : ping
DEBUG | wrapperp | 2013/07/26 17:26:45 | send a packet PING : ping
INFO | jvm 2 | 2013/07/26 17:26:45 | WrapperManager Debug: Received a packet PING : ping
INFO | jvm 2 | 2013/07/26 17:26:45 | WrapperManager Debug: Send a packet PING : ping
DEBUG | wrapperp | 2013/07/26 17:26:45 | read a packet PING : ping
DEBUG | wrapper | 2013/07/26 17:27:46 | Signal trapped. Details:
DEBUG | wrapper | 2013/07/26 17:27:46 | signal number=17 (SIGCHLD), source="unknown"
DEBUG | wrapper | 2013/07/26 17:27:46 | Received SIGCHLD, checking JVM process status.
DEBUG | wrapper | 2013/07/26 17:27:46 | JVM process exited with a code of 0, leaving the wrapper exit code set to 0.
ERROR | wrapper | 2013/07/26 17:27:46 | JVM exited unexpectedly.
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: Read Timed out. (Last Ping was 29400 milliseconds ago)
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: Read Timed out. (Last Ping was 58800 milliseconds ago)
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager: The Wrapper code did not ping the JVM for 58 seconds. Quit and let the Wrapper resynch.
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: Send a packet RESTART : restart
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: Thread, Wrapper-Connection, handling the shutdown process.
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: calling listener.stop()
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: Waiting for WrapperListener.stop runner thread to complete.
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: WrapperListener.stop runner thread started.
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperStartStopApp Debug: stop(1)
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperStartStopApp Debug: invoking stop main method
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperStartStopApp Debug: stop main method completed
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: WrapperListener.stop runner thread stopped.
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: returned from listener.stop() -> 1
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: shutdownJVM(1) Thread:Wrapper-Connection
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: wait for 0 shutdown locs to be released.
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: Send a packet STOPPED : 1
INFO | jvm 2 | 2013/07/26 17:27:46 | WrapperManager Debug: Stopped checking for control events.
DEBUG | wrapper | 2013/07/26 17:27:46 | JVM was only running for 136 seconds leading to a failed restart count of 2.
DEBUG | wrapper | 2013/07/26 17:27:46 | Waiting 5 seconds before launching another JVM.

The whole piece of log is attached. The system is running on a vm box at our IT Infrastructure.
Linux hvsnite2 2.6.18-53.el5 #1 SMP Wed Oct 10 16:34:02 EDT 2007 i686 i686 i386 GNU/Linux

Thanks in advance for your help.

BR
Luke

1 Attachments

Discussion

  • Leif Mortenson

    Leif Mortenson - 2013-07-29

    Luke,
    Looking at the log, the Wrapper is receiving a SIGCHLD signal from someplace and mistakenly thinking that the Java process has terminated. This is causing the Wrapper to proceed with trying to cleanup after the JVM and closing down the backend socket, etc.

    3.4.0 is very old version of the Wrapper. I know we have made several changes in this area. Would it be possible for you to try this with a newer version of the Wrapper, preferably 3.5.20?

    If not, please let me know and we will try to reproduce this.

    Cheers,
    Leif

     
    • Luke Xu

      Luke Xu - 2013-07-29

      Hi Leif,

      The signal trap happened 17:27:46 which was one minute later than the last ping package read by wrapperp. I think it was thrown out by the JVM. It was printed in log first only because of the multi-processes racing condition, I think.

      About to update to a new version, I'm afraid our existing license is out of the time of free updating. Can we get a license of trail to see if the issue resolved in the new version before we make further decision?

      BR
      Luke

       

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.