From: Leif M. <lei...@ta...> - 2019-01-09 02:39:08

Christoph,

1) Could you please send me the wrapper.log file with debug output enabled (wrapper.debug=true) that shows what is happening when the Wrapper fails to restart the JVM? Please include the part of the log showing the last few moments of the JVM that ran out of memory as well.

2) What version of the Wrapper are you running? The following issue was fixed in 3.5.16 and sounds like it might be what you are seeing.
https://wrapper.tanukisoftware.com/doc/english/release-notes.html#3.5.16
---
Fix a problem where a JVM process was not stopped completely on a UNIX platform and stayed defunct after a forced kill until the Wrapper process itself stopped. This was especially noticeable if the JVM was frozen and being killed forcibly.
---
Are you seeing a zombie Java process still running? This bug meant that the JVM was left around in the background when the Wrapper thought it was gone. If the machine is out of memory, the next JVM would then not have enough memory to launch. If the first JVM is not actually frozen, it will shut itself down after losing its backend connection to the Wrapper, but that may happen too late and result in what you are seeing.
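One quick way to check for such a leftover process is something like the following (a sketch only; the exact ps columns vary between platforms, but a zombie process shows a "Z" state and "<defunct>" in the command column):

    # List zombie (defunct) processes; a leftover JVM would show up
    # here as something like "[java] <defunct>".
    ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'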
3) The DOWN_CLEAN state means that the Wrapper has completely shut down the JVM and cleaned up any associated resources. We will take a look at the documentation on the following page, as you are correct that it is missing some information.
https://wrapper.tanukisoftware.com/doc/english/prop-java-statusfile.html

Cheers,
Leif

On Tue, Jan 8, 2019 at 8:32 PM Christoph SCHWAIGER <csc...@am...> wrote:
> CONFIDENTIAL & RESTRICTED
>
> Hello Leif,
>
> Thanks for the information about the subscription. I did so.
>
> We have been using the Wrapper on Windows for many years, and for the
> last couple of years we have had a standard support version.
>
> Our problem is on Linux RH. *After an out-of-memory situation (the JVM
> exited) it is not restarted and remains down indefinitely, and the
> status script exits with status zero*, so everything looks up to the
> cluster (it is integrated into a Veritas cluster). The OOM itself was
> not related to the JVM, but was caused by overly optimistic ulimits for
> the user; that has since been corrected.
>
> STATUS | wrapper | 2019/01/07 13:38:41 | Launching a JVM...
> INFO | jvm 1 | 2019/01/07 13:38:43 | WrapperManager: Initializing...
> INFO | jvm 1 | 2019/01/07 13:38:45 | S-Check version 3.0.4 Monte Rosa from 12-Sep-2018 08:02 by cschwaiger
> INFO | jvm 1 | 2019/01/07 13:38:45 | Scheck is starting on server MUCTXP5B
> INFO | jvm 1 | 2019/01/07 13:38:52 | parsed 1 xml files and created 0 service records.
> INFO | jvm 1 | 2019/01/07 15:02:11 | Exception in thread "InactivityMonitor WriteCheck" java.lang.OutOfMemoryError: unable to create new native thread
> STATUS | wrapper | 2019/01/07 15:02:11 | The JVM has run out of memory. Restarting JVM.
> INFO | jvm 1 | 2019/01/07 15:02:11 | at java.lang.Thread.start0(Native Method)
> INFO | jvm 1 | 2019/01/07 15:02:11 | at java.lang.Thread.start(Thread.java:717)
> INFO | jvm 1 | 2019/01/07 15:02:11 | at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> INFO | jvm 1 | 2019/01/07 15:02:11 | at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378)
> INFO | jvm 1 | 2019/01/07 15:02:11 | at org.apache.activemq.transport.InactivityMonitor.writeCheck(InactivityMonitor.java:147)
> INFO | jvm 1 | 2019/01/07 15:02:11 | at org.apache.activemq.transport.InactivityMonitor$2.run(InactivityMonitor.java:113)
> INFO | jvm 1 | 2019/01/07 15:02:11 | at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
> INFO | jvm 1 | 2019/01/07 15:02:11 | at java.util.TimerThread.mainLoop(Timer.java:555)
> INFO | jvm 1 | 2019/01/07 15:02:11 | at java.util.TimerThread.run(Timer.java:505)
> ERROR | wrapper | 2019/01/07 15:02:45 | Shutdown failed: Timed out waiting for signal from JVM.
> ERROR | wrapper | 2019/01/07 15:02:46 | JVM did not exit on request, termination requested.
> STATUS | wrapper | 2019/01/07 15:02:46 | JVM received a signal SIGKILL (9).
> STATUS | wrapper | 2019/01/07 15:02:46 | JVM process is gone.
> STATUS | wrapper | 2019/01/07 15:02:46 | JVM exited after being requested to terminate.
> STATUS | wrapper | 2019/01/07 15:02:50 | Reloading Wrapper configuration...
> STATUS | wrapper | 2019/01/07 15:02:50 | Launching a JVM...
>
> [scheck@muctxp5b scheck_unix11]$ ./scheck.sh status
> *Service check monitoring instance (not installed) is running: PID:56766, Wrapper:STARTED, Java:DOWN_CLEAN*
>
> I could not find the DOWN_CLEAN state documented; I looked at:
> https://wrapper.tanukisoftware.com/doc/english/prop-java-statusfile.html
>
> “scheck.sh stop” fails; it waits indefinitely for the Wrapper to stop.
> A simple kill <pid> terminates it.
>
> Any recommendations, i.e. measures to avoid hanging in the “looks good
> = status zero, but down” state?
>
> Below/attached is the information about the OS version and configuration.
>
> Thanks in advance,
> Christoph
>
> Linux muctxp5b 2.6.32-754.3.5.el6.x86_64 #1 SMP Thu Aug 9 11:56:22 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
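Until the underlying restart problem is resolved, one possible stopgap for the “looks good = status zero, but down” state is to have the cluster monitor parse the Java state out of the status output instead of trusting the script's exit code alone. A minimal sketch, assuming the status line format shown above and that a healthy instance reports Java:STARTED:

    #!/bin/sh
    # Extract the Java state (e.g. DOWN_CLEAN) from the status output
    # and fail the probe unless the JVM is actually up. Sketch only;
    # adjust the script path and the accepted states to your setup.
    state=$(./scheck.sh status | sed -n 's/.*Java:\([A-Z_]*\).*/\1/p')
    [ "$state" = "STARTED" ] || exit 1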