|
From: Leif M. <le...@ta...> - 2004-11-25 00:12:41
|
Erik,
This looks like the same problem that had me stumped for about a
month. One of my
customers was experiencing this same problem.
After much debugging, the problem was being caused by the usleep()
system function.
Solaris appears to have a problem where the usleep function will not
always return if the
process receives a signal while sleeping. The fix was to switch from
using usleep() to
using nanosleep. This fix was implemented in version 3.1.2.
My customer had been experiencing the problem on a single machine
about once
every 72 hours. Since the fix was made it has been running for a
couple months without
any further problems.
Please try upgrading to 3.1.2 and giving it a try. You should not
need any other
configuration changes.
As a note, it appears that the usleep function would hang up until
the process received
another signal. At which point it would immediately start working
correctly again. The
problem is that the main event loop of the Wrapper was hung. It is
responsible for pinging
the JVM, processing JVM output etc.
In your log file, you see that log output stopped at '2004/11/22
13:02:14' and then
resumed at '2004/11/24 09:20:42' when the Wrapper received a signal.
The problem is
that at this point, the JVM had long given up and quit to let its
backend socket be
resynched by the Wrapper process. All related output from the JVM had
been queued
up in the Wrapper's pipe buffers for a couple days. This makes the JVM
appear to have
shutdown when the signal was received when in reality it had actualy
shutdown a couple
days earlier.
The only difference between your case and what I had seen was that
in your case,
the Wrapper is exiting. In my case, the Wrapper was launching a new
JVM after
recovering. This may have something to do with your WrapperListener
implementation.
I had been using the WrapperSimpleApp class.
Cheer,
Leif
Love, Erik wrote:
>Hi,
>
>I am using Wrapper 3.1.1 on a Sun Solaris 8 platform and using java 1.4.2_04-b05. The wrapper kicks off a scheduling program which runs great for a period of time and then without any warning or apparent cause goes into a defunct state. Most recently, it shut down the afternoon of the 22nd and didn't recover until the process was manually killed today by the unix admin. After the admin killed the process, the wrapper wrote the Wrapper Process has not received any CPU time for 159507 seconds message to the log...
>
>I have looked at the info regarding the new tick timer, but it doesn't seem to be the same issue. Any ideas would be greatly appreciated. I have included an excerpt from the log (with the debug option turned on)... A side note, we have another wrapper-based application that runs as an NT service. This application has been running since December 2003 and has never experienced this issue.
>
>2004/11/22 13:02:14 | send a packet PING : ping
>2004/11/22 13:02:14 | Received a packet PING : ping
>2004/11/22 13:02:14 | Send a packet PING : ok
>2004/11/22 13:02:14 | read a packet PING : ok
>2004/11/22 13:02:14 | Got ping response from JVM
>2004/11/24 09:20:42 | Signal trapped. Details:
>2004/11/24 09:20:42 | signal number=15, source="kill, sigsend or raise"
>2004/11/24 09:20:42 | signal generated by PID: 22857 (Session PID: 22813), UID: 0 (root)
>2004/11/24 09:20:42 | TERM trapped. Shutting down.
>2004/11/24 09:20:42 | wrapperStopProcess(0) called.
>2004/11/24 09:20:42 | Read Timed out. (Last Ping was 30000 milliseconds ago)
>2004/11/24 09:20:42 | Still alive
>2004/11/24 09:20:42 | Read Timed out. (Last Ping was 60000 milliseconds ago)
>2004/11/24 09:20:42 | Wrapper Manager: The Wrapper code did not ping the JVM for 60 seconds. Quit and let the Wrapper resynch.
>2004/11/24 09:20:42 | Send a packet RESTART : restart
>2004/11/24 09:20:42 | Thread, Wrapper-Connection, handling the shutdown process.
>2004/11/24 09:20:42 | calling listener.stop()
>2004/11/24 09:20:42 | Stopping the RC Scheduler Application
>2004/11/24 09:20:42 | returned from listener.stop()
>2004/11/24 09:20:42 | Send a packet STOPPED : 0
>2004/11/24 09:20:42 | Closing socket.
>2004/11/24 09:20:42 | calling System.exit(1)
>2004/11/24 09:20:42 | read a packet RESTART : restart
>2004/11/24 09:20:42 | JVM requested a restart.
>2004/11/24 09:20:42 | wrapperRestartProcess() called. (IGNORED)
>2004/11/24 09:20:42 | read a packet STOPPED : 0
>2004/11/24 09:20:42 | JVM signalled that it was stopped.
>2004/11/24 09:20:42 | socket read no code (closed?).
>2004/11/24 09:20:42 | Wrapper Process has not received any CPU time for 159507 seconds. Extending timeouts.
>2004/11/24 09:20:42 | JVM process exited with a code of 1, setting the wrapper exit code to 1.
>2004/11/24 09:20:42 | JVM exited normally.
>2004/11/24 09:20:42 | <-- Wrapper Stopped
>2004/11/24 09:20:45 | Spawning intermediate process...
>2004/11/24 09:20:45 | Spawning daemon process...
>2004/11/24 09:20:45 | --> Wrapper Started as Daemon
>2004/11/24 09:20:45 | Using system timer.
>2004/11/24 09:20:45 | server listening on port 32000.
>2004/11/24 09:20:45 | Classpath element, wrapper.java.classpath.11, does not exist: /apps/java/ira.jar
>2004/11/24 09:20:45 | Classpath element, wrapper.java.classpath.24, does not exist: /apps/java/output.txt
>2004/11/24 09:20:45 | Command[0] : java
>2004/11/24 09:20:45 | Command[1] : -Xms256m
>2004/11/24 09:20:45 | Command[2] : -Xmx768m
>2004/11/24 09:20:45 | Command[3] : -Djava.library.path=/apps/RCScheduler/lib
>2004/11/24 09:20:45 | Command[4] : -classpath
>2004/11/24 09:20:45 | Command[5] : ../lib/wrapper.jar:/apps/java/rcscheduler.jar:/apps/java/vantage.jar:/apps/java/vantageadmin.jar:/apps/java/rcmailings.jar:/apps/java/acme.jar:/apps/java/activation.jar:/apps/java/avalon.jar:/apps/java/batik.jar:/apps/java/fop.jar:/apps/java/ira.jar:/apps/java/itext.jar:/apps/java/j2ee.jar:/apps/java/jai_codec.jar:/apps/java/jakarta-oro-2.0.7.jar:/apps/java/jdom.jar:/apps/java/jlink.jar:/apps/java/log4j.jar:/apps/java/mail.jar:/apps/java/netcharts.jar:/apps/java/NetComponents-1_3_8.jar:/apps/java/ojdbc14.jar:/apps/java/oreilly.jar:/apps/java/output.txt:/apps/java/pja.jar:/apps/java/pjatools.jar:/apps/java/Tidy.jar:/apps/java/wrapper.jar:/apps/java/xalan.jar:/apps/java/xerces.jar:/apps/java/xml-apis.jar:/apps/java/xp-assets.jar:/apps/java/weblogic.jar
>2004/11/24 09:20:45 | Command[6] : -Dwrapper.key=4Nq8kvz4U0jqZd7N
>2004/11/24 09:20:45 | Command[7] : -Dwrapper.port=32000
>2004/11/24 09:20:45 | Command[8] : -Dwrapper.debug=TRUE
>2004/11/24 09:20:45 | Command[9] : -Dwrapper.use_system_time=TRUE
>2004/11/24 09:20:45 | Command[10] : -Dwrapper.version=3.1.1
>2004/11/24 09:20:45 | Command[11] : -Dwrapper.native_library=wrapper
>2004/11/24 09:20:45 | Command[12] : -Dwrapper.service=TRUE
>2004/11/24 09:20:45 | Command[13] : -Dwrapper.cpu.timeout=10
>2004/11/24 09:20:45 | Command[14] : -Dwrapper.jvmid=1
>2004/11/24 09:20:45 | Command[15] : org.icmarc.scheduler.application.Scheduler
>2004/11/24 09:20:45 | Command[16] : ./RCScheduler.cfg
>2004/11/24 09:20:45 | Launching a JVM...
>2004/11/24 09:20:46 | WrapperManager class initialized by thread: main Using classloader: sun.misc.Launcher$AppClassLoader@67ac19
>2004/11/24 09:20:46 | Wrapper Manager: JVM #1
>2004/11/24 09:20:46 | Wrapper Manager: Registering shutdown hook
>2004/11/24 09:20:46 | Wrapper Manager: Using wrapper
>2004/11/24 09:20:47 | Loaded native library: libwrapper.so
>2004/11/24 09:20:47 | Calling native initialization method.
>2004/11/24 09:20:47 | Inside native WrapperManager initialization method
>2004/11/24 09:20:47 | Java Version : 1.4.2_04-b05 Java HotSpot(TM) Client VM
>2004/11/24 09:20:47 | Java VM Vendor : Sun Microsystems Inc.
>2004/11/24 09:20:47 |
>2004/11/24 09:20:47 | Wrapper (Version 3.1.1) http://wrapper.tanukisoftware.org
>2004/11/24 09:20:47 |
>2004/11/24 09:20:47 | WrapperManager.start(org.icmarc.scheduler.application.Scheduler@e102dc, args["./RCScheduler.cfg"]) called by thread: main
>2004/11/24 09:20:47 | Open socket to wrapper...
>2004/11/24 09:20:47 | Opened Socket
>2004/11/24 09:20:47 | Send a packet KEY : 4Nq8kvz4U0jqZd7N
>2004/11/24 09:20:47 | handleSocket(Socket[addr=/127.0.0.1,port=32000,localport=36727])
>2004/11/24 09:20:47 | accepted a socket from 127.0.0.1 on port 36727
>2004/11/24 09:20:47 | read a packet KEY : 4Nq8kvz4U0jqZd7N
>2004/11/24 09:20:47 | Got key from JVM: 4Nq8kvz4U0jqZd7N
>2004/11/24 09:20:47 | send a packet LOW_LOG_LEVEL : 1
>2004/11/24 09:20:47 | send a packet PING_TIMEOUT : 30
>2004/11/24 09:20:47 | Start Application.
>2004/11/24 09:20:47 | send a packet START : start
>2004/11/24 09:20:47 | Received a packet LOW_LOG_LEVEL : 1
>2004/11/24 09:20:47 | Wrapper Manager: LowLogLevel from Wrapper is 1
>2004/11/24 09:20:47 | Received a packet PING_TIMEOUT : 30
>2004/11/24 09:20:47 | Wrapper Manager: PingTimeout from Wrapper is 30000
>2004/11/24 09:20:47 | Received a packet START : start
>2004/11/24 09:20:47 | calling listener.start()
>2004/11/24 09:20:48 | Initializing the RC Scheduler Application
>
>
>-------------------------------------------------------
>SF email is sponsored by - The IT Product Guide
>Read honest & candid reviews on hundreds of IT Products from real users.
>Discover which products truly live up to the hype. Start reading now.
>http://productguide.itmanagersjournal.com/
>_______________________________________________
>Wrapper-user mailing list
>Wra...@li...
>https://lists.sourceforge.net/lists/listinfo/wrapper-user
>
>
>
|