|
From: Love, E. <EB...@ic...> - 2004-11-26 11:28:29
|
Thanks, Leif! I will give that a shot and let you know how it works =
out.
-----Original Message-----
From: wra...@li...
[mailto:wra...@li...]On Behalf Of Leif
Mortenson
Sent: Wednesday, November 24, 2004 7:12 PM
To: wra...@li...
Subject: Re: [Wrapper-user] <DEFUNCT> java processes on Solaris
Erik,
This looks like the same problem that had me stumped for about a=20
month. One of my
customers was experiencing this same problem.
After much debugging, the problem was being caused by the usleep()=20
system function.
Solaris appears to have a problem where the usleep function will not=20
always return if the
process receives a signal while sleeping. The fix was to switch from=20
using usleep() to
using nanosleep. This fix was implemented in version 3.1.2.
My customer had been experiencing the problem on a single machine=20
about once
every 72 hours. Since the fix was made it has been running for a=20
couple months without
any further problems.
Please try upgrading to 3.1.2 and giving it a try. You should not=20
need any other
configuration changes.
As a note, it appears that the usleep function would hang up until=20
the process received
another signal. At which point it would immediately start working=20
correctly again. The
problem is that the main event loop of the Wrapper was hung. It is=20
responsible for pinging
the JVM, processing JVM output etc.
In your log file, you see that log output stopped at '2004/11/22=20
13:02:14' and then
resumed at '2004/11/24 09:20:42' when the Wrapper received a signal. =20
The problem is
that at this point, the JVM had long given up and quit to let its=20
backend socket be
resynched by the Wrapper process. All related output from the JVM had=20
been queued
up in the Wrapper's pipe buffers for a couple days. This makes the JVM=20
appear to have
shutdown when the signal was received when in reality it had actualy=20
shutdown a couple
days earlier.
The only difference between your case and what I had seen was that=20
in your case,
the Wrapper is exiting. In my case, the Wrapper was launching a new=20
JVM after
recovering. This may have something to do with your WrapperListener=20
implementation.
I had been using the WrapperSimpleApp class.
Cheer,
Leif
Love, Erik wrote:
>Hi,
>
>I am using Wrapper 3.1.1 on a Sun Solaris 8 platform and using java =
1.4.2_04-b05. The wrapper kicks off a scheduling program which runs =
great for a period of time and then without any warning or apparent =
cause goes into a defunct state. Most recently, it shut down the =
afternoon of the 22nd and didn't recover until the process was manually =
killed today by the unix admin. After the admin killed the process, the =
wrapper wrote the Wrapper Process has not received any CPU time for =
159507 seconds message to the log...
>
>I have looked at the info regarding the new tick timer, but it doesn't =
seem to be the same issue. Any ideas would be greatly appreciated. I =
have included an excerpt from the log (with the debug option turned =
on)... A side note, we have another wrapper-based application that runs =
as an NT service. This application has been running since December 2003 =
and has never experienced this issue.
>
>2004/11/22 13:02:14 | send a packet PING : ping
>2004/11/22 13:02:14 | Received a packet PING : ping
>2004/11/22 13:02:14 | Send a packet PING : ok
>2004/11/22 13:02:14 | read a packet PING : ok
>2004/11/22 13:02:14 | Got ping response from JVM
>2004/11/24 09:20:42 | Signal trapped. Details:
>2004/11/24 09:20:42 | signal number=3D15, source=3D"kill, sigsend or =
raise"
>2004/11/24 09:20:42 | signal generated by PID: 22857 (Session PID: =
22813), UID: 0 (root)
>2004/11/24 09:20:42 | TERM trapped. Shutting down.
>2004/11/24 09:20:42 | wrapperStopProcess(0) called.
>2004/11/24 09:20:42 | Read Timed out. (Last Ping was 30000 milliseconds =
ago)
>2004/11/24 09:20:42 | Still alive
>2004/11/24 09:20:42 | Read Timed out. (Last Ping was 60000 milliseconds =
ago)
>2004/11/24 09:20:42 | Wrapper Manager: The Wrapper code did not ping =
the JVM for 60 seconds. Quit and let the Wrapper resynch.
>2004/11/24 09:20:42 | Send a packet RESTART : restart
>2004/11/24 09:20:42 | Thread, Wrapper-Connection, handling the shutdown =
process.
>2004/11/24 09:20:42 | calling listener.stop()
>2004/11/24 09:20:42 | Stopping the RC Scheduler Application
>2004/11/24 09:20:42 | returned from listener.stop()
>2004/11/24 09:20:42 | Send a packet STOPPED : 0
>2004/11/24 09:20:42 | Closing socket.
>2004/11/24 09:20:42 | calling System.exit(1)
>2004/11/24 09:20:42 | read a packet RESTART : restart
>2004/11/24 09:20:42 | JVM requested a restart.
>2004/11/24 09:20:42 | wrapperRestartProcess() called. (IGNORED)
>2004/11/24 09:20:42 | read a packet STOPPED : 0
>2004/11/24 09:20:42 | JVM signalled that it was stopped.
>2004/11/24 09:20:42 | socket read no code (closed?).
>2004/11/24 09:20:42 | Wrapper Process has not received any CPU time for =
159507 seconds. Extending timeouts.
>2004/11/24 09:20:42 | JVM process exited with a code of 1, setting the =
wrapper exit code to 1.
>2004/11/24 09:20:42 | JVM exited normally.
>2004/11/24 09:20:42 | <-- Wrapper Stopped
>2004/11/24 09:20:45 | Spawning intermediate process...
>2004/11/24 09:20:45 | Spawning daemon process...
>2004/11/24 09:20:45 | --> Wrapper Started as Daemon
>2004/11/24 09:20:45 | Using system timer.
>2004/11/24 09:20:45 | server listening on port 32000.
>2004/11/24 09:20:45 | Classpath element, wrapper.java.classpath.11, =
does not exist: /apps/java/ira.jar
>2004/11/24 09:20:45 | Classpath element, wrapper.java.classpath.24, =
does not exist: /apps/java/output.txt
>2004/11/24 09:20:45 | Command[0] : java
>2004/11/24 09:20:45 | Command[1] : -Xms256m
>2004/11/24 09:20:45 | Command[2] : -Xmx768m
>2004/11/24 09:20:45 | Command[3] : =
-Djava.library.path=3D/apps/RCScheduler/lib
>2004/11/24 09:20:45 | Command[4] : -classpath
>2004/11/24 09:20:45 | Command[5] : =
../lib/wrapper.jar:/apps/java/rcscheduler.jar:/apps/java/vantage.jar:/app=
s/java/vantageadmin.jar:/apps/java/rcmailings.jar:/apps/java/acme.jar:/ap=
ps/java/activation.jar:/apps/java/avalon.jar:/apps/java/batik.jar:/apps/j=
ava/fop.jar:/apps/java/ira.jar:/apps/java/itext.jar:/apps/java/j2ee.jar:/=
apps/java/jai_codec.jar:/apps/java/jakarta-oro-2.0.7.jar:/apps/java/jdom.=
jar:/apps/java/jlink.jar:/apps/java/log4j.jar:/apps/java/mail.jar:/apps/j=
ava/netcharts.jar:/apps/java/NetComponents-1_3_8.jar:/apps/java/ojdbc14.j=
ar:/apps/java/oreilly.jar:/apps/java/output.txt:/apps/java/pja.jar:/apps/=
java/pjatools.jar:/apps/java/Tidy.jar:/apps/java/wrapper.jar:/apps/java/x=
alan.jar:/apps/java/xerces.jar:/apps/java/xml-apis.jar:/apps/java/xp-asse=
ts.jar:/apps/java/weblogic.jar
>2004/11/24 09:20:45 | Command[6] : -Dwrapper.key=3D4Nq8kvz4U0jqZd7N
>2004/11/24 09:20:45 | Command[7] : -Dwrapper.port=3D32000
>2004/11/24 09:20:45 | Command[8] : -Dwrapper.debug=3DTRUE
>2004/11/24 09:20:45 | Command[9] : -Dwrapper.use_system_time=3DTRUE
>2004/11/24 09:20:45 | Command[10] : -Dwrapper.version=3D3.1.1
>2004/11/24 09:20:45 | Command[11] : -Dwrapper.native_library=3Dwrapper
>2004/11/24 09:20:45 | Command[12] : -Dwrapper.service=3DTRUE
>2004/11/24 09:20:45 | Command[13] : -Dwrapper.cpu.timeout=3D10
>2004/11/24 09:20:45 | Command[14] : -Dwrapper.jvmid=3D1
>2004/11/24 09:20:45 | Command[15] : =
org.icmarc.scheduler.application.Scheduler
>2004/11/24 09:20:45 | Command[16] : ./RCScheduler.cfg
>2004/11/24 09:20:45 | Launching a JVM...
>2004/11/24 09:20:46 | WrapperManager class initialized by thread: main =
Using classloader: sun.misc.Launcher$AppClassLoader@67ac19
>2004/11/24 09:20:46 | Wrapper Manager: JVM #1
>2004/11/24 09:20:46 | Wrapper Manager: Registering shutdown hook
>2004/11/24 09:20:46 | Wrapper Manager: Using wrapper
>2004/11/24 09:20:47 | Loaded native library: libwrapper.so
>2004/11/24 09:20:47 | Calling native initialization method.
>2004/11/24 09:20:47 | Inside native WrapperManager initialization =
method
>2004/11/24 09:20:47 | Java Version : 1.4.2_04-b05 Java HotSpot(TM) =
Client VM
>2004/11/24 09:20:47 | Java VM Vendor : Sun Microsystems Inc.
>2004/11/24 09:20:47 |=20
>2004/11/24 09:20:47 | Wrapper (Version 3.1.1) =
http://wrapper.tanukisoftware.org
>2004/11/24 09:20:47 |=20
>2004/11/24 09:20:47 | =
WrapperManager.start(org.icmarc.scheduler.application.Scheduler@e102dc, =
args["./RCScheduler.cfg"]) called by thread: main
>2004/11/24 09:20:47 | Open socket to wrapper...
>2004/11/24 09:20:47 | Opened Socket
>2004/11/24 09:20:47 | Send a packet KEY : 4Nq8kvz4U0jqZd7N
>2004/11/24 09:20:47 | =
handleSocket(Socket[addr=3D/127.0.0.1,port=3D32000,localport=3D36727])
>2004/11/24 09:20:47 | accepted a socket from 127.0.0.1 on port 36727
>2004/11/24 09:20:47 | read a packet KEY : 4Nq8kvz4U0jqZd7N
>2004/11/24 09:20:47 | Got key from JVM: 4Nq8kvz4U0jqZd7N
>2004/11/24 09:20:47 | send a packet LOW_LOG_LEVEL : 1
>2004/11/24 09:20:47 | send a packet PING_TIMEOUT : 30
>2004/11/24 09:20:47 | Start Application.
>2004/11/24 09:20:47 | send a packet START : start
>2004/11/24 09:20:47 | Received a packet LOW_LOG_LEVEL : 1
>2004/11/24 09:20:47 | Wrapper Manager: LowLogLevel from Wrapper is 1
>2004/11/24 09:20:47 | Received a packet PING_TIMEOUT : 30
>2004/11/24 09:20:47 | Wrapper Manager: PingTimeout from Wrapper is =
30000
>2004/11/24 09:20:47 | Received a packet START : start
>2004/11/24 09:20:47 | calling listener.start()
>2004/11/24 09:20:48 | Initializing the RC Scheduler Application
>
>
>-------------------------------------------------------
>SF email is sponsored by - The IT Product Guide
>Read honest & candid reviews on hundreds of IT Products from real =
users.
>Discover which products truly live up to the hype. Start reading now.=20
>http://productguide.itmanagersjournal.com/
>_______________________________________________
>Wrapper-user mailing list
>Wra...@li...
>https://lists.sourceforge.net/lists/listinfo/wrapper-user
>
> =20
>
-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.=20
http://productguide.itmanagersjournal.com/
_______________________________________________
Wrapper-user mailing list
Wra...@li...
https://lists.sourceforge.net/lists/listinfo/wrapper-user
|