From: Leif M. <le...@ta...> - 2010-04-23 06:50:06
Tomer,

Thank you for the logs. I am looking through them. One thing that I do see is that your native library is not being loaded correctly:
---
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: WARNING - Unable to load the Wrapper's native library 'libwrapper.so'.
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: The file is located on the path at the following location but
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: could not be loaded:
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: /opt/javatest/./libwrapper.so
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: Please verify that the file is readable by the current user
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: and that the file has not been corrupted in any way.
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: One common cause of this problem is running a 32-bit version
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: of the Wrapper with a 64-bit version of Java, or vica versa.
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: This is a 32-bit JVM.
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: Reported cause:
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: /opt/javatest/libwrapper.so: ld.so.1: java: fatal: /opt/javatest/libwrapper.so: wrong ELF class: ELFCLASS64 (Possible cause: architecture word width misma$
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager: System signals will not be handled correctly.
---

There are a few things in the logs which are confusing me as well:

1) In the first post that you sent, there were log entries like the following stating that the Wrapper had not pinged the JVM for almost 5 minutes and that the socket read timed out. This actually made some sense:
---
INFO | jvm 1 | 2010/04/20 22:21:13 | WrapperManager Debug: Read Timed out. (Last Ping was 298400 milliseconds ago)
---
In this latest log file, however, this time has become 0 in all cases:
---
INFO | jvm 1 | 2010/04/22 18:31:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)
---
That time is calculated using the internal timing mechanism of the Wrapper, and the only way that it could be 0 is if the WrapperManager's Event Monitor thread was not running. I can see that the thread started because of the following line:
---
INFO | jvm 1 | 2010/04/22 16:21:24 | WrapperManager Debug: Control event monitor thread started.
---
There is no message that it was ever stopped, however, which doesn't make sense.

2) You have set your wrapper.ping.timeout=300. This means that regardless of what may go wrong in the JVM, the Wrapper should be killing the JVM if it fails to respond to a ping within 300 seconds. From the following subset of your logs, you can see that the Wrapper is allowing the JVM to run for MUCH longer than the configured 300 seconds:
---
DEBUG | wrapperp | 2010/04/22 18:26:19 | read a packet PING : ping
INFO | jvm 1 | 2010/04/22 18:31:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)
INFO | jvm 1 | 2010/04/22 18:36:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)
...
INFO | jvm 1 | 2010/04/22 19:21:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)
INFO | jvm 1 | 2010/04/22 19:26:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)
INFO | jvm 1 | 2010/04/22 19:31:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)
---
Because the ping timeout is 300 seconds, the WrapperManager's socket read is correctly timing out exactly every 300 seconds. I would question whether or not there was a problem in the Wrapper itself, except for the fact that the thread that checks for ping timeouts is the same thread that processes the JVM's log output. The log output is making it to the log file, so that thread is working correctly.
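
For anyone who wants to check the 300-second spacing on their own log file, here is a minimal sketch, assuming Java 8 or later. The class name ReadTimeoutGaps and its methods are made up for this illustration and are not part of the Wrapper. The field layout (level | source | timestamp | message) is assumed from the excerpts quoted in this message.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Wrapper: measures the spacing between
// consecutive "Read Timed out" entries in wrapper.log style lines.
public class ReadTimeoutGaps {
    // Timestamp layout assumed from the quoted log lines, e.g. 2010/04/22 19:21:19
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("uuuu/MM/dd HH:mm:ss");

    // The timestamp is assumed to be the third '|' separated field.
    static LocalDateTime timestampOf(String logLine) {
        String[] fields = logLine.split("\\|");
        return LocalDateTime.parse(fields[2].trim(), STAMP);
    }

    // Returns the gap in seconds between each pair of consecutive timeout lines.
    static List<Long> gapsInSeconds(List<String> logLines) {
        List<Long> gaps = new ArrayList<>();
        LocalDateTime previous = null;
        for (String line : logLines) {
            if (!line.contains("Read Timed out")) {
                continue; // ignore unrelated log output
            }
            LocalDateTime current = timestampOf(line);
            if (previous != null) {
                gaps.add(Duration.between(previous, current).getSeconds());
            }
            previous = current;
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "INFO | jvm 1 | 2010/04/22 19:21:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)",
            "INFO | jvm 1 | 2010/04/22 19:26:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)",
            "INFO | jvm 1 | 2010/04/22 19:31:19 | WrapperManager Debug: Read Timed out. (Last Ping was 0 milliseconds ago)");
        System.out.println(gapsInSeconds(lines)); // prints [300, 300]
    }
}
```

A gap that matches wrapper.ping.timeout on every entry, as it does here, only shows that the JVM-side socket read is timing out on schedule. It says nothing about why the Wrapper itself has stopped sending pings.
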

We are still spending a lot of time trying to get to the bottom of this, but because this is the Community Edition, I need to ask whether this is the binary that we released, or a version that you have compiled from source yourself, possibly after some modifications. Knowing that it is an unmodified binary would be helpful in tracking down the cause of this issue.

The code involved in the JVM has been unmodified since 2003 and there have been no other reports of problems when using the Tick Timer. The old System based timer was susceptible to high system load, but with the Tick Timer such loads slow down the rate of the ticks, which avoids encountering any timeouts. (This doesn't mean that there isn't a problem, just that you are the first case I have seen.)

We will run some tests using your test class on Solaris today as well.

Thanks in advance,

Cheers,
Leif

On Fri, Apr 23, 2010 at 1:52 AM, Tomer B <tom...@gm...> wrote:
> Hi,
>
> I created a dummy java app that prints the time every 5 seconds and
> allocates some memory.
> I ran it for some time and although it did not halt I got the same behaviour
> with manners of the jvm stopped receiving pings from wrapper after some
> time! (which is what happens to my server).
> I'm attaching the java app code + conf + logs
>
> Attaching a zip containing the whole thing.
<snip>