From: Steve L. (JIRA) <ji...@sm...> - 2010-01-19 10:59:25
[ http://jira.smartfrog.org/jira/browse/SFOS-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12016#action_12016 ]

Steve Loughran commented on SFOS-1414:
--------------------------------------

stack

[sf-startdaemon-debug] 2010/01/18 17:16:33:256 GMT [INFO ][RMI TCP Connection(4)-16.25.175.158] HOST morzine.hpl.hp.com:rootProcess:mombasa - Deploying application worker to su...@wo...:22
[sf-startdaemon-debug] 2010/01/19 10:46:24:531 GMT [ERROR][TerminatorThread] SFCORE_LOG - TerminatorThread.sfTerminateQuietlyWith failed [Termination Record: HOST morzine.hpl.hp.com:rootProcess:mombasa:subprocess:subprocess, type: abnormal, description: Liveness Send Failure in HOST morzine.hpl.hp.com:rootProcess:mombasa:subprocess:subprocess when calling null (Failure: Exception creating connection to: 16.25.175.158; nested exception is:
[sf-startdaemon-debug] java.net.SocketException: Network is unreachable)
[sf-startdaemon-debug] java.rmi.ConnectIOException: Exception creating connection to: 16.25.175.158; nested exception is:
[sf-startdaemon-debug] java.net.SocketException: Network is unreachable
[sf-startdaemon-debug]     at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:614)
[sf-startdaemon-debug]     at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198)
[sf-startdaemon-debug]     at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184)
[sf-startdaemon-debug]     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:110)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl_Stub.sfPing(Unknown Source)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChild(CompoundImpl.java:874)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChildAndTerminateOnFailure(CompoundImpl.java:857)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChildren(CompoundImpl.java:844)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPing(CompoundImpl.java:831)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChild(CompoundImpl.java:874)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChildAndTerminateOnFailure(CompoundImpl.java:857)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChildren(CompoundImpl.java:844)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPing(CompoundImpl.java:831)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChild(CompoundImpl.java:874)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChildAndTerminateOnFailure(CompoundImpl.java:857)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPingChildren(CompoundImpl.java:844)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl.sfPing(CompoundImpl.java:831)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.prim.LivenessSender.timerTick(LivenessSender.java:61)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.common.Timer.doTick(Timer.java:143)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.common.Timer.run(Timer.java:170)
[sf-startdaemon-debug]     at java.lang.Thread.run(Thread.java:619)
[sf-startdaemon-debug] Caused by: java.net.SocketException: Network is unreachable
[sf-startdaemon-debug]     at java.net.PlainSocketImpl.socketConnect(Native Method)
[sf-startdaemon-debug]     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
[sf-startdaemon-debug]     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
[sf-startdaemon-debug]     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
[sf-startdaemon-debug]     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
[sf-startdaemon-debug]     at java.net.Socket.connect(Socket.java:525)
[sf-startdaemon-debug]     at java.net.Socket.connect(Socket.java:475)
[sf-startdaemon-debug]     at java.net.Socket.<init>(Socket.java:372)
[sf-startdaemon-debug]     at java.net.Socket.<init>(Socket.java:186)
[sf-startdaemon-debug]     at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
[sf-startdaemon-debug]     at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:143)
[sf-startdaemon-debug]     at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595)
[sf-startdaemon-debug]     ... 20 more
[sf-startdaemon-debug] ] <java.rmi.ConnectIOException: Exception creating connection to: 16.25.175.158; nested exception is:
[sf-startdaemon-debug] java.net.SocketException: Network is unreachable>
[sf-startdaemon-debug] java.rmi.ConnectIOException: Exception creating connection to: 16.25.175.158; nested exception is:
[sf-startdaemon-debug] java.net.SocketException: Network is unreachable
[sf-startdaemon-debug]     at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:614)
[sf-startdaemon-debug]     at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198)
[sf-startdaemon-debug]     at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184)
[sf-startdaemon-debug]     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:110)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.compound.CompoundImpl_Stub.sfTerminateQuietlyWith(Unknown Source)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.common.TerminatorThread.execute(TerminatorThread.java:211)
[sf-startdaemon-debug]     at org.smartfrog.sfcore.utils.SmartFrogThread.run(SmartFrogThread.java:278)
[sf-startdaemon-debug] Caused by: java.net.SocketException: Network is unreachable
[sf-startdaemon-debug]     at java.net.PlainSocketImpl.socketConnect(Native Method)
[sf-startdaemon-debug]     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
[sf-startdaemon-debug]     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
[sf-startdaemon-debug]     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
[sf-startdaemon-debug]     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
[sf-startdaemon-debug]     at java.net.Socket.connect(Socket.java:525)
[sf-startdaemon-debug]     at java.net.Socket.connect(Socket.java:475)
[sf-startdaemon-debug]     at java.net.Socket.<init>(Socket.java:372)
[sf-startdaemon-debug]     at java.net.Socket.<init>(Socket.java:186)
[sf-startdaemon-debug]     at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
[sf-startdaemon-debug]     at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:143)
[sf-startdaemon-debug]     at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595)
[sf-startdaemon-debug]     ... 6 more
[sf-startdaemon-debug] 2010/01/19 10:46:24:532 GMT [ERROR][TerminatorThread] SFCORE_LOG - This may be harmless -and caused by the far end closing down
[sf-startdaemon-debug] 2010/01/19 10:48:54:387 GMT [INFO ][LivenessSender_HOST morzine.hpl.hp.com:rootProcess:mombasa:*unknown*:*unknown*:*unknown*] HOST morzine.hpl.hp.com:rootProcess:mombasa - LivenessFailure as parent liveness checking had counted down: in HOST morzine.hpl.hp.com:rootProcess:mombasa:subprocess:subprocess:jetty
[sf-startdaemon-debug] 2010/01/19 10:48:54:549 GMT [INFO ][LivenessSender_HOST morzine.hpl.hp.com:rootProcess:mombasa:*unknown*:*unknown*:*unknown*] HOST morzine.hpl.hp.com:rootProcess:mombasa - Terminating Jetty servlet context

> Liveness failure when a suspended machine resumes
> -------------------------------------------------
>
>                 Key: SFOS-1414
>                 URL: http://jira.smartfrog.org/jira/browse/SFOS-1414
>             Project: SmartFrog
>          Issue Type: Bug
>          Components: .sfCore
>    Affects Versions: 3.17.x
>         Environment: Linux, suspended through the gnome power manager
>            Reporter: Steve Loughran
>            Assignee: Julio Guijarro
>
> Liveness handling on a single machine failed when the workstation was suspended overnight and then resumed. The stack trace complains about network errors.
> Possible causes:
> * It's been 18 hours since the last successful ping from the parent process, so the liveness check assumes failure.
> * The network is not fully resumed yet.
> It may be tricky to detect this, as Java doesn't forward the system's power events to applications. Perhaps the root processes could detect a possible power cycle: a very long gap since anything happened, without any earlier timeouts (for example, if it is suddenly 3 h since the last ping). If the process is very, very late on liveness then it can't have been checking beforehand, which is only possible if the host was suspended or the process was somehow suspended (debugger, control-Z, ...).

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.smartfrog.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
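
The detection idea in the issue description (a very long gap since the last tick, with no earlier timeouts, suggests a suspend rather than a real liveness failure) could be sketched roughly as below. This is only an illustration: SuspendDetector, expectedIntervalMillis and gapFactor are hypothetical names, not part of SmartFrog, and the threshold is an assumption that would need tuning.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper, not part of SmartFrog: flags a timer tick that arrives
 * implausibly late, which suggests the host or the process was suspended.
 */
final class SuspendDetector {
    private final long expectedIntervalMillis; // normal gap between ticks
    private final long gapFactor;              // how many intervals late counts as "suspended" (assumption)
    private final LongSupplier clock;          // wall-clock source in milliseconds, injectable for tests
    private long lastTickMillis;

    SuspendDetector(long expectedIntervalMillis, long gapFactor, LongSupplier clock) {
        if (expectedIntervalMillis <= 0 || gapFactor <= 0) {
            throw new IllegalArgumentException("interval and factor must be positive");
        }
        this.expectedIntervalMillis = expectedIntervalMillis;
        this.gapFactor = gapFactor;
        this.clock = clock;
        this.lastTickMillis = clock.getAsLong();
    }

    /** Convenience constructor using the wall clock. A monotonic clock may not
     *  advance while the host is suspended on some platforms (assumption), so
     *  wall-clock time is used here. */
    SuspendDetector(long expectedIntervalMillis, long gapFactor) {
        this(expectedIntervalMillis, gapFactor, System::currentTimeMillis);
    }

    /**
     * Call once per timer tick. Returns true when the gap since the previous
     * tick exceeds expectedIntervalMillis * gapFactor, i.e. the process cannot
     * have been ticking in between.
     */
    synchronized boolean tick() {
        long now = clock.getAsLong();
        long gap = now - lastTickMillis;
        lastTickMillis = now;
        return gap > expectedIntervalMillis * gapFactor;
    }
}
```

A liveness sender could call tick() at the start of each timer tick and, when it returns true, reset its countdown instead of reporting a failure. Whether that fits the existing LivenessSender logic is not confirmed here. Note that a wall-clock jump (for example an NTP step) would also trigger it.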