From: Steve L. (JIRA) <ji...@sm...> - 2008-10-27 17:17:41
|
[ http://jira.smartfrog.org/jira/browse/SFOS-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11339#action_11339 ] Steve Loughran commented on SFOS-1015: -------------------------------------- Full log {code} [sf-startdaemon-debug] SmartFrog 3.12.045dev (2008-10-22 15:24:08 BST) [sf-startdaemon-debug] (C) Copyright 1998-2008 Hewlett-Packard Development Company, LP [sf-startdaemon-debug] 2008/10/27 17:15:04:322 GMT [WARN ][Main Thread] SFCORE_LOG - SmartFrog security is NOT active [sf-startdaemon-debug] 2008/10/27 17:15:04:596 GMT [WARN ][Main Thread] SFCORE_LOG - Warning: stack trace logging enabled [sf-startdaemon-debug] SmartFrog ready... [rootProcess:3800] Mon Oct 27 17:15:05 GMT 2008 [sf-system-test-junit] Testsuite: org.smartfrog.services.hadoop.test.system.local.datanode.DataNodeTest [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO namenode.NameNode : State change: NameNode is now STARTED [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-7] INFO datanode.DataNode : State change: DataNode is now STARTED [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO metrics.RpcMetrics : Initializing RPC Metrics with hostName=ExtNameNode, port=8020 [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO namenode.NameNode : Namenode up at: localhost/127.0.0.1:8020 [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO jvm.JvmMetrics : Initializing JVM Metrics with processName=NameNode, sessionId=null [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO metrics.NameNodeMetrics : Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO namenode.FSNamesystem : fsOwner=slo,users,adm,dialout,fax,cdrom,floppy,tape,audio,dip,plugdev,scanner,admin,fuse [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO namenode.FSNamesystem : supergroup=supergroup [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO namenode.FSNamesystem : isPermissionEnabled=false [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO metrics.FSNamesystemMetrics : Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO namenode.FSNamesystem : Registered FSNamesystemStatusMBean [sf-startdaemon-debug] 08/10/27 17:15:10 [Thread-6] INFO common.Storage : Image file of size 93 saved in 0 seconds. [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO common.Storage : Storage directory /tmp/tempdir10739 has been successfully formatted. [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO common.Storage : Storage directory /tmp/hadoop/dfs/name has been successfully formatted. [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO common.Storage : Number of files = 0 [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO common.Storage : Number of files under construction = 0 [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO common.Storage : Image file of size 93 loaded in 0 seconds. [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO common.Storage : Edits file /tmp/hadoop/dfs/name/current/edits of size 4 edits # 0 loaded in 0 seconds. [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO common.Storage : Image file of size 93 saved in 0 seconds. [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO namenode.FSNamesystem : Finished loading FSImage in 212 msecs [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO hdfs.StateChange : STATE* Leaving safe mode after 0 secs. [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO hdfs.StateChange : STATE* Network topology has 0 racks and 0 datanodes [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO hdfs.StateChange : STATE* UnderReplicatedBlocks has 0 blocks [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO http.HttpServer : Version Jetty/5.1.4 [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO util.Container : Started HttpContext[/static,/static] [sf-startdaemon-debug] 08/10/27 17:15:11 [Thread-6] INFO util.Credential : Checking Resource aliases [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-6] INFO util.Container : Started org.mortbay.jetty.servlet.WebApplicationHandler@813dc8f [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-6] INFO util.Container : Started WebApplicationContext[/,/] [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-6] INFO http.SocketListener : Started SocketListener on 127.0.0.1:8021 [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-6] INFO util.Container : Started org.mortbay.jetty.Server@812883d [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-6] INFO namenode.FSNamesystem : Web-server up at: localhost:8021 [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server Responder] INFO ipc.Server : IPC Server Responder: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server listener on 8020] INFO ipc.Server : IPC Server listener on 8020: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server handler 0 on 8020] INFO ipc.Server : IPC Server handler 0 on 8020: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server handler 1 on 8020] INFO ipc.Server : IPC Server handler 1 on 8020: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server handler 2 on 8020] INFO ipc.Server : IPC Server handler 2 on 8020: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server handler 3 on 8020] INFO ipc.Server : IPC Server handler 3 on 8020: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO common.Storage : Storage directory /tmp/tempdir10740 is not formatted. [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO common.Storage : Formatting ... [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO datanode.DataNode : Registered FSDatasetStatusMBean [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO datanode.DataNode : Opened info server at 51267 [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO datanode.DataNode : Balancing bandwith is 1048576 bytes/s [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-6] INFO namenode.NameNode : State change: NameNode is now LIVE [sf-startdaemon-debug] 2008/10/27 17:15:12:251 GMT [INFO ][Thread-6] HOST morzine.hpl.hp.com:rootProcess:testLocalDataNode - NameNode deployment complete: service is: NameNode instance org.apache.hadoop.hdfs.server.namenode.ExtNameNode@831d1be in state LIVE [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO http.HttpServer : Version Jetty/5.1.4 [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO util.Container : Started HttpContext[/static,/static] [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO util.Container : Started org.mortbay.jetty.servlet.WebApplicationHandler@84eff1c [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO util.Container : Started WebApplicationContext[/,/] [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO http.SocketListener : Started SocketListener on 127.0.0.1:60049 [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO util.Container : Started org.mortbay.jetty.Server@84ef3f7 [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO jvm.JvmMetrics : Cannot initialize JVM Metrics with processName=DataNode, sessionId=null - already initialized [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO metrics.RpcMetrics : Initializing RPC Metrics with hostName=ExtDataNode, port=56249 [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server Responder] INFO ipc.Server : IPC Server Responder: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server listener on 56249] INFO ipc.Server : IPC Server listener on 56249: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server handler 0 on 56249] INFO ipc.Server : IPC Server handler 0 on 56249: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server handler 1 on 56249] INFO ipc.Server : IPC Server handler 1 on 56249: starting [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO datanode.DataNode : dnRegistration = DatanodeRegistration(morzine.hpl.hp.com:51267, storageID=, infoPort=60049, ipcPort=56249) [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server handler 0 on 8020] INFO hdfs.StateChange : BLOCK* NameSystem.registerDatanode: node registration from 127.0.0.1:51267 storage DS-145538088-127.0.1.1-51267-1225127712347 [sf-startdaemon-debug] 08/10/27 17:15:12 [IPC Server handler 0 on 8020] INFO net.NetworkTopology : Adding a new node: /default-rack/127.0.0.1:51267 [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO datanode.DataNode : New storage id DS-145538088-127.0.1.1-51267-1225127712347 is assigned to data-node 127.0.0.1:51267 [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-7] INFO datanode.DataNode : State change: DataNode is now LIVE [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-45] INFO datanode.DataNode : DatanodeRegistration(127.0.0.1:51267, storageID=DS-145538088-127.0.1.1-51267-1225127712347, infoPort=60049, ipcPort=56249)In DataNode.run, data = FSDataset{dirpath='/tmp/tempdir10740/current'} [sf-startdaemon-debug] 2008/10/27 17:15:12:418 GMT [INFO ][Thread-7] HOST morzine.hpl.hp.com:rootProcess:testLocalDataNode - DataNode deployment complete: service is: DataNode {data=FSDataset{dirpath='/tmp/tempdir10740/current'}, localName='127.0.0.1:51267', storageID='DS-145538088-127.0.1.1-51267-1225127712347', xmitsInProgress=0, state=LIVE} [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-45] INFO datanode.DataNode : using BLOCKREPORT_INTERVAL of 10000msec Initial delay: 0msec [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-45] INFO datanode.DataNode : BlockReport of 0 blocks got processed in 17 msecs [sf-startdaemon-debug] 08/10/27 17:15:12 [Thread-45] INFO datanode.DataNode : Starting Periodic block scanner. [sf-startdaemon-debug] 2008/10/27 17:15:12:690 GMT [INFO ][TerminatorThread] HOST morzine.hpl.hp.com:rootProcess:testLocalDataNode - Checking host:port localhost/127.0.0.1:8021 is open [sf-startdaemon-debug] 2008/10/27 17:15:12:691 GMT [INFO ][TerminatorThread] HOST morzine.hpl.hp.com:rootProcess:testLocalDataNode - Checking host:port /0.0.0.0:50010 is open [sf-startdaemon-debug] 08/10/27 17:15:19 [Thread-45] INFO datanode.DataNode : BlockReport of 0 blocks got processed in 0 msecs [sf-startdaemon-debug] 08/10/27 17:15:29 [Thread-45] INFO datanode.DataNode : BlockReport of 0 blocks got processed in 0 msecs [sf-system-test-junit] 08/10/27 17:15:32 [Main Thread] WARN test.SmartFrogTestBase : Expected port localhost:8020 to be closed, but it is still open after 20000 milliseconds [sf-startdaemon-debug] 2008/10/27 17:15:34:966 GMT [INFO ][RMI TCP Connection(3)-127.0.1.1] HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly - Checking host:port /127.0.0.1:8020 is open [sf-startdaemon-debug] 08/10/27 17:15:39 [Thread-45] INFO datanode.DataNode : BlockReport of 0 blocks got processed in 0 msecs [sf-startdaemon-debug] 2008/10/27 17:15:45:237 GMT [INFO ][TerminatorThread] HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly - Exit type mismatch -expected normal but got abnormal [sf-startdaemon-debug] 2008/10/27 17:15:45:240 GMT [ERROR][TerminatorThread] HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly - Expected action to terminate with the status normal [sf-startdaemon-debug] But got Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Timed out [sf-startdaemon-debug] 2008/10/27 17:15:45:240 GMT [ERROR][TerminatorThread] HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly - Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Expected action to terminate with the status normal [sf-startdaemon-debug] But got Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Timed out [sf-startdaemon-debug] 08/10/27 17:15:49 [Thread-45] INFO datanode.DataNode : BlockReport of 0 blocks got processed in 0 msecs [sf-startdaemon-debug] 08/10/27 17:15:59 [Thread-45] INFO datanode.DataNode : BlockReport of 0 blocks got processed in 0 msecs [sf-system-test-junit] 08/10/27 17:16:05 [Main Thread] WARN test.SmartFrogTestBase : Expected port localhost:8020 to be closed, but it is still open after 20000 milliseconds [sf-system-test-junit] (unknown) -TestCompletedEvent at Mon Oct 27 17:15:45 GMT 2008 alive: true [sf-system-test-junit] status: [sf-system-test-junit] Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Expected action to terminate with the status normal [sf-system-test-junit] But got Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Timed out [sf-system-test-junit] check that the cluster shut down [sf-system-test-junit] succeeded:false [sf-system-test-junit] forcedTimeout:false [sf-system-test-junit] skipped:false [sf-system-test-junit] ) [sf-system-test-junit] Tests run: 2, Failures: 1, Errors: 1, Time elapsed: 58.972 sec [sf-system-test-junit] ------------- Standard Error ----------------- [sf-system-test-junit] 08/10/27 17:15:32 [Main Thread] WARN test.SmartFrogTestBase : Expected port localhost:8020 to be closed, but it is still open after 20000 milliseconds [sf-system-test-junit] 08/10/27 17:16:05 [Main Thread] WARN test.SmartFrogTestBase : Expected port localhost:8020 to be closed, but it is still open after 20000 milliseconds [sf-system-test-junit] ------------- ---------------- --------------- [sf-system-test-junit] Testcase: testLocalDataNode took 26.362 sec [sf-system-test-junit] FAILED [sf-system-test-junit] Expected port localhost:8020 to be closed, but it is still open after 20000 milliseconds [sf-system-test-junit] junit.framework.AssertionFailedError: Expected port localhost:8020 to be closed, but it is still open after 20000 milliseconds [sf-system-test-junit] at org.smartfrog.services.hadoop.test.system.local.namenode.HadoopTestBase.blockUntilPortIsClosed(HadoopTestBase.java:77) [sf-system-test-junit] at org.smartfrog.services.hadoop.test.system.local.namenode.HadoopTestBase.tearDown(HadoopTestBase.java:59) [sf-system-test-junit] Testcase: testClusterShutDownCleanly took 32.448 sec [sf-system-test-junit] Caused an ERROR [sf-system-test-junit] Test failed [sf-system-test-junit] (unknown) -TestCompletedEvent at Mon Oct 27 17:15:45 GMT 2008 alive: true [sf-system-test-junit] status: [sf-system-test-junit] Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Expected action to terminate with the status normal [sf-system-test-junit] But got Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Timed out [sf-system-test-junit] check that the cluster shut down [sf-system-test-junit] succeeded:false [sf-system-test-junit] forcedTimeout:false [sf-system-test-junit] skipped:false [sf-system-test-junit] org.smartfrog.test.TerminationRecordException: Test failed [sf-system-test-junit] (unknown) -TestCompletedEvent at Mon Oct 27 17:15:45 GMT 2008 alive: true [sf-system-test-junit] status: [sf-system-test-junit] Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Expected action to terminate with the status normal [sf-system-test-junit] But got Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Timed out [sf-system-test-junit] check that the cluster shut down [sf-system-test-junit] succeeded:false [sf-system-test-junit] forcedTimeout:false [sf-system-test-junit] skipped:false [sf-system-test-junit] Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Expected action to terminate with the status normal [sf-system-test-junit] But got Termination Record: HOST morzine.hpl.hp.com:rootProcess:testClusterShutDownCleanly:action:else, type: abnormal, description: Timed out [sf-system-test-junit] at org.smartfrog.test.DeployingTestBase.completeTestDeployment(DeployingTestBase.java:316) [sf-system-test-junit] at org.smartfrog.test.DeployingTestBase.runTestsToCompletion(DeployingTestBase.java:339) [sf-system-test-junit] at org.smartfrog.test.DeployingTestBase.expectSuccessfulTestRunOrSkip(DeployingTestBase.java:440) [sf-system-test-junit] at org.smartfrog.services.hadoop.test.system.local.datanode.DataNodeTest.testClusterShutDownCleanly(DataNodeTest.java:44) [sf-system-test-junit] Test org.smartfrog.services.hadoop.test.system.local.datanode.DataNodeTest FAILED [junitreport] the file /home/slo/Projects/SmartFrog/Forge/core/components/hadoop/build/test/data/TEST-org.smartfrog.services.hadoop.test.system.local.tracker.JobTrackerTest.xml is empty. [junitreport] This can be caused by the test JVM exiting unexpectedly [junitreport] Processing /home/slo/Projects/SmartFrog/Forge/core/components/hadoop/build/test/data/TESTS-TestSuites.xml to /tmp/null1460085882 [junitreport] Loading stylesheet jar:file:/home/slo/Java/Apache/ant/lib/ant-junit.jar!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl [sf-stopdaemon] SmartFrog 3.12.045dev (2008-10-22 15:24:08 BST) [sf-stopdaemon] (C) Copyright 1998-2008 Hewlett-Packard Development Company, LP [sf-stopdaemon] 2008/10/27 17:16:07:091 GMT [WARN ][Main Thread] SFCORE_LOG - SmartFrog security is NOT active [sf-startdaemon-debug] SmartFrog [rootProcess] dead Mon Oct 27 17:16:07 GMT 2008 [sf-stopdaemon] SmartFrog daemon terminated [sf-stopdaemon] - Successfully terminated: 'null', host:localhost [junitreport] Transform time: 2811ms [junitreport] Deleting: /tmp/null1460085882 {code} > Some of the cluster tests aren't shutting down the hdfs > ------------------------------------------------------- > > Key: SFOS-1015 > URL: http://jira.smartfrog.org/jira/browse/SFOS-1015 > Project: SmartFrog > Issue Type: Bug > Components: _service_hadoop > Affects Versions: 3.13.x > Reporter: Steve Loughran > Assignee: Steve Loughran > Fix For: 3.13.x > > > Not all the clusters are shutting down. We have tests that catch this, but the problem is: why? -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://jira.smartfrog.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira |