From: Ruhua J. <ruh...@gm...> - 2015-01-20 17:14:56
Hello, I am trying to run Hadoop (1.2.1) on top of an HPC infrastructure using myHadoop (0.30). The HPC cluster uses SLURM. First I tried the word-count example in non-persist mode, using a script I modified from Dr. Lockwood's example code, and the results looked good. However, when I try to run in persist mode, there are problems. We are using GPFS. Here is the script:

#!/bin/bash
################################################################################
# slurm.sbatch - A sample submit script for SLURM that illustrates how to
#   spin up a Hadoop cluster for a map/reduce task using myHadoop
#
# Glenn K. Lockwood, San Diego Supercomputer Center              February 2014
################################################################################
#SBATCH -p Westmere
#SBATCH -n 4
#SBATCH --ntasks-per-node=1
#SBATCH -t 1:00:00

### If these aren't already in your environment (e.g., .bashrc), you must define
### them. We assume hadoop and myHadoop were installed in $HOME/hadoop-stack
export HADOOP_HOME=$HOME/hadoop-stack/hadoop-1.2.1
export PATH=$HADOOP_HOME/bin:$HOME/hadoop-stack/myhadoop-0.30/bin:$PATH
export JAVA_HOME=/usr
export HADOOP_CONF_DIR=$HOME/hadoop/conf/hadoop-conf.$SLURM_JOBID
export MH_SCRATCH_DIR=/tmp/$USER/$SLURM_JOBID
export MH_PERSIST_DIR=$HOME/hadoop/hdfs

myhadoop-configure.sh -s $MH_SCRATCH_DIR -p $MH_PERSIST_DIR

if [ ! -f ./pg2701.txt ]; then
    echo "*** Retrieving some sample input data"
    wget 'http://www.gutenberg.org/cache/epub/2701/pg2701.txt'
fi

##$HADOOP_HOME/bin/start-all.sh
$HADOOP_HOME/bin/hadoop namenode
$HADOOP_HOME/bin/hadoop datanode

$HADOOP_HOME/bin/hadoop dfs -mkdir data
$HADOOP_HOME/bin/hadoop dfs -put ./pg2701.txt data/
$HADOOP_HOME/bin/hadoop dfs -ls data
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-examples-*.jar wordcount data wordcount-output
$HADOOP_HOME/bin/hadoop dfs -ls wordcount-output
$HADOOP_HOME/bin/hadoop dfs -get wordcount-output ./

$HADOOP_HOME/bin/stop-all.sh
myhadoop-cleanup.sh
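A note on the two `hadoop namenode` / `hadoop datanode` lines above: my understanding is that they run the daemons in the foreground, so I am not sure they are the right way to bring the cluster up inside a batch script. A minimal sketch of the wrapper-based startup I believe the stock example intends, assuming $HADOOP_CONF_DIR already contains the masters and slaves files generated by myhadoop-configure.sh:

# Start all daemons in the background via the wrapper script, which reads
# the masters/slaves lists from $HADOOP_CONF_DIR and launches the
# namenode/jobtracker on the master and a datanode/tasktracker on each slave.
$HADOOP_HOME/bin/start-all.sh

# Block until HDFS has left safe mode before loading data; without this,
# the dfs -put below can race the datanodes as they come up.
$HADOOP_HOME/bin/hadoop dfsadmin -safemode wait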
Here is the log:

===
myHadoop: Using HADOOP_HOME=/home/hpc-ruhua/hadoop-stack/hadoop-1.2.1
myHadoop: Using MH_SCRATCH_DIR=/tmp/hpc-ruhua/4128
myHadoop: Using JAVA_HOME=/usr
myHadoop: Generating Hadoop configuration in directory in /home/hpc-ruhua/hadoop/conf/hadoop-conf.4128...
myHadoop: Using directory /home/hpc-ruhua/hadoop/hdfs for persisting HDFS state...
myHadoop: Designating cn53 as master node (namenode, secondary namenode, and jobtracker)
myHadoop: The following nodes will be slaves (datanode, tasktracer): cn53 cn54 cn55 cn56
Linking /home/hpc-ruhua/hadoop/hdfs/0 to /tmp/hpc-ruhua/4128/hdfs_data on cn53
Linking /home/hpc-ruhua/hadoop/hdfs/1 to /tmp/hpc-ruhua/4128/hdfs_data on cn54
Linking /home/hpc-ruhua/hadoop/hdfs/2 to /tmp/hpc-ruhua/4128/hdfs_data on cn55
Warning: Permanently added 'cn55,192.168.100.55' (RSA) to the list of known hosts.
Linking /home/hpc-ruhua/hadoop/hdfs/3 to /tmp/hpc-ruhua/4128/hdfs_data on cn56
Warning: Permanently added 'cn56,192.168.100.56' (RSA) to the list of known hosts.
starting namenode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-namenode-cn53.out
cn53: starting datanode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn53.out
cn54: starting datanode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn54.out
cn55: starting datanode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn55.out
cn56: starting datanode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn56.out
cn53: starting secondarynamenode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-secondarynamenode-cn53.out
starting jobtracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-jobtracker-cn53.out
cn53: starting tasktracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn53.out
cn56: starting tasktracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn56.out
cn55: starting tasktracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn55.out
cn54: starting tasktracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn54.out
mkdir: cannot create directory data: File exists
put: Target data/pg2701.txt already exists
Found 1 items
-rw-r--r--   3 hpc-ruhua supergroup          0 2015-01-07 00:09 /user/hpc-ruhua/data/pg2701.txt
15/01/14 12:21:08 ERROR security.UserGroupInformation: PriviledgedActionException as:hpc-ruhua cause:org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.JobTrackerNotYetInitializedException: JobTracker is not yet RUNNING
	at org.apache.hadoop.mapred.JobTracker.checkJobTrackerState(JobTracker.java:5199)
	at org.apache.hadoop.mapred.JobTracker.getNewJobId(JobTracker.java:3543)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.JobTrackerNotYetInitializedException: JobTracker is not yet RUNNING
	at org.apache.hadoop.mapred.JobTracker.checkJobTrackerState(JobTracker.java:5199)
	at org.apache.hadoop.mapred.JobTracker.getNewJobId(JobTracker.java:3543)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
	at org.apache.hadoop.ipc.Client.call(Client.java:1113)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
	at org.apache.hadoop.mapred.$Proxy2.getNewJobId(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
	at org.apache.hadoop.mapred.$Proxy2.getNewJobId(Unknown Source)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:944)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
	at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
ls: Cannot access wordcount-output: No such file or directory.
get: null
stopping jobtracker
cn54: stopping tasktracker
cn55: stopping tasktracker
cn53: stopping tasktracker
cn56: stopping tasktracker
stopping namenode
cn53: no datanode to stop
cn54: no datanode to stop
cn56: no datanode to stop
cn55: no datanode to stop
===

The key error is “ERROR security.UserGroupInformation: PriviledgedActionException as:hpc-ruhua cause:org.apache.hadoop.ipc.RemoteException: ...”; it says the JobTracker is not yet RUNNING. Any idea what causes this?

Thanks!

Best,
Ruhua Jiang
Graduate Student at University of Connecticut
HORNET Cluster Technical Support
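P.S. One workaround I am considering is to poll the JobTracker before submitting the job instead of submitting immediately after startup. A rough, untested sketch, assuming `hadoop job -list` exits non-zero for as long as the JobTracker is still initializing:

# Hypothetical wait loop: retry for up to ~150 seconds until the
# JobTracker answers an RPC, then fall through to the job submission.
for attempt in $(seq 1 30); do
    if $HADOOP_HOME/bin/hadoop job -list > /dev/null 2>&1; then
        echo "JobTracker is up after $attempt attempt(s)"
        break
    fi
    echo "JobTracker not ready yet (attempt $attempt); sleeping 5s..."
    sleep 5
done

Would something like this be reasonable, or is there a supported way to block until the JobTracker is RUNNING?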