Lowercase example failing

Help
2012-12-04
2012-12-07
  • Hi,

    I am having problems running the Lowercase example on my cluster. First of all, the wordcount example provided with the Cloudera distribution works fine, and the lowercase.py script I am using is the one from the Pydoop documentation.

    I wrote a text file with some sentences, copied it to HDFS and ran:

    pydoop script --num-reducers 0 -t '' lower.py /user/vagrant/examples/lower /user/vagrant/examples/lower_out
    

    And the result is that, after a couple of minutes, it fails with this error:

    (master)vagrant@namenode:~$ pydoop script --num-reducers 0 -t '' lower.py /user/vagrant/examples/lower /user/vagrant/examples/lower_out
    Traceback (most recent call last):
      File "/home/vagrant/.virtualenvs/master/bin/pydoop", line 30, in <module>
        main(sys.argv[1:])
      File "/home/vagrant/.virtualenvs/master/local/lib/python2.7/site-packages/pydoop/app/main.py", line 46, in main
        args.func(args)
      File "/home/vagrant/.virtualenvs/master/local/lib/python2.7/site-packages/pydoop/app/script.py", line 311, in run
        script.run()
      File "/home/vagrant/.virtualenvs/master/local/lib/python2.7/site-packages/pydoop/app/script.py", line 301, in run
        more_args=pipes_args, properties=self.properties, logger=self.logger
      File "/home/vagrant/.virtualenvs/master/local/lib/python2.7/site-packages/pydoop/hadut.py", line 309, in run_pipes
        logger=logger)
      File "/home/vagrant/.virtualenvs/master/local/lib/python2.7/site-packages/pydoop/hadut.py", line 127, in run_cmd
        raise RuntimeError(error)
    RuntimeError: DEPRECATED: Use of this script to execute mapred command is deprecated.
    Instead use the mapred command for it.
    12/12/04 09:54:42 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    12/12/04 09:54:42 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
    12/12/04 09:54:42 INFO mapred.FileInputFormat: Total input paths to process : 1
    12/12/04 09:54:49 INFO mapred.JobClient: Running job: job_201212040948_0005
    12/12/04 09:54:50 INFO mapred.JobClient:  map 0% reduce 0%
    12/12/04 10:05:30 INFO mapred.JobClient: Task Id : attempt_201212040948_0005_m_000001_0, Status : FAILED
    Task attempt_201212040948_0005_m_000001_0 failed to report status for 600 seconds. Killing!
    attempt_201212040948_0005_m_000001_0: /data/2/mapred/local/taskTracker/distcache/-6369693126454223979_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: /home/vagrant/.virtualenvs/master/bin/python: Permission denied
    attempt_201212040948_0005_m_000001_0: /data/2/mapred/local/taskTracker/distcache/-6369693126454223979_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: exec: /home/vagrant/.virtualenvs/master/bin/python: cannot execute: Permission denied
    12/12/04 10:05:34 INFO mapred.JobClient: Task Id : attempt_201212040948_0005_m_000000_0, Status : FAILED
    Task attempt_201212040948_0005_m_000000_0 failed to report status for 600 seconds. Killing!
    attempt_201212040948_0005_m_000000_0: /tmp/hadoop-mapred/mapred/local/taskTracker/distcache/-1279783825450710633_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: /home/vagrant/.virtualenvs/master/bin/python: Permission denied
    attempt_201212040948_0005_m_000000_0: /tmp/hadoop-mapred/mapred/local/taskTracker/distcache/-1279783825450710633_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: exec: /home/vagrant/.virtualenvs/master/bin/python: cannot execute: Permission denied
    12/12/04 10:15:49 INFO mapred.JobClient: Task Id : attempt_201212040948_0005_m_000001_1, Status : FAILED
    Task attempt_201212040948_0005_m_000001_1 failed to report status for 600 seconds. Killing!
    attempt_201212040948_0005_m_000001_1: /data/2/mapred/local/taskTracker/distcache/1299361856892079245_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: /home/vagrant/.virtualenvs/master/bin/python: Permission denied
    attempt_201212040948_0005_m_000001_1: /data/2/mapred/local/taskTracker/distcache/1299361856892079245_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: exec: /home/vagrant/.virtualenvs/master/bin/python: cannot execute: Permission denied
    12/12/04 10:15:51 INFO mapred.JobClient: Task Id : attempt_201212040948_0005_m_000000_1, Status : FAILED
    Task attempt_201212040948_0005_m_000000_1 failed to report status for 600 seconds. Killing!
    attempt_201212040948_0005_m_000000_1: /data/2/mapred/local/taskTracker/distcache/2241308001851859069_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: /home/vagrant/.virtualenvs/master/bin/python: Permission denied
    attempt_201212040948_0005_m_000000_1: /data/2/mapred/local/taskTracker/distcache/2241308001851859069_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: exec: /home/vagrant/.virtualenvs/master/bin/python: cannot execute: Permission denied
    12/12/04 10:26:06 INFO mapred.JobClient: Task Id : attempt_201212040948_0005_m_000000_2, Status : FAILED
    Task attempt_201212040948_0005_m_000000_2 failed to report status for 600 seconds. Killing!
    attempt_201212040948_0005_m_000000_2: /data/2/mapred/local/taskTracker/distcache/1299361856892079245_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: /home/vagrant/.virtualenvs/master/bin/python: Permission denied
    attempt_201212040948_0005_m_000000_2: /data/2/mapred/local/taskTracker/distcache/1299361856892079245_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: exec: /home/vagrant/.virtualenvs/master/bin/python: cannot execute: Permission denied
    12/12/04 10:26:19 INFO mapred.JobClient: Task Id : attempt_201212040948_0005_m_000001_2, Status : FAILED
    Task attempt_201212040948_0005_m_000001_2 failed to report status for 600 seconds. Killing!
    attempt_201212040948_0005_m_000001_2: /tmp/hadoop-mapred/mapred/local/taskTracker/distcache/-1279783825450710633_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: /home/vagrant/.virtualenvs/master/bin/python: Permission denied
    attempt_201212040948_0005_m_000001_2: /tmp/hadoop-mapred/mapred/local/taskTracker/distcache/-1279783825450710633_638392187_1700177199/namenode/user/vagrant/examples/pydoop_script_95f1687ec2594290855d508029d83276/exeef84dfe5ac2345a288fcba003ba59882: line 5: exec: /home/vagrant/.virtualenvs/master/bin/python: cannot execute: Permission denied
    12/12/04 10:36:29 INFO mapred.JobClient: Job complete: job_201212040948_0005
    12/12/04 10:36:30 INFO mapred.JobClient: Counters: 8
    12/12/04 10:36:30 INFO mapred.JobClient:   Job Counters 
    12/12/04 10:36:30 INFO mapred.JobClient:     Failed map tasks=1
    12/12/04 10:36:30 INFO mapred.JobClient:     Launched map tasks=8
    12/12/04 10:36:30 INFO mapred.JobClient:     Data-local map tasks=4
    12/12/04 10:36:30 INFO mapred.JobClient:     Rack-local map tasks=4
    12/12/04 10:36:30 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=674253
    12/12/04 10:36:30 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
    12/12/04 10:36:30 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
    12/12/04 10:36:30 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
    12/12/04 10:36:30 INFO mapred.JobClient: Job Failed: NA
    Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1323)
        at org.apache.hadoop.mapred.pipes.Submitter.runJob(Submitter.java:248)
        at org.apache.hadoop.mapred.pipes.Submitter.run(Submitter.java:479)
        at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:494)
    

    I am running Python inside a virtual environment called "master" (as you can see in the command-line prompts), where I installed Pydoop. I checked the permissions and they look right:

    (master)vagrant@namenode:~/.virtualenvs/master/bin$ ls -l
    total 3340
    ...
    -rwxrwxr-x 1 vagrant vagrant 2989480 Nov 28 10:00 python
    lrwxrwxrwx 1 vagrant vagrant       6 Nov 28 10:00 python2 -> python
    lrwxrwxrwx 1 vagrant vagrant       6 Nov 28 10:00 python2.7 -> python
    ...
    

    So I really don't know what is happening here. In case it is useful, I have posted the last log traces of my namenode and one of the datanodes here and here (respectively). If I can provide any more information, please tell me and I will.

    Thank you very much for your help!

     
  • Simone Leo
    2012-12-04

    Hello,

    Have you checked that all components of the /home/vagrant/.virtualenvs/master/bin/python path are accessible by the mapred user? Having execute permissions on the python executable is of no use if you can't reach it.
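
    A quick way to check this is to walk the path from the root down and look at the permission bits of each component. The following is only a minimal sketch: the helper names `ancestors` and `blocked_components` are made up for illustration, and it assumes the mapred user is neither the owner nor in the group of these directories, so that only the "other" execute bit matters.

```python
import os
import stat


def ancestors(path):
    """Return every component of an absolute path, from the root down."""
    path = os.path.abspath(path)
    parts = []
    while True:
        parts.append(path)
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return list(reversed(parts))


def blocked_components(path):
    """List components that lack the execute bit for 'other' users.

    Assumption: the user of interest (e.g. mapred) is neither owner nor
    group member, so S_IXOTH decides whether a directory can be traversed.
    Components that are missing or cannot be stat'ed are reported too.
    """
    blocked = []
    for component in ancestors(path):
        try:
            mode = os.stat(component).st_mode
        except OSError:
            blocked.append(component)
            continue
        if not mode & stat.S_IXOTH:
            blocked.append(component)
    return blocked


if __name__ == "__main__":
    target = "/home/vagrant/.virtualenvs/master/bin/python"
    for component in blocked_components(target):
        print("not reachable by other users: %s" % component)
```

    Any directory it prints would stop another user from reaching the interpreter, even when the python binary itself is executable.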

     
  • Hi!

    Ehm… I feel so silly… that was the problem: the mapred user had permissions up to /home/vagrant/.virtualenvs/, but no further…

    Thank you very much, and I'm sorry for asking this.