
CDH4 No FileSystem for scheme: hdfs

Liam, 2012-09-25 (last updated 2012-12-07)
  • Liam - 2012-09-25

    I've been trying to get pydoop 0.6.4 to work with CDH4.  After fixing the setup.py I was able to get it to compile and install cleanly; however, I am unable to get HDFS to connect correctly.  This is running on CentOS 6.3.

    -bash-4.1$ python
    Python 2.7.3 (default, Sep 20 2012, 22:44:26)
    on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pydoop.hdfs as pyhdfs
    >>> fs = pyhdfs.fs.hdfs()
    12/09/25 11:35:30 ERROR security.UserGroupInformation: PriviledgedActionException as:prdps (auth:SIMPLE) cause:java.io.IOException: No FileSystem for scheme: hdfs
    Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2138)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2145)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2184)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2166)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:302)
    at org.apache.hadoop.fs.FileSystem$1.run(FileSystem.java:148)
    at org.apache.hadoop.fs.FileSystem$1.run(FileSystem.java:146)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:146)
    Call to org.apache.hadoop.fs.Filesystem::get(URI, Configuration) failed!
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/py273/lib/python2.7/site-packages/pydoop/hdfs/fs.py", line 119, in __init__
        h, p, u, fs = _get_connection_info(host, port, user)
      File "/usr/local/py273/lib/python2.7/site-packages/pydoop/hdfs/fs.py", line 59, in _get_connection_info
        fs = hdfs_ext.hdfs_fs(host, port, user)
    IOError: Cannot connect to default
    >>>

    I get the same error when I pass in an explicit hostname/port.  However, the command-line hadoop commands work fine…

    -bash-4.1$ hadoop fs -ls hdfs://myhost/
    Found 2 items
    drwxr-xr-x   - hdfs supergroup          0 2012-09-21 11:30 hdfs://myhost/system
    drwxrwxrwt   - hdfs supergroup          0 2012-06-07 17:56 hdfs://myhost/tmp
    -bash-4.1$
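
    One more data point, in case it helps with diagnosis: as far as I know, libhdfs builds its embedded JVM's classpath from the CLASSPATH environment variable, which pydoop populates at import time (at least, that's my understanding of 0.6.4).  Here's a quick stdlib-only check to see whether any hadoop-hdfs jar actually made it on; nothing pydoop-specific is assumed beyond the import itself:

    import os
    import pydoop.hdfs  # should trigger pydoop's classpath setup

    # libhdfs reads CLASSPATH when it starts its JVM; "No FileSystem for
    # scheme: hdfs" usually means no jar on it provides the hdfs:// scheme.
    for entry in os.environ.get("CLASSPATH", "").split(os.pathsep):
        if "hdfs" in entry:
            print(entry)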

    Ideas?

  • Liam - 2012-09-25

    I've figured this out.  Turns out when pydoop was setting up its CLASSPATH correctly, it was not including hadoop-hdfs-2.0.0-cdh4.0.1.jar, which lives in /usr/lib/hadoop/client rather than /usr/lib/hadoop.  So the fix was to symlink /usr/lib/hadoop/client/hadoop-hdfs-2.0.0-cdh4.0.1.jar into /usr/lib/hadoop.
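
    For anyone hitting the same thing, here's the gist as a small script.  The paths are from my CDH 4.0.1 box, so treat the exact jar version as an example rather than gospel:

    import os

    # The hdfs:// FileSystem implementation lives in the hadoop-hdfs jar, which
    # CDH4 keeps under /usr/lib/hadoop/client; pydoop 0.6.4 (on my box, anyway)
    # only picks up jars from /usr/lib/hadoop itself.
    src = "/usr/lib/hadoop/client/hadoop-hdfs-2.0.0-cdh4.0.1.jar"
    dst = "/usr/lib/hadoop/hadoop-hdfs-2.0.0-cdh4.0.1.jar"
    if not os.path.lexists(dst):
        os.symlink(src, dst)  # needs root; equivalent to: ln -s <src> <dst>

    # After that, the snippet from my first post connects cleanly:
    import pydoop.hdfs as pyhdfs
    fs = pyhdfs.fs.hdfs()  # no more "No FileSystem for scheme: hdfs"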

  • Liam - 2012-09-25

    Err, I should read what I type before hitting send.  pydoop was *not* setting up its CLASSPATH correctly…  :)

