The org.python.proxies package contains the bytecode generated at runtime. Is Hadoop running in the same JVM as your Jython program? If it is not, this is probably a problem with instances being serialized but not their classes. It is an interesting problem, and one that I think we should work on, but I don't believe we handle it very well (if at all) at the moment.
If Hadoop is running in the same JVM it should work, so in that case I would guess it is some other weird classloader issue. That should be quite simple to fix, though.
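
To illustrate what "instances being serialized but not classes" means, here is a rough analogy in plain Python using pickle. This is only a sketch of the general idea, not Jython or Hadoop code: the module name runtime_proxies_demo and the helper functions are made up for the example, and the way Jython actually generates and loads proxy classes is different. The point is that the serialized data carries the name of the class, so a process that never generated the class cannot rebuild the instance.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a package whose classes exist only in memory.
MODULE_NAME = "runtime_proxies_demo"


def define_runtime_class():
    """Create a class at runtime inside a module that exists only in memory."""
    module = types.ModuleType(MODULE_NAME)
    cls = type("WordCountMap", (object,), {"__module__": MODULE_NAME})
    module.WordCountMap = cls
    sys.modules[MODULE_NAME] = module
    return cls


def can_load(payload):
    """Return True if the serialized instance can be rebuilt in this process."""
    try:
        pickle.loads(payload)
        return True
    except ImportError:
        return False


def demo():
    cls = define_runtime_class()
    # The payload records the class by name; the class body is not included.
    payload = pickle.dumps(cls())
    same_process = can_load(payload)
    # Simulate a second process that never generated the class.
    del sys.modules[MODULE_NAME]
    other_process = can_load(payload)
    return same_process, other_process


if __name__ == "__main__":
    print(demo())
```

Loading succeeds while the generated class is still registered and fails once it is gone, which is roughly the situation a lookup by class name is in when the class was only ever created inside another runtime.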


On Mon, Sep 29, 2008 at 5:28 PM, John Thompson wrote:

I'm just trying to run the basic word count jython program that's distributed with Hadoop.  I've included a slightly simplified version below:

from org.apache.hadoop.fs import Path
from import *
from org.apache.hadoop.mapred import *

import sys
import getopt

class WordCountMap(Mapper, MapReduceBase):
    one = IntWritable(1)
    def map(self, key, value, output, reporter):
        for w in value.toString().split():

class Summer(Reducer, MapReduceBase):
    def reduce(self, key, values, output, reporter):
        sum = 0
        while values.hasNext():
            sum +=
        output.collect(key, IntWritable(sum))

def main(args):
    conf = JobConf(WordCountMap);



if __name__ == "__main__":

When I run "jython" Hadoop apparently has trouble finding my Python classes.  Sorry about the length, but here's the full output:

08/09/29 08:23:22 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
08/09/29 08:23:22 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
08/09/29 08:23:22 INFO mapred.FileInputFormat: Total input paths to process : 1
08/09/29 08:23:23 INFO mapred.JobClient: Running job: job_local_1
08/09/29 08:23:23 INFO mapred.MapTask: numReduceTasks: 1
08/09/29 08:23:23 WARN mapred.LocalJobRunner: job_local_1
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.python.proxies.__main__$WordCountMap$0
    at org.apache.hadoop.conf.Configuration.getClass(
    at org.apache.hadoop.mapred.JobConf.getMapperClass(
    at org.apache.hadoop.mapred.MapRunner.configure(
    at org.apache.hadoop.util.ReflectionUtils.setConf(
    at org.apache.hadoop.util.ReflectionUtils.newInstance(
    at org.apache.hadoop.mapred.LocalJobRunner$
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.python.proxies.__main__$WordCountMap$0
    at org.apache.hadoop.conf.Configuration.getClass(
    at org.apache.hadoop.conf.Configuration.getClass(
    ... 6 more
Caused by: java.lang.ClassNotFoundException: org.python.proxies.__main__$WordCountMap$0
    at Method)
    at java.lang.ClassLoader.loadClass(
    at sun.misc.Launcher$AppClassLoader.loadClass(
    at java.lang.ClassLoader.loadClass(
    at java.lang.ClassLoader.loadClassInternal(
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(
    at org.apache.hadoop.conf.Configuration.getClassByName(
    at org.apache.hadoop.conf.Configuration.getClass(
    ... 7 more
Traceback (innermost last):
  File "", line 54, in ?
  File "", line 51, in main
    at org.apache.hadoop.mapred.JobClient.runJob(
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke( Job failed!

Has anyone seen this type of problem before?  And do you know how I can tell Hadoop about the existence of the classes in org.python.proxies.__main__?


Jython-users mailing list