The org.python.proxies package contains the bytecode that Jython generates at runtime. Is Hadoop running in the same JVM as your Jython program? If it is not, this is probably a problem with instances being serialized but not their classes. It is an interesting problem, and one that I think we should work on, but I don't believe we handle it very well (if at all) at the moment.
If Hadoop is running in the same JVM, it should work, so I would guess it is some other weird classloader issue. That should be quite simple to fix, though.
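
To show what I mean by instances being serialized but not classes, here is a rough sketch in plain CPython using pickle. It is only an analogy, not what Jython or Hadoop actually do: the module name generated_proxies and both helper functions are made up for the illustration, and removing the module from sys.modules merely simulates a second process that never generated the class.

```python
import pickle
import sys
import types

# Hypothetical module name; it stands in for a package of runtime-generated classes.
MODULE_NAME = "generated_proxies"


def generate_class(class_name):
    """Create a class at runtime and register it in a synthetic, in-memory module."""
    module = sys.modules.setdefault(MODULE_NAME, types.ModuleType(MODULE_NAME))
    cls = type(class_name, (object,), {"__module__": MODULE_NAME})
    setattr(module, class_name, cls)
    return cls


def load_in_fresh_process(payload):
    """Unpickle the way a process that never generated the class would."""
    hidden = sys.modules.pop(MODULE_NAME, None)
    try:
        return pickle.loads(payload)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        return exc
    finally:
        # Put the module back so the "sending" side keeps working.
        if hidden is not None:
            sys.modules[MODULE_NAME] = hidden


mapper_class = generate_class("WordCountMap")

# The pickle records the module and class *names*, not the class definition.
payload = pickle.dumps(mapper_class())

same_process = pickle.loads(payload)            # the class is still registered here
other_process = load_in_fresh_process(payload)  # the class cannot be located here

print(type(same_process).__name__)
print(type(other_process).__name__)
```

The first load succeeds because the generated class is still registered in the process that created it. The second fails with an import error because only the name travelled with the instance, which is roughly the situation a ClassNotFoundException describes on the Java side.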


I'm just trying to run the basic word count Jython program that's distributed with Hadoop. I've included a slightly simplified version below:

from org.apache.hadoop.fs import Path
from org.apache.hadoop.io import *
from org.apache.hadoop.mapred import *

import sys
import getopt

class WordCountMap(Mapper, MapReduceBase):
    one = IntWritable(1)
    def map(self, key, value, output, reporter):
        for w in value.toString().split():
            output.collect(Text(w), self.one)

class Summer(Reducer, MapReduceBase):
    def reduce(self, key, values, output, reporter):
        sum = 0
        while values.hasNext():
            sum += values.next().get()
        output.collect(key, IntWritable(sum))

def main(args):
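    conf = JobConf(WordCountMap)
    conf.setMapperClass(WordCountMap)
    conf.setReducerClass(Summer)
    # Remaining job setup (input and output paths, key/value types) omitted.
    JobClient.runJob(conf)

if __name__ == "__main__":
    main(sys.argv)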
When I run "jython" Hadoop apparently has trouble finding my Python classes.  Sorry about the length, but here's the full output:

08/09/29 08:23:22 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
08/09/29 08:23:22 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
08/09/29 08:23:22 INFO mapred.FileInputFormat: Total input paths to process : 1
08/09/29 08:23:23 INFO mapred.JobClient: Running job: job_local_1
08/09/29 08:23:23 INFO mapred.MapTask: numReduceTasks: 1
08/09/29 08:23:23 WARN mapred.LocalJobRunner: job_local_1
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.python.proxies.__main__$WordCountMap$0
    at org.apache.hadoop.conf.Configuration.getClass(
    at org.apache.hadoop.mapred.JobConf.getMapperClass(
    at org.apache.hadoop.mapred.MapRunner.configure(
    at org.apache.hadoop.util.ReflectionUtils.setConf(
    at org.apache.hadoop.util.ReflectionUtils.newInstance(
    at org.apache.hadoop.mapred.LocalJobRunner$
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.python.proxies.__main__$WordCountMap$0
    at org.apache.hadoop.conf.Configuration.getClass(
    at org.apache.hadoop.conf.Configuration.getClass(
    ... 6 more
Caused by: java.lang.ClassNotFoundException: org.python.proxies.__main__$WordCountMap$0
    at Method)
    at java.lang.ClassLoader.loadClass(
    at sun.misc.Launcher$AppClassLoader.loadClass(
    at java.lang.ClassLoader.loadClass(
    at java.lang.ClassLoader.loadClassInternal(
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(
    at org.apache.hadoop.conf.Configuration.getClassByName(
    at org.apache.hadoop.conf.Configuration.getClass(
    ... 7 more
Traceback (innermost last):
  File "", line 54, in ?
  File "", line 51, in main
    at org.apache.hadoop.mapred.JobClient.runJob(
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke( Job failed!

Has anyone seen this type of problem before?  And do you know how I can tell Hadoop about the existence of the classes in org.python.proxies.__main__?


