From: <bi...@us...> - 2010-01-12 22:17:50
Revision: 2943
          http://archive-access.svn.sourceforge.net/archive-access/?rev=2943&view=rev
Author:   binzino
Date:     2010-01-12 22:17:44 +0000 (Tue, 12 Jan 2010)

Log Message:
-----------
WAX-69. Comment out code that writes crawl_data.

Modified Paths:
--------------
    trunk/archive-access/projects/nutchwax/archive/src/java/org/archive/nutchwax/Importer.java

Modified: trunk/archive-access/projects/nutchwax/archive/src/java/org/archive/nutchwax/Importer.java
===================================================================
--- trunk/archive-access/projects/nutchwax/archive/src/java/org/archive/nutchwax/Importer.java	2010-01-11 21:46:57 UTC (rev 2942)
+++ trunk/archive-access/projects/nutchwax/archive/src/java/org/archive/nutchwax/Importer.java	2010-01-12 22:17:44 UTC (rev 2943)
@@ -467,7 +467,14 @@
     try
       {
-        output.collect( key, new NutchWritable( datum ) );
+        // Some weird problem with Hadoop 0.19.x - when the crawl_data
+        // is merged during the reduce step, the classloader cannot
+        // find the org.apache.nutch.protocol.ProtocolStatus class.
+        //
+        // We avoid the whole issue by omitting the crawl_data all
+        // together, which we don't use anyways.
+        //
+        // output.collect( key, new NutchWritable( datum ) );
         if ( jobConf.getBoolean( "nutchwax.import.store.content", false ) )
           {
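For reference, a minimal sketch of how the omitted write could instead be gated behind a
configuration flag, mirroring the existing "nutchwax.import.store.content" check. This is
not part of the committed change: the property name "nutchwax.import.store.crawldata", the
helper class, and its method are hypothetical, and it assumes the Importer's map output
types are Text / NutchWritable.

    import java.io.IOException;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.nutch.crawl.CrawlDatum;
    import org.apache.nutch.crawl.NutchWritable;

    // Hypothetical helper: emit the crawl datum only when explicitly enabled,
    // so the Hadoop 0.19.x classloader problem is avoided by default.
    public class CrawlDatumOutput
    {
      public static void maybeCollect( JobConf jobConf,
                                       OutputCollector<Text, NutchWritable> output,
                                       Text key,
                                       CrawlDatum datum )
        throws IOException
      {
        // Assumed property name; defaults to false, matching the intent of
        // this commit (no crawl_data written unless the operator asks for it).
        if ( jobConf.getBoolean( "nutchwax.import.store.crawldata", false ) )
          {
            output.collect( key, new NutchWritable( datum ) );
          }
      }
    }

Under that sketch, the try block in Importer.java would call
CrawlDatumOutput.maybeCollect( jobConf, output, key, datum ) instead of carrying the
commented-out collect() call, keeping the workaround while leaving the behavior
switchable from configuration.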