From: <bi...@us...> - 2010-01-12 22:17:50
Revision: 2943
http://archive-access.svn.sourceforge.net/archive-access/?rev=2943&view=rev
Author: binzino
Date: 2010-01-12 22:17:44 +0000 (Tue, 12 Jan 2010)
Log Message:
-----------
WAX-69. Comment out code that writes crawl_data.
Modified Paths:
--------------
trunk/archive-access/projects/nutchwax/archive/src/java/org/archive/nutchwax/Importer.java
Modified: trunk/archive-access/projects/nutchwax/archive/src/java/org/archive/nutchwax/Importer.java
===================================================================
--- trunk/archive-access/projects/nutchwax/archive/src/java/org/archive/nutchwax/Importer.java 2010-01-11 21:46:57 UTC (rev 2942)
+++ trunk/archive-access/projects/nutchwax/archive/src/java/org/archive/nutchwax/Importer.java 2010-01-12 22:17:44 UTC (rev 2943)
@@ -467,7 +467,14 @@
try
{
- output.collect( key, new NutchWritable( datum ) );
+ // Some weird problem with Hadoop 0.19.x - when the crawl_data
+ // is merged during the reduce step, the classloader cannot
+ // find the org.apache.nutch.protocol.ProtocolStatus class.
+ //
+ // We avoid the whole issue by omitting the crawl_data
+ // altogether, since we don't use it anyway.
+ //
+ // output.collect( key, new NutchWritable( datum ) );
if ( jobConf.getBoolean( "nutchwax.import.store.content", false ) )
{
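For reference only (not part of this commit), here is a minimal sketch of how the same omission could be made configurable rather than hard-coded, mirroring the existing nutchwax.import.store.content check already present in Importer.java. The property name "nutchwax.import.store.crawl_data", the class name, and the helper method below are hypothetical illustrations, not NutchWax API.

// Sketch only -- not the committed code. Emits the CrawlDatum only when a
// hypothetical "nutchwax.import.store.crawl_data" property is enabled,
// defaulting to false so the crawl_data is omitted as in this commit.
import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.nutch.crawl.CrawlDatum;
import org.apache.nutch.crawl.NutchWritable;

public class CrawlDatumEmitSketch
{
  public static void maybeCollectDatum( JobConf jobConf,
                                        OutputCollector<Text,NutchWritable> output,
                                        Text key,
                                        CrawlDatum datum )
    throws IOException
  {
    // Default of "false" matches this commit's behavior of skipping crawl_data.
    if ( jobConf.getBoolean( "nutchwax.import.store.crawl_data", false ) )
      {
        output.collect( key, new NutchWritable( datum ) );
      }
  }
}

Gating the emit behind a configuration flag, rather than commenting the line out, would make it straightforward to re-enable crawl_data later if the Hadoop 0.19.x classloader problem with org.apache.nutch.protocol.ProtocolStatus is resolved.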