From: Michael S. <st...@ar...> - 2006-06-14 15:50:45
Tell us more, Natalia. It looks like you are doing the right thing, but
the exception below is odd: we're not finding items on the CLASSPATH. Which
Hadoop version are you using? I just tried nutchwax-0.6.1 locally and didn't
get the exception below. Have you changed any Hadoop configuration, or is it
all set to defaults? Which operating system and which JVM?
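
One way to collect the details asked for above is a small standalone check, sketched here under assumptions. `ClasspathCheck` and `isLoadable` are hypothetical names and are not part of Hadoop, Nutch, or NutchWAX. The sketch only tests plain class loading and prints the JVM, OS, and classpath; it is an assumption that this is the relevant failure, since the exception may instead come from a plugin lookup that a class-loading test does not cover. It is only informative when run with the same classpath the job uses.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be located by the class loader
     * that loaded this class. The class is not initialized.
     */
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not visible on this classpath, or present but not linkable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Environment details: JVM, operating system, effective classpath.
        System.out.println("java.version    = " + System.getProperty("java.version"));
        System.out.println("java.vendor     = " + System.getProperty("java.vendor"));
        System.out.println("os.name         = " + System.getProperty("os.name"));
        System.out.println("os.version      = " + System.getProperty("os.version"));
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));

        // Default to the class named in the exception; another name may be
        // passed as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.nutch.net.URLFilter";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

If the class shows as loadable and the error persists, the cause is probably in configuration rather than a missing jar.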
St.Ack
Natalia Torres wrote:
> Hi
> This is my first experience with NutchWAX after crawling with Heritrix.
>
> I've installed all the software (hadoop, nutch ...) to run nutchwax+wera
> with the crawled jobs.
>
> When I run all of the indexing steps in one go by passing the 'all'
> directive to NutchWAX using this command
>
> % ${HADOOP_HOME}/bin/hadoop jar ${NUTCHWAX_HOME}/nutchwax-0.6.1.jar
> all /tmp/inputs /tmp/outputs test
>
> I get this error
>
> java.lang.RuntimeException: org.apache.nutch.net.URLFilter not found.
> at org.apache.nutch.net.URLFilters.<init>(URLFilters.java:47)
> at org.apache.nutch.parse.ParseOutputFormat.getRecordWriter(ParseOutputFormat.java:47)
> at org.apache.nutch.fetcher.FetcherOutputFormat$1.<init>(FetcherOutputFormat.java:69)
> at org.apache.nutch.fetcher.FetcherOutputFormat.getRecordWriter(FetcherOutputFormat.java:58)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:265)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:124)
>
> Can anyone tell me what the problem is?
>
> Thanks
>
> Natalia
>
>
> _______________________________________________
> Archive-access-discuss mailing list
> Arc...@li...
> https://lists.sourceforge.net/lists/listinfo/archive-access-discuss
>