From: roger p <ro...@ho...> - 2006-08-27 23:16:20
Can anyone tell me how to overcome the following problem?
I use:
Fedora Core 5
Sun Java 1.5
and I have followed the instructions in the manual, but I still get
an error when retrieving the records from the ARC files. The following error
occurs:
:WARN: /nutchwax/opensearch:
java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:204)
at org.apache.nutch.searcher.NutchBean.getSummary(NutchBean.java:346)
at org.archive.access.nutch.NutchwaxBean.getSummary(NutchwaxBean.java:53)
at org.apache.nutch.searcher.OpenSearchServlet.doGet(OpenSearchServlet.java:155)
at org.archive.access.nutch.NutchwaxOpenSearchServlet.doGet(NutchwaxOpenSearchServlet.java:69)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:442)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:357)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:226)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:615)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:150)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:123)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:141)
at org.mortbay.jetty.Server.handle(Server.java:272)
at org.mortbay.jetty.HttpConnection.handlerRequest(HttpConnection.java:404)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:650)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:488)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:198)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:319)
at org.mortbay.jetty.nio.HttpChannelEndPoint.run(HttpChannelEndPoint.java:270)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:475)
Just before this, the log shows that the search itself found 14 raw hits for the search term:
2006-08-27 17:11:30,038 INFO NutchBean - found 14 raw hits
2006-08-27 17:11:30,039 INFO NutchBean - total hits: 1829
So the problem does not seem to be in searching the indexes. There may
be a problem with accessing the ARCs to retrieve the specific page. I set
the correct path (searcher.dir) in both nutch-site.xml and
hadoop-site.xml.
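
One way to rule out the directory layout is to check what actually sits under the searcher.dir path. The following is only a minimal sketch, not part of Nutch or NutchWAX: the class name SearcherDirCheck is hypothetical, and it assumes that the directory lives on the local filesystem and that the searcher expects an "index" or "indexes" subdirectory plus a "segments" subdirectory holding at least one segment directory. If the data is on a Hadoop DFS, this check does not apply as written.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: sanity-checks the layout of a local searcher.dir.
public class SearcherDirCheck {

    // Returns a list of problems; an empty list means the assumed layout is present.
    public static List<String> check(File searcherDir) {
        List<String> problems = new ArrayList<String>();
        if (!searcherDir.isDirectory()) {
            problems.add(searcherDir + " is not a directory");
            return problems;
        }
        // Assumption: the searcher reads either "index" (merged) or "indexes" (per-part).
        boolean hasIndex = new File(searcherDir, "index").isDirectory()
                || new File(searcherDir, "indexes").isDirectory();
        if (!hasIndex) {
            problems.add("neither index/ nor indexes/ found");
        }
        // Assumption: summaries are read from segment directories under "segments".
        File segments = new File(searcherDir, "segments");
        File[] children = segments.listFiles();
        int segmentCount = 0;
        if (children != null) {
            for (File child : children) {
                if (child.isDirectory()) {
                    segmentCount++;
                }
            }
        }
        if (segmentCount == 0) {
            problems.add("segments/ is missing or contains no segment directories");
        }
        return problems;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java SearcherDirCheck <searcher.dir path>");
            return;
        }
        List<String> problems = check(new File(args[0]));
        if (problems.isEmpty()) {
            System.out.println("Layout looks plausible.");
        } else {
            for (String p : problems) {
                System.out.println("Problem: " + p);
            }
        }
    }
}
```

Run it with the same path that is configured as searcher.dir. A clean result does not prove the segments match the index, only that the expected directories exist.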
Does anyone have any idea how to solve this problem?
CK