From: stack <st...@ar...> - 2005-08-18 17:59:50
Lukáš Matějka wrote:

>Hi,
>does anybody have an idea?
>

What is your complete indexarcs.sh line? Looks like we're passing in a '*' character -- i.e. ./nutch-data/segments/*/fetcher/data -- and internally the glob character is not being expanded. Try something simple w/o '*' characters for your '-d' value.

St.Ack

>xmatejk2@war:~/nutchwax-0.2.1$ ./bin/indexarcs.sh -s /home...
>Tue Aug 9 13:52:36 CEST 2005 Checking environment variables.
>
>>Tue Aug 9 13:52:36 CEST 2005 Cleaning up all ./nutch-data/ content.
>>Tue Aug 9 13:52:36 CEST 2005 Creating new queue, and segments.
>>Tue Aug 9 13:52:36 CEST 2005 Started segmenting.
>>ERROR: ./nutch-data//queue/ directory does not exist.
>>/home/xmatejk2/nutchwax-0.2.1/bin/arcs2segs.sh DIR_OF_ARCS DIR_FOR_SEGMENTS [#ARCS]
>>Tue Aug 9 13:52:36 CEST 2005 Started build of link database.
>>050809 135236 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-default.xml
>>050809 135236 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-site.xml
>>050809 135236 No FS indicated, using default:local
>>050809 135236 Created webdb at LocalFS,./nutch-data/db
>>050809 135237 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-default.xml
>>050809 135237 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-site.xml
>>050809 135237 No FS indicated, using default:local
>>050809 135237 Updating ./nutch-data/db
>>050809 135237 Updating for ./nutch-data//segments/*
>>Exception in thread "main" java.io.FileNotFoundException: ./nutch-data/segments/*/fetcher/data
>>    at org.apache.nutch.fs.LocalFileSystem.open(LocalFileSystem.java:93)
>>    at org.apache.nutch.io.SequenceFile$Reader.<init>(SequenceFile.java:194)
>>    at org.apache.nutch.io.SequenceFile$Reader.<init>(SequenceFile.java:187)
>>    at org.apache.nutch.io.MapFile$Reader.<init>(MapFile.java:190)
>>    at org.apache.nutch.io.MapFile$Reader.<init>(MapFile.java:179)
>>    at org.apache.nutch.io.ArrayFile$Reader.<init>(ArrayFile.java:50)
>>    at org.apache.nutch.tools.UpdateDatabaseTool.updateForSegment(UpdateDatabaseTool.java:92)
>>    at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:366)
>>050809 135238 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-default.xml
>>050809 135238 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-site.xml
>>050809 135238 Updating ./nutch-data//segments from ./nutch-data//db
>>Exception in thread "main" java.lang.NullPointerException
>>    at org.apache.nutch.tools.UpdateSegmentsFromDb.run(UpdateSegmentsFromDb.java:181)
>>    at org.apache.nutch.tools.UpdateSegmentsFromDb.main(UpdateSegmentsFromDb.java:345)
>>Tue Aug 9 13:52:38 CEST 2005 Started indexing.
>>050809 135239 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-default.xml
>>050809 135239 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-site.xml
>>050809 135239 No FS indicated, using default:local
>>050809 135239 indexing segment: ./nutch-data/segments/*
>>050809 135239 * Opening segment *
>>Exception in thread "main" java.lang.NullPointerException
>>    at org.apache.nutch.indexer.IndexSegment.indexPages(IndexSegment.java:165)
>>    at org.apache.nutch.indexer.IndexSegment.main(IndexSegment.java:263)
>>Tue Aug 9 13:52:39 CEST 2005 Started dedup.
>>050809 135239 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-default.xml
>>050809 135239 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-site.xml
>>050809 135239 No FS indicated, using default:local
>>050809 135240 Reading url hashes...
>>050809 135240 Sorting url hashes...
>>050809 135240 Deleting url duplicates...
>>050809 135240 Deleted 0 url duplicates.
>>050809 135240 Reading content hashes...
>>050809 135240 Sorting content hashes...
>>050809 135240 Deleting content duplicates...
>>050809 135240 Deleted 0 content duplicates.
>>050809 135240 Duplicate deletion complete locally. Now returning to NFS...
>>050809 135240 DeleteDuplicates complete
>>Tue Aug 9 13:52:40 CEST 2005 Merging indices.
>>050809 135240 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-default.xml
>>050809 135240 parsing file:/home/xmatejk2/nutchwax-0.2.1/conf/nutch-site.xml
>>050809 135240 No FS indicated, using default:local
>>050809 135240 merging segment indexes to: ./nutch-data/index
>>050809 135240 done merging
>>
>>-lm
>
>_______________________________________________
>Archive-access-cvs mailing list
>Arc...@li...
>https://lists.sourceforge.net/lists/listinfo/archive-access-cvs
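
One way a literal '*' like the one in the FileNotFoundException above can reach a Java tool: in a POSIX shell, a glob that matches nothing is passed through unchanged. The log does report a missing queue directory, so the segments directory may simply have been empty, but that is an assumption, as is the idea that the script relies on shell expansion at all. A minimal sketch, using a temporary directory and a made-up segment name rather than anything from nutchwax:

```shell
#!/bin/sh
# Sketch only: $tmp/segments is a hypothetical stand-in for ./nutch-data/segments.
tmp=$(mktemp -d)
mkdir "$tmp/segments"

# Empty directory: the pattern matches nothing, so the shell leaves it as-is
# and the program would receive a path that literally ends in '*'.
set -- "$tmp"/segments/*
empty_case=$1

# Once a segment exists (the name here is invented), the same pattern expands.
mkdir "$tmp/segments/20050809135236"
set -- "$tmp"/segments/*
expanded_case=$1

echo "no match: $empty_case"
echo "match:    $expanded_case"
rm -rf "$tmp"
```

If that is what happened here, the fix is upstream of the '*': the segmenting step has to produce at least one segment before the later steps can find anything to expand.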