From: stack <st...@ar...> - 2005-09-14 21:52:42
Lukas Matejka wrote:

>i downloaded new version of nutch from cvs and i think that script
>indexarcs.sh still doesn't work well.
>
>(in previous version i had to use absolute paths and no links in directories)
>

Links should be fine. Works for me.

>with relative paths same result...
>
>in dir archive are symlinks to arcs.
>

The below looks like it's not finding any arcs in
/home/nwa/nutchwax/archive. Are there files with a '.arc.gz' ending in
/home/nwa/nutchwax/archive? We're just skipping through the segmenting
step w/o indexing anything. We then get to the update-from-db step, but
no segments were created at the indexing stage.

St.Ack

>./bin/indexarcs.sh -s /home/nwa/nutchwax/archive -d /home/nwa/nutchwax/data -c test
>St zář 14 23:12:36 CEST 2005 Checking environment variables.
>St zář 14 23:12:36 CEST 2005 Cleaning up all /home/nwa/nutchwax/data content.
>St zář 14 23:12:36 CEST 2005 Creating new queue, and segments.
>St zář 14 23:12:36 CEST 2005 Started segmenting.
>St zář 14 23:12:36 CEST 2005 Started build of link database.
>050914 231237 parsing file:/home/nwa/nutchwax/conf/nutch-default.xml
>050914 231238 parsing file:/home/nwa/nutchwax/conf/nutch-site.xml
>050914 231238 No FS indicated, using default:local
>050914 231238 Created webdb at LocalFS,/home/nwa/nutchwax/data/db
>050914 231239 parsing file:/home/nwa/nutchwax/conf/nutch-default.xml
>050914 231240 parsing file:/home/nwa/nutchwax/conf/nutch-site.xml
>050914 231240 No FS indicated, using default:local
>050914 231240 Updating /home/nwa/nutchwax/data/db
>050914 231240 Updating for /home/nwa/nutchwax/data/segments/*
>Exception in thread "main" java.io.FileNotFoundException: /home/nwa/nutchwax/data/segments/*/fetcher/data
>        at org.apache.nutch.fs.LocalFileSystem.open(LocalFileSystem.java:93)
>        at org.apache.nutch.io.SequenceFile$Reader.<init>(SequenceFile.java:194)
>        at org.apache.nutch.io.SequenceFile$Reader.<init>(SequenceFile.java:187)
>        at org.apache.nutch.io.MapFile$Reader.<init>(MapFile.java:190)
>        at org.apache.nutch.io.MapFile$Reader.<init>(MapFile.java:179)
>        at org.apache.nutch.io.ArrayFile$Reader.<init>(ArrayFile.java:50)
>        at org.apache.nutch.tools.UpdateDatabaseTool.updateForSegment(UpdateDatabaseTool.java:92)
>        at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:366)
>050914 231242 parsing file:/home/nwa/nutchwax/conf/nutch-default.xml
>
>l.
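
A quick way to answer the question above (are there any '.arc.gz' files in the source directory?) might look like the sketch below. The function name count_arcs is made up for illustration and is not part of nutchwax or indexarcs.sh. It assumes a POSIX find, uses -L so that symlinked arcs are followed, and searches recursively, which may or may not match how indexarcs.sh itself scans the directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of nutchwax): print how many files ending
# in '.arc.gz' exist under the directory given as $1.
# -L tells find to follow symbolic links, so a symlink that points at a
# real ARC file is counted as a regular file.
count_arcs() {
    find -L "$1" -type f -name '*.arc.gz' | wc -l | tr -d ' '
}

# Example usage (path taken from the thread):
#   count_arcs /home/nwa/nutchwax/archive
```

If this prints 0 for the directory passed to -s, the segmenting step has nothing to work on, which would be consistent with the empty segments directory seen in the log.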