From: Kaisa K. <kau...@cc...> - 2006-11-06 11:54:44
I deleted everything old and started a new index, and now the log shines with the golden words 'nutchwax finished'. I vaguely remember deleting old indexes every now and then when testing different versions of hadoop+nutchwax, but probably didn't do it when it was really needed.

OK, the one-arc test went smoothly with hadoop-0.5.0 + nutchwax-0.7.0-200611030343. Next I'll try to index the whole of our library's recent mini-size music archive.

Many thanks,
Kaisa

On Sat, 4 Nov 2006, Michael Stack wrote:

> When you changed hadoop+nutchwax combinations, did you clean the target
> directory of all previous outputs? What I see in the log below is that the
> import works fine but when we move to do the crawldb update, it's complaining
> that the sequencefiles it's being fed don't jibe with what it already
> digested. Was there a crawldb already in-place made with a different
> version of hadoop?
>
> You should use the latest nutchwax build+hadoop-0.5.0. Current nutchwax is
> based on the nutch 0.8.1 release. Nutch 0.8.1 is built against hadoop-0.5.0.
> Nutch and Hadoop are moving at different rates.
> The latest nutchwax+hadoop-0.5.0 is what we're currently using internally
> running a large indexing job: ~800 million documents. We're learning lots
> operating at this new scale. I'll try and summarize our findings and post
> them alongside the new release when it goes out (Should happen when this big
> job completes -- in a week or so).
>
> Yours,
> St.Ack
>
> Kaisa Kaunonen wrote:
>>
>> Hi all,
>>
>> I don't seem to find a combination of hadoop-0.5.0 and
>> nutchwax-0.6.x or nutchwax-0.7.x that would index on my
>> machines.
>>
>> hadoop-0.5.0 + nutchwax-0.6.1 (latest official) fails
>> (for different reasons than 0.7.0-200611030343)
>>
>> hadoop-0.5.0 + nutchwax-0.7.0-200611030343 (latest build artifact) fails
>>
>> Attached log from the 0.7.0 run when trying to index one arc.
>> The run stops by saying 'A record version mismatch occurred.
>> Expecting v3, found v5'
>>
>> Best,
>> Kaisa Kaunonen
>> Nat.Lib.Finland