From: Roger F. <rog...@ho...> - 2002-03-01 03:40:11
Ah, sweet, just moving the db.log file fixed it. You nailed it! Thanks very
much! (Sorry to cc this to the list even though it's not much of a message,
but I figured someone might find it helpful in the future while searching
the archives.)

Thanks!
Rog.

>According to Roger Filk:
> > Hmm, yeah, I tried that, but this is what happens (maybe this is the
> > expected behavior):
> > I type:
> > ./htdig -vv -m ../conf/start.url
> >
> > and then this shows up:
> > New server: one.domain.com, 80
> > New server: two.domain.com, 80
> > .. and the list goes on for every domain I have already crawled, even
> > though I only have 6 urls in the start.url file. Is that the expected
> > result? Once it lists all those servers, it does seem to start indexing
> > them (I tailed the logfile for the server the files are on and it's
> > definitely hitting a lot of them, I'm assuming all of them).
>...
> > I forgot to mention in the previous reply to you that I am indeed using
> > 3.1.6, just downloaded it the other day. I figured I should clear that
> > up so you're not wondering if I use a later one.
>
>Well, this certainly isn't the expected behavior of 3.1.6's -m option,
>but I can think of a possible explanation. If you aborted your earlier
>run of htdig, before using the -m option, it would have saved a list of
>URLs left to dig in the db.log file in your database_dir. A subsequent
>run of htdig, even with -m, would reload this db.log file and dig at least
>those URLs even if the hop count is 0. If you abort the htdig -m, it
>will still save its list of URLs to dig in db.log, so the cycle goes on.
>
>The -m option probably ought to turn off the log loading feature, so the
>log stays there for the next non-minimal dig. The problem with that,
>though, is that if you abort a regular dig, then do a minimal dig and
>abort it too, you won't be able to resume the previous regular dig. I
>guess you could debate it either way.
>
>If you want to disable log file loading on -m, then
>find the section of htdig/htdig.cc that looks like this:
>
>        case 'm':
>            minimalFile = optarg;
>            max_hops = "0";
>            break;
>
>and add this line right after the max_hops = "0"; assignment:
>
>            flags = Retriever_noLog;
>
>Or, if you just want to disable log file loading as a one-shot deal, you
>can remove, rename or move the db.log file so htdig -m doesn't load it.
>
>Hope this helps. Please let us know if this isn't the cause and fix for
>your problem.
>
>--
>Gilles R. Detillieux              E-mail: <gr...@sc...>
>Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
>Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
>Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
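
For anyone finding this in the archives: the one-shot workaround described
above (moving db.log out of the way before a minimal dig) could be scripted
roughly like this. It is only a sketch. The function name
set_aside_dig_log, the ".aside" suffix and the example path are made up for
illustration and are not part of ht://Dig; db.log and database_dir are as
described in the quoted message.

```shell
#!/bin/sh
# Sketch: set aside a leftover db.log so that "htdig -m" does not reload it.
# The function name and the ".aside" suffix are hypothetical, chosen for
# this example only.

set_aside_dig_log() {
    db_dir=$1                 # your database_dir, passed as the argument
    log="$db_dir/db.log"      # URL list saved by an aborted dig
    if [ -f "$log" ]; then
        # Rename rather than delete, so the list is not lost.
        mv "$log" "$log.aside" && echo "moved $log to $log.aside"
    else
        echo "no db.log in $db_dir, nothing to do"
    fi
}

# Example use (the path is an assumption; substitute your own database_dir):
#   set_aside_dig_log /opt/htdig/db
#   ./htdig -vv -m ../conf/start.url
```

Note that mv will overwrite an older db.log.aside if one is already there.
Going by the explanation above, moving the file back to db.log before the
next regular run should let htdig pick up the interrupted dig again, though
I have not verified that.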