From: Manuel L. <ml...@ac...> - 2004-11-08 00:32:00
Hello,

On 11/05/2004 02:41 PM, Neal Richter wrote:
> htdig -i forces a 'from scratch' recrawl.
>
> htdig by default does a traversal of the existing index and issues HEAD
> requests to see if a page has changed. Exactly what you described below...
>
> Please make sure you have 'head_before_get' enabled.
>
> What version are you using?

3.1.6.

OK, but my idea was to avoid making htdig go through 10000 HEAD requests. If possible, I would like to tell htdig to crawl just a few pages whose addresses I supply, because I know which ones have changed. It would then just add or update the index entries for those pages and for any new linked pages that it finds.

Is this possible?

--
Regards,
Manuel Lemos

PHP Classes - Free ready to use OOP components written in PHP
http://www.phpclasses.org/

PHP Reviews - Reviews of PHP books and other products
http://www.phpclasses.org/reviews/

Metastorage - Data object relational mapping layer generator
http://www.meta-language.net/metastorage.html