From: Manuel L. <ml...@ac...> - 2005-07-22 04:08:02
Hello,

on 07/19/2005 08:10 PM Christopher Murtagh said the following:
> On Tue, 2005-07-19 at 19:45 -0300, Manuel Lemos wrote:
>> Great. I hope that will allow us to do things like making Htdig crawl
>> individual pages and only update their entries in the index. That is
>> what I miss most in the current HTDig version.
>
> I'm using htdig 3.2 for doing incremental indexing right now and it
> seems to be working fine. What sort of problems are you having?
>
> To remove a list of URLs:
>
> htpurge -c conf_file.conf -u list_of_urls.txt
>
> To do an incremental index:
>
> echo URL_list.txt | htdig -m foo -c conf_file.conf -
>
> (notice the trailing '-'). Making this work wasn't obvious, but I had a
> bit of help from the list, and it's all working for me now.

Hmm... I had the impression from a message posted to this list that when you do incremental indexing, htdig still traverses all pages but only performs HEAD requests to verify whether other pages were updated. Is that what happens, or did I misunderstand the point?

Another thing that confuses me about the example above is the parameter that follows the -m switch. If it is supposed to read from STDIN, why foo and not just - ?

Other than that, I want to update the existing index database files while letting users search the current databases until htdig has finished. Will adding the -a switch to the htdig command line work OK when just updating a few URLs as you suggest? Should I follow the htdig command with the usual htmerge and htfuzzy calls, as in a full reindex?

If it works like this, that will solve my problem. In that case, I plan to update my HTDIG PHP interface class and release a new version soon for the benefit of all who use HTDIG with PHP.
http://www.phpclasses.org/htdiginterface

--

Regards,
Manuel Lemos

PHP Classes - Free ready to use OOP components written in PHP
http://www.phpclasses.org/

PHP Reviews - Reviews of PHP books and other products
http://www.phpclasses.org/reviews/

Metastorage - Data object relational mapping layer generator
http://www.meta-language.net/metastorage.html