I would like htdig to index about 2000 internet pages (outside my server). I compiled a file listing all the URLs and used it as both start_url and limit_urls_to, so that only the pages listed in the file are indexed. When I run `rundig -v`, it echoes the date and then hangs with no further output. If I use a start_url file containing only about 50 URLs, it finishes the indexing within 2-3 minutes with timely verbose output. I wonder why it won't work with more URLs.
Since it works with 50 URLs, I could split my URLs into separate files and run them one by one, but then I would have to edit the conf file once for each URL file. Is it possible to specify the start_url file on the command line? Another question: does the next run of rundig erase the database created by a previous run, or just update it?
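One possible workaround, sketched below: instead of editing a single conf file repeatedly, generate one config per chunk of URLs and pass each to rundig with its `-c` option. This assumes htdig's backquoted-filename syntax for reading an attribute's value from a file; the filenames here (urls.txt, htdig.conf.base) are placeholders for your own files.

```shell
# Demo inputs -- replace with your real 2000-URL list and your base config.
printf 'http://example.com/a\nhttp://example.com/b\n' > urls.txt
: > htdig.conf.base

# Split the URL list into 50-line chunks (split names them chunk_aa, chunk_ab, ...).
split -l 50 urls.txt chunk_

for f in chunk_*; do
    # Build a per-chunk config: copy the base config, then point
    # start_url and limit_urls_to at this chunk via the backquote syntax.
    cp htdig.conf.base htdig.conf."$f"
    printf 'start_url: `%s/%s`\n'     "$(pwd)" "$f" >> htdig.conf."$f"
    printf 'limit_urls_to: `%s/%s`\n' "$(pwd)" "$f" >> htdig.conf."$f"
    # Then index each chunk with its own config (uncomment to run):
    # rundig -v -c htdig.conf."$f"
done
```

Whether successive runs merge into one database or overwrite it depends on how rundig invokes htdig (an initial dig rebuilds from scratch), so check the rundig script shipped with your version before relying on this.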
Thank you for any clues.