From: Mohler, J. <jef...@ne...> - 2001-11-02 20:04:53
Thanks: I'll work on digesting all this information over the coming days.
When you say local_urls, are you speaking of feeding rundig/htdig a
directory path to parse instead of a URL itself?

-----Original Message-----
From: Gilles Detillieux [mailto:gr...@sc...]
Sent: Friday, November 02, 2001 11:53 AM
To: jef...@ne...
Cc: htd...@li...
Subject: Re: [htdig] General question on database updates..

According to Mohler, Jeff:
> As I understand it, the database will only reflect new content added
> to a WWW site if a full rebuild is done..correct?

No, you can do update digs on an existing database. You just can't use
the standard rundig script for this, because it passes the -i option to
htdig. You need to call htdig and htmerge directly, or come up with a
different script for doing update digs. There's at least one in the
"Contributed Works" section of the www.htdig.org web site.

> Just trying to find the most efficient way to update my many-hundred
> mailing list archives.
>
> I use mhonarc to split domo archives into html files in their own
> trees, then have htdig build databases for each dl-list (each has
> their own tree).

Depending on the total overall size, and whether or not you use
local_urls to speed up local indexing, a standard update dig may do the
trick. Update digs look up every URL in the database to see whether the
document has changed since it was last indexed. This is reasonably
quick, especially if you use local_urls. However, for huge mailing list
archives, where the majority of files never change once archived, some
users prefer to split up the job to index only the last day, week or
month of data, and then use htmerge -m to merge it into the master
database. See 4.4 and 4.5 in the FAQ on the web site.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre      WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
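The update-dig approach Gilles describes (call htdig without -i, then
htmerge, instead of using the stock rundig) can be sketched as a small
wrapper script. This is only an illustrative sketch, not one of the
scripts from the "Contributed Works" section: the CONF and DB paths and
the local_urls value are hypothetical and site-specific, and only the
htdig/htmerge invocations and the -m merge option come from the message
above.

```shell
#!/bin/sh
# Sketch of an update-dig script. Unlike the stock rundig, it omits
# htdig's -i (initial) flag, so the existing database is updated in
# place rather than rebuilt from scratch.

# Hypothetical paths -- adjust for your installation.
CONF=/etc/htdig/htdig.conf

# Speed tip from the message: set local_urls in the config file so
# htdig reads archived pages straight from disk instead of over HTTP,
# e.g. (example mapping, not from the message):
#   local_urls: http://www.example.com/=/home/www/

# Update dig: revisit every URL already in the database and re-index
# only documents that changed since the last dig. The -a flag uses
# alternate work files so searches can continue during the dig.
htdig -a -c "$CONF" || exit 1

# Rebuild the word and document indexes from the updated databases.
htmerge -a -c "$CONF" || exit 1

# Alternative for huge, mostly-static archives (FAQ 4.4 and 4.5):
# dig only the recent portion under a second config, then fold that
# work database into the master with htmerge's -m option, e.g.:
#   htmerge -c "$CONF" -m recent.conf
```

A cron job running such a script nightly, with a full rundig only on
rare occasions (e.g. after configuration changes), is one common way to
keep large archive indexes current without re-digging everything.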