My htdig-indexed site is mirrored. So far, I've asked
the mirror maintainers to run rundig on each of their
mirrors nightly after retrieving the latest updates --
but some of the mirrors can take hours to execute
rundig. It would be trivial to have the mirrors pick
up the indices generated by rundig on the master site,
but then users of the mirrors would get search results
that point back to the master site (defeating the
purpose of the mirrors, which is to distribute the load
and to give better service to those with poor
connections to the master site).

What would help is a way to rewrite the URLs returned
by htsearch on the fly, replacing the master hostname
in each one with the mirror's hostname. I'm thinking
of doing this by postprocessing the htsearch output
with sed (or similar), but there must be a cleaner way
to do this.
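
For concreteness, here is a rough sketch of the postprocessing idea, written in Python rather than sed. It is only an illustration: the function name rewrite_host and the hostnames www.example.org and mirror.example.net are made up, and it assumes the htsearch output contains absolute http:// or https:// URLs naming the master host.

```python
import re


def rewrite_host(text, master, mirror):
    """Return text with the master hostname replaced by the mirror
    hostname in every http:// or https:// URL.

    Hypothetical helper, not part of ht://Dig.
    """
    # Match the scheme plus the master hostname, but only when the
    # hostname is not followed by another hostname character, so that
    # "www.example.org.other.net" is left untouched.
    pattern = re.compile(
        r"(https?://)" + re.escape(master) + r"(?![A-Za-z0-9.-])",
        re.IGNORECASE,
    )
    # A function replacement keeps the original scheme and avoids any
    # backslash interpretation in the mirror name.
    return pattern.sub(lambda m: m.group(1) + mirror, text)


# Example with made-up hostnames.
sample = '<a href="http://www.example.org/docs/a.html">a.html</a>'
rewritten = rewrite_host(sample, "www.example.org", "mirror.example.net")
```

A wrapper on each mirror could run htsearch, pass its output through a filter like this, and send the result to the browser. The obvious weakness, same as with sed, is that it rewrites every matching URL in the page, not just the ones in the search results.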

Here's another way to accomplish this end: when
generating indices, recognize and encode the current
hostname as <HOST> (or the distinctive token of your
choice); then allow a setting in htdig.conf to control
how <HOST> is expanded by htsearch. Not as general as
a full-fledged URL rewriter, but perhaps easier to