#68 noindex function

htdig (31)

It would be nice to have a feature that allows htdig to
crawl a portion of an html page but not index the
content of that particular section.

For example:

Most of my pages have a header (with links) and a left
navigation (with links). When I search from my search
page, I often get excessive results because the word I
searched for also appears as a link in the left nav. I end
up with 5 or 6 pages whose actual content contains the
word, and about 30 pages that only have the word in the
left navigation.

I tried wrapping the leftnav section of each page in
noindex_start and noindex_end, but now htdig will not
crawl the links contained in the leftnav.

A function similar to noindex_start and noindex_end that
still allows htdig to crawl any links found between the
two markers would be great. In other words, this *new*
function would still crawl the content between the
noindex tags but would not store that content in the
htdig db files.
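
To make the request concrete, here is a rough sketch of the behavior I mean. It is written in Python with the standard html.parser module and is not htdig code. The marker strings, the class name FollowButSkipParser and the helper parse_page are all made up for illustration. I am assuming the markers are HTML comments of the form <!--htdig_noindex--> and <!--/htdig_noindex-->; change them to whatever noindex_start and noindex_end are set to in your config.

```python
import re
from html.parser import HTMLParser

# Assumed marker strings (hypothetical); adjust to match the values of
# noindex_start / noindex_end in your own htdig configuration.
NOINDEX_START = "htdig_noindex"
NOINDEX_END = "/htdig_noindex"


class FollowButSkipParser(HTMLParser):
    """Collects every link, but keeps words only outside the markers."""

    def __init__(self):
        super().__init__()
        self.links = []        # hrefs to crawl, gathered everywhere
        self.words = []        # words to index, gathered outside marked regions
        self.skipping = False  # True while inside a noindex region

    def handle_comment(self, data):
        marker = data.strip()
        if marker == NOINDEX_START:
            self.skipping = True
        elif marker == NOINDEX_END:
            self.skipping = False

    def handle_starttag(self, tag, attrs):
        # Links are recorded regardless of the skipping flag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Text is recorded only when we are not inside a noindex region.
        if not self.skipping:
            self.words.extend(re.findall(r"\w+", data.lower()))


def parse_page(html):
    """Return (links_to_crawl, words_to_index) for one page."""
    parser = FollowButSkipParser()
    parser.feed(html)
    parser.close()
    return parser.links, parser.words
```

With a page whose leftnav is wrapped in the markers, the leftnav link would still be queued for crawling, but the link text would not end up in the word list. That is the difference from the current noindex_start / noindex_end behavior I described above.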

