#68 noindex function


It would be nice to have a feature that allows htdig to
crawl a portion of an html page but not index the
content of that particular section.

For example:

Most of my pages have a header (with links) and a left
navigation (with links). When I search from my search
page, I often get too many results because the word I
searched for is also a link in the left nav. I get 5 or 6
pages whose actual content contains the word and about
30 pages that only have the word in the left navigation.

I tried wrapping the leftnav section of each page in
noindex_start and noindex_end, but now htdig will not
crawl the links contained in the leftnav.

A function similar to noindex_start and noindex_end, but
one that still lets htdig crawl any links found between
the two markers, would be great. In other words, this
*new* function would follow the links between the
noindex tags but would not store the content between
them in the htdig db files.
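
A rough Python sketch of the requested behavior, to make it concrete. This is not htdig code and not how htdig is implemented: the class and function names (LinkAndTextParser, crawl_page) are hypothetical, and the marker strings are assumed values for noindex_start and noindex_end. Links are collected everywhere on the page, while words are collected only outside the markers.

```python
import re
from html.parser import HTMLParser

# Assumed marker strings; in htdig these would come from the
# noindex_start / noindex_end settings.
NOINDEX_START = "<!--htdig_noindex-->"
NOINDEX_END = "<!--/htdig_noindex-->"


class LinkAndTextParser(HTMLParser):
    """Collects every href, but collects words only outside the markers."""

    def __init__(self):
        super().__init__()
        self.links = []         # URLs to crawl, from the whole page
        self.words = []         # words to index, outside noindex sections
        self.skip_text = False  # True while inside a noindex section

    def handle_starttag(self, tag, attrs):
        # Links are followed regardless of the noindex state.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_comment(self, data):
        # HTMLParser passes the comment body without "<!--" and "-->".
        comment = "<!--" + data + "-->"
        if comment == NOINDEX_START:
            self.skip_text = True
        elif comment == NOINDEX_END:
            self.skip_text = False

    def handle_data(self, data):
        # Text is stored only when we are outside a noindex section.
        if not self.skip_text:
            self.words.extend(re.findall(r"\w+", data.lower()))


def crawl_page(html):
    """Return (links_to_follow, words_to_index) for one page."""
    parser = LinkAndTextParser()
    parser.feed(html)
    parser.close()
    return parser.links, parser.words


page = (
    "<!--htdig_noindex-->"
    '<a href="/products.html">Products</a>'
    "<!--/htdig_noindex-->"
    "<p>Welcome to our site.</p>"
)
links, words = crawl_page(page)
print(links)  # ['/products.html']
print(words)  # ['welcome', 'to', 'our', 'site']
```

With this behavior, the "Products" link in the leftnav is still followed, but a search for "products" would no longer match every page that merely carries the navigation.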

