#48 ignore_noindex and ignore_nofollow


For indexing a local server, it would be handy to have
an option that makes htdig ignore the noindex and
nofollow meta tags and the robots.txt file.

I think that the approach via User-agent in robots.txt
is not precise enough in some cases (e.g. an internally
used index where all files should be indexed and an
external index where some files shouldn't be indexed).


    Geoff Hutchison - 2002-01-04

    Logged In: YES

    I don't see how the user-agent and the robots.txt don't
    solve your problem. If you add an internal "my-htdig" name
    for your internal indexing, you could certainly set up rules
    to allow all files to be indexed.

    I think I'd need to know more about why you think this is a
    needed feature.
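
The per-agent rules suggested above can be sketched as follows. This is only an illustration under stated assumptions: the robots.txt contents, the `/private/` path, and the `intranet.example` host are hypothetical, and Python's standard `urllib.robotparser` stands in for htdig's own robots.txt handling, which may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the internal crawler name "my-htdig" may fetch
# everything, while every other agent is kept out of /private/.
ROBOTS_TXT = """\
User-agent: my-htdig
Disallow:

User-agent: *
Disallow: /private/
"""


def allowed(agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URL on an internal server.
    private_url = "http://intranet.example/private/report.html"
    print("my-htdig:", allowed("my-htdig", private_url))
    print("htdig:   ", allowed("htdig", private_url))
```

With rules like these, the internal index would run under the "my-htdig" name and see every file, while an external index running under the default name would still skip the restricted paths.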

    Geoff Hutchison - 2002-02-12
    • assigned_to: nobody --> ghutchis
    • status: open --> closed-works-for-me

    Geoff Hutchison - 2002-02-12

    Logged In: YES

    I still do not see cases where this is a problem. Furthermore, I disagree that ht://Dig should ever *not* follow the standards for robot behavior, local pages or not. Certainly it's easier on a local server to have the page tags or robots.txt file changed to suit your indexing, esp. considering the robots.txt name can be set.


