When running in a mode that fully or partially ignores
robots.txt, it would help if crawl.log entries were
annotated with an indicator of whether robots.txt rules,
if applied, would have precluded the fetch. That would make
it easier to analyze what is being gained by ignoring the
rules, and which robots rules would be best to follow
without losing key content.
This could use the CrawlURI addAnnotation() facility.
It might be most easily configured on PreconditionEnforcer
as two separate robots policies: one to honor, and one to
check and annotate without honoring.
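
The two-policy idea could be sketched roughly as follows. This is only an illustration, not Heritrix code: the `RobotsMode` enum, the `Uri` stand-in class, the `robotsExcluded` annotation text, and the prefix-only robots matching are all assumptions made for the example. Only the notion of an `addAnnotation()` call on the URI comes from the request above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RobotsAnnotationSketch {

    /** Hypothetical policy names; not actual Heritrix settings. */
    enum RobotsMode { HONOR, ANNOTATE_ONLY }

    /** Minimal stand-in for a crawl URI that can carry log annotations. */
    static class Uri {
        final String path;
        final List<String> annotations = new ArrayList<>();

        Uri(String path) { this.path = path; }

        void addAnnotation(String annotation) { annotations.add(annotation); }
    }

    /** Simplified robots check: precluded if the path starts with a Disallow prefix. */
    static boolean precluded(String path, List<String> disallowPrefixes) {
        for (String prefix : disallowPrefixes) {
            if (!prefix.isEmpty() && path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Returns true if the fetch should proceed.
     * HONOR blocks a precluded URI; ANNOTATE_ONLY lets it through
     * but records that robots.txt would have precluded it.
     */
    static boolean checkPrecondition(Uri uri, List<String> disallow, RobotsMode mode) {
        if (!precluded(uri.path, disallow)) {
            return true;
        }
        if (mode == RobotsMode.HONOR) {
            return false;
        }
        uri.addAnnotation("robotsExcluded"); // assumed annotation text
        return true;
    }

    public static void main(String[] args) {
        List<String> disallow = Arrays.asList("/private/");
        Uri uri = new Uri("/private/report.html");
        boolean fetch = checkPrecondition(uri, disallow, RobotsMode.ANNOTATE_ONLY);
        System.out.println(fetch + " " + uri.annotations);
    }
}
```

In annotate-only mode the URI is still fetched, so the annotation is the only trace of the rule; a log writer would then need to append the URI's annotations to its crawl.log line for the analysis described above.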
Karl Thiessen
None
1.6.0
Public
Date: 2007-03-14 01:31
Date: 2005-11-03 02:27
| Field | Old Value | Date | By |
|---|---|---|---|
| status_id | Open | 2005-12-02 17:29 | stack-sf |
| close_date | - | 2005-12-02 17:29 | stack-sf |
| artifact_group_id | None | 2005-11-03 02:27 | gojomo |
| assigned_to | gojomo | 2005-11-03 02:27 | gojomo |
| priority | 6 | 2005-11-01 00:59 | gojomo |
| assigned_to | nobody | 2005-11-01 00:59 | gojomo |
| priority | 5 | 2004-09-01 21:51 | gojomo |