Heritrix: Internet Archive Web Crawler

Tracker: Feature Requests

5 Improved handling when allotted runtime is exceeded - ID: 1518583
Last Update: Comment added ( karl-ia )

Currently the 'max-time' feature will terminate a crawl
once it has exceeded the value assigned to it.

It would be useful to be able to also pause jobs once
the limit is reached or to 'block URIs' the way the
QuotaEnforcer does.


Submitted: Kristinn Sigurdsson ( kristinn_sig ) - 2006-07-07 08:34
Priority: 5
Status: Closed
Resolution: None
Assigned To: Kristinn Sigurdsson
Category: Configuration
Group: 1.10.0
Visibility: Public


Comments ( 2 )

Date: 2007-03-14 01:48
Sender: karl-ia


This issue is now discussed in the new JIRA tracker at
http://webteam.archive.org/jira/browse/HER-1019 -- please add further
comments at that location.


Date: 2006-07-07 09:22
Sender: kristinn_sig (Project Admin)

Resolved by adding a new processor, RuntimeLimitEnforcer.

CVS commit message:
Implementing RFE [ 1518583 ] Improved handling when allotted runtime is exceeded
* src/java/org/archive/crawler/prefetch/RuntimeLimitEnforcer.java
  A new Processor that makes it possible to configure Heritrix to
  a) Pause job,
  b) Terminate job, or
  c) Block URIs (similar to QuotaEnforcer)
  once a set runtime has been exceeded.
* src/conf/modules/Processor.options
  Added reference to RuntimeLimitEnforcer processor
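
For readers unfamiliar with the three options, the decision described in the commit message can be sketched roughly as follows. This is an illustration only, not the RuntimeLimitEnforcer source: the class name RuntimeLimitSketch, the EndOperation and Decision enums, and the check method are all hypothetical, and no Heritrix APIs are used. It assumes the limit is given in seconds and that the caller supplies the elapsed crawl time in milliseconds.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a runtime-limit check; not the Heritrix implementation. */
class RuntimeLimitSketch {

    /** Configured action once the limit is exceeded (illustrative names). */
    enum EndOperation { PAUSE_JOB, TERMINATE_JOB, BLOCK_URIS }

    /** Result of checking a single URI against the limit (illustrative names). */
    enum Decision { PROCEED, PAUSE_CRAWL, TERMINATE_CRAWL, SKIP_URI }

    private final long limitMillis;
    private final EndOperation endOperation;

    RuntimeLimitSketch(long limitSeconds, EndOperation endOperation) {
        this.limitMillis = TimeUnit.SECONDS.toMillis(limitSeconds);
        this.endOperation = endOperation;
    }

    /** Decide what to do with the current URI, given the elapsed crawl time. */
    Decision check(long elapsedMillis) {
        if (elapsedMillis <= limitMillis) {
            return Decision.PROCEED; // limit not yet exceeded, let the URI through
        }
        switch (endOperation) {
            case PAUSE_JOB:
                return Decision.PAUSE_CRAWL;     // a) pause the job
            case TERMINATE_JOB:
                return Decision.TERMINATE_CRAWL; // b) terminate the job
            case BLOCK_URIS:
                return Decision.SKIP_URI;        // c) keep the job alive, reject the URI
            default:
                throw new IllegalStateException("Unknown operation: " + endOperation);
        }
    }

    public static void main(String[] args) {
        RuntimeLimitSketch enforcer =
                new RuntimeLimitSketch(3600, EndOperation.BLOCK_URIS);
        System.out.println(enforcer.check(TimeUnit.MINUTES.toMillis(30))); // PROCEED
        System.out.println(enforcer.check(TimeUnit.MINUTES.toMillis(61))); // SKIP_URI
    }
}
```

In this sketch, blocking differs from the other two options in that the crawl itself keeps running and each late URI is rejected individually, which is the behaviour the request compares to QuotaEnforcer.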


Changes ( 3 )

Field              Old Value  Date              By
artifact_group_id  0.10.0     2006-08-22 12:14  kristinn_sig
status_id          Open       2006-07-07 09:22  kristinn_sig
close_date         -          2006-07-07 09:22  kristinn_sig