Heritrix: Internet Archive Web Crawler

Tracker: Feature Requests

7 Add bdb alreadyseen option to hostsqueuesfrontier - ID: 1050378
Last Update: Comment added ( karl-ia )

Add an option to HostQueuesFrontier that allows using
the BDB-backed already-seen in place of the in-memory
already-seen.


Michael Stack ( stack-sf ) - 2004-10-19 23:15

7

Closed

None

Michael Stack

None

None

Public


Comments ( 2 )

Date: 2007-03-14 01:35
Sender: karl-ia


This issue is now discussed in the new JIRA tracker at
http://webteam.archive.org/jira/browse/HER-840 -- please add further
comments at that location.


Date: 2004-10-25 18:16
Sender: stack-sf (Project Admin)

Implemented. Below is the commit message.
[debord 439] heritrix > more /tmp/diff.txt
* src/java/org/archive/crawler/frontier/HostQueuesFrontier.java
  Added a boolean option that defaults to false; if it is true, we use
  a BDB already-included.
* src/java/org/archive/crawler/util/BdbUriUniqFilter.java
  Added a constructor that doesn't take a cache percentage sizing.
  Converted DatabaseException to an IOException so no one else has
  to do BDBJE imports.
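
The two ideas in the commit message (a boolean switch between an in-memory and a store-backed already-seen, and translating the store's checked exception into an IOException so callers need no store-specific imports) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Heritrix code: AlreadySeen, MemoryAlreadySeen, StoreBackedAlreadySeen, StoreException and AlreadySeenFactory are hypothetical names, and a HashSet stands in for the real BDB JE database so that the sketch has no external dependencies.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a storage-layer checked exception such as BDB JE's DatabaseException. */
class StoreException extends Exception {
    StoreException(String message) { super(message); }
}

/** Hypothetical minimal already-seen contract; not Heritrix's actual interface. */
interface AlreadySeen {
    /** Returns true if the URI was new and has now been recorded. */
    boolean add(String uri) throws IOException;
}

/** In-memory variant: simple, but bounded by heap size. */
class MemoryAlreadySeen implements AlreadySeen {
    private final Set<String> seen = new HashSet<>();

    @Override
    public boolean add(String uri) {
        return seen.add(uri);
    }
}

/** Stand-in for a disk-backed variant; a real one would write to a database. */
class StoreBackedAlreadySeen implements AlreadySeen {
    // Placeholder for an on-disk store (assumption: the real store can fail with a checked exception).
    private final Set<String> store = new HashSet<>();

    private boolean put(String uri) throws StoreException {
        if (uri == null) {
            throw new StoreException("null key");
        }
        return store.add(uri);
    }

    @Override
    public boolean add(String uri) throws IOException {
        try {
            return put(uri);
        } catch (StoreException e) {
            // Translate to IOException so callers never import the store's exception type.
            throw new IOException("already-seen store failed: " + e.getMessage(), e);
        }
    }
}

/** Chooses the implementation from a boolean option that callers would default to false. */
class AlreadySeenFactory {
    static AlreadySeen create(boolean useStoreBacked) {
        return useStoreBacked ? new StoreBackedAlreadySeen() : new MemoryAlreadySeen();
    }
}
```

A caller would obtain the filter with AlreadySeenFactory.create(false) for the in-memory default, or pass true to opt in to the store-backed variant; in both cases it only ever handles IOException.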



Changes ( 2 )

Field       Old Value   Date               By
status_id   Open        2004-10-25 18:16   stack-sf
close_date  -           2004-10-25 18:16   stack-sf