Heritrix: Internet Archive Web Crawler

Tracker: Bugs

8 CachedBdbMap NPE killing off threads. - ID: 1208804
Last Update: Comment added ( karl-ia )

From Tom Emerson, reporting that the thread count
shrinks over a long-running crawl:

There have been no alerts. However, there are 11 NPEs
in the log:

java.lang.NullPointerException
    at org.archive.util.CachedBdbMap$SoftEntry.clearPhantom(CachedBdbMap.java:508)
    at org.archive.util.CachedBdbMap.expungeStaleEntry(CachedBdbMap.java:456)
    at org.archive.util.CachedBdbMap.get(CachedBdbMap.java:343)
    at org.archive.crawler.frontier.BdbFrontier.next(BdbFrontier.java:514)
    at org.archive.crawler.framework.ToeThread.run(ToeThread.java:135)

These would account for the 11 missing threads.

Thanks.

-tree

P.S. Status --- 3,943,333 of 11,406,315 documents
downloaded. 46,389 queues.


Submitted: Michael Stack ( stack-sf ) - 2005-05-25 22:38
Priority: 8
Status: Closed
Resolution: None
Assigned: Karl Thiessen
Category: General
Group: 1.6.0
Visibility: Public


Comments ( 4 )

Date: 2007-03-14 00:53
Sender: karl-ia


This issue is now discussed in the new JIRA tracker at
http://webteam.archive.org/jira/browse/HER-420 -- please add further
comments at that location.


Date: 2005-05-31 23:15
Sender: gojomo (Project Admin)

Assigning to Karl... but as a subtle synchronization/timing
bug that'd only be expected to show up in busy, long-running
crawls, how should we consider this 'closed'/verified?


Date: 2005-05-26 18:08
Sender: gojomo (Project Admin)

Probable fix committed. Commit comment:

Fix for [ 1208804 ] CachedBdbMap NPE killing off threads.
* WorkQueuesFrontier.java
ensure allQueues is a synchronized map


Date: 2005-05-26 01:32
Sender: gojomo (Project Admin)

As the comment for CachedBdbMap notes, "Requires external
synchronization."

The implicated get() isn't synchronized against other get()s
from other threads; since each get() also involves cleanup
of the soft entries, another thread could overtake the
thread that's just determined it needs to expunge a
particular entry, and expunge that entry as part of a full
expunge, causing the NPE.

Probably the easiest fix is to wrap allQueues in a
synchronizedMap. (Was the code like this previously?) A
narrower fix might be possible with further investigation.
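
For readers unfamiliar with the suggested fix, here is a minimal
sketch of what wrapping a map in a synchronizedMap looks like. It
is not the Heritrix code: the class AllQueuesSketch and its methods
wrap() and hammer() are hypothetical names, a plain HashMap stands
in for CachedBdbMap, and the syntax is modern Java rather than what
Heritrix used at the time.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only. HashMap plays the role of a map that
// "requires external synchronization"; it is NOT Heritrix's CachedBdbMap.
class AllQueuesSketch {

    // Wrap the unsynchronized map so each get()/put() call holds the
    // wrapper's lock, preventing two threads from running inside the
    // underlying map at the same time.
    static <K, V> Map<K, V> wrap(Map<K, V> unsafe) {
        return Collections.synchronizedMap(unsafe);
    }

    // Have several threads put and get distinct keys through the
    // wrapped map, then return the final number of entries.
    static int hammer(int threads, int keysPerThread) {
        Map<String, Integer> allQueues = wrap(new HashMap<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < keysPerThread; i++) {
                    String key = "queue-" + id + "-" + i;
                    allQueues.put(key, i);
                    allQueues.get(key);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return allQueues.size();
    }
}
```

Note that the wrapper only makes individual calls atomic. A
sequence such as "get, then put if absent" would still need an
explicit synchronized block on the wrapped map.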


Attached File

No Files Currently Attached

Changes ( 5 )

Field              Old Value  Date              By
status_id          Open       2005-12-02 17:14  stack-sf
close_date         -          2005-12-02 17:14  stack-sf
artifact_group_id  None       2005-09-23 18:29  gojomo
assigned_to        gojomo     2005-05-31 23:15  gojomo
assigned_to        nobody     2005-05-26 01:32  gojomo