Heritrix: Internet Archive Web Crawler

Tracker: Bugs

5 Deadlock in Andy's 2nd Crawl - ID: 899019
Last Update: Comment added ( karl-ia )

Crawl in crawl18:/0/home/aboy/heritrix-0.4.1

This crawl eventually stopped making progress: the web
UI is responsive, but all ToeThreads are ready and
waiting for work while all Frontier classQueues are
SNOOZED.

There appears to have been a settings-related runtime
error at the time progress stopped. From heritrix_out.log:

209.237.240.32 - - [13/Feb/2004:17:48:15 +0000] "GET /admin/main.jsp HTTP/1.1" 200 6458
java.lang.ArrayIndexOutOfBoundsException: 0
    at org.archive.crawler.datamodel.settings.XMLSettingsHandler.scopeToFile(XMLSettingsHandler.java(Inlined Compiled Code))
    at org.archive.crawler.datamodel.settings.XMLSettingsHandler.readSettingsObject(XMLSettingsHandler.java(Compiled Code))
    at org.archive.crawler.datamodel.settings.SettingsHandler.getSettingsObject(SettingsHandler.java(Inlined Compiled Code))
    at org.archive.crawler.datamodel.settings.SettingsHandler.getSettings(SettingsHandler.java(Compiled Code))
    at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java(Compiled Code))
    at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java(Compiled Code))
    at org.archive.crawler.basic.Frontier.emitCuri(Frontier.java(Compiled Code))
    at org.archive.crawler.basic.Frontier.next(Frontier.java(Compiled Code))
    at org.archive.crawler.framework.CrawlController.run(CrawlController.java(Compiled Code))
209.237.240.32 - - [13/Feb/2004:18:01:20 +0000] "GET /admin/main.jsp HTTP/1.1" 200 6455


Submitted: Gordon Mohr ( gojomo ) - 2004-02-17 19:34
Priority: 5
Status: Closed
Resolution: Fixed
Assigned: John Erik Halse
Category: None
Group: None
Visibility: Public


Comments ( 2 )

Date: 2007-03-14 00:08
Sender: karl-ia


This issue is now discussed in the new JIRA tracker at
http://webteam.archive.org/jira/browse/HER-73 -- please add further
comments at that location.


Date: 2004-02-17 20:27
Sender: johnerik (Project Admin)

Logged In: YES
user_id=896276

This could occur if the hostname was, for some reason, just a
dot (".").
Fixed it to return the global settings in that case.
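
The failure mode described in this comment can be reproduced in isolation. What follows is a minimal sketch, not the actual Heritrix code: the class ScopeLookupSketch, its methods, the GLOBAL marker, and the reversed-domain path layout are all hypothetical and chosen only for illustration. It relies on one documented Java behaviour: String.split drops trailing empty strings, so splitting "." on dots yields a zero-length array, and indexing element 0 of that array throws ArrayIndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from Heritrix.
final class ScopeLookupSketch {
    // Assumed marker meaning "no per-host override, use the global settings".
    static final String GLOBAL = "<global>";

    // Unguarded lookup: assumes the host always has at least one label.
    // For the host "." the split result is empty, so [0] throws.
    static String unguardedFirstLabel(String host) {
        return host.split("\\.")[0];
    }

    // Guarded lookup: falls back to the global scope when the host
    // contains no usable labels (null, "", ".", "..", and so on).
    static String scopeFor(String host) {
        if (host == null) {
            return GLOBAL;
        }
        List<String> labels = new ArrayList<>();
        for (String label : host.split("\\.")) {
            if (!label.isEmpty()) {
                labels.add(label);
            }
        }
        if (labels.isEmpty()) {
            return GLOBAL;
        }
        // Assumed layout: most significant label first, e.g. org/archive/www.
        Collections.reverse(labels);
        return String.join("/", labels);
    }

    public static void main(String[] args) {
        System.out.println(scopeFor("www.archive.org")); // prints org/archive/www
        System.out.println(scopeFor("."));               // prints <global>
        try {
            unguardedFirstLabel(".");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded lookup failed for \".\"");
        }
    }
}
```

The guarded variant treats an unusable hostname as "no override" rather than as an error, which matches the fix described above: the caller still gets a settings object and the crawl loop is not interrupted by a runtime exception.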


Attached File

No Files Currently Attached

Changes ( 4 )

Field          Old Value  Date              By
status_id      Open       2004-02-17 20:27  johnerik
resolution_id  None       2004-02-17 20:27  johnerik
close_date     -          2004-02-17 20:27  johnerik
assigned_to    nobody     2004-02-17 19:41  gojomo