Heritrix: Internet Archive Web Crawler

Tracker: Bugs

5 Serious error during crawling did not produce an alert - ID: 965622
Last Update: Comment added ( karl-ia )

The following error (captured in the heritrix_out.log)
did not generate a UI alert as it should have.

Errors like this one may go undetected if no alert is
created:

---
2.6.2004 15:44:32 org.archive.util.DiskQueue enqueue
SEVERE: enqueue(CrawlURI(http://almenni-is-.teljari.is/teljari.php?eigandi=7923&sida=Ver%25f0m%25e6tasta+eignin))
#1 http://almenni.is/Default.aspx?nodeID=320 (0 attempts)
LL http://almenni.is/?nodeID=132
Current processor:
ACTIVE for 2s39ms
Where: ABOUT_TO_RETURN_URI

java.io.IOException: Interrupted system call
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.createNewFile(File.java(Inlined Compiled Code))
    at org.archive.io.DiskByteQueue$FlipFileInputStream.setupStreams(DiskByteQueue.java(Compiled Code))
    at org.archive.io.DiskByteQueue$FlipFileInputStream.<init>(DiskByteQueue.java(Inlined Compiled Code))
    at org.archive.io.DiskByteQueue.getHeadStream(DiskByteQueue.java(Inlined Compiled Code))
    at org.archive.util.DiskQueue.lazyInitialize(DiskQueue.java(Compiled Code))
    at org.archive.util.DiskQueue.enqueue(DiskQueue.java(Compiled Code))
    at org.archive.util.DiskBackedQueue.enqueue(DiskBackedQueue.java(Compiled Code))
    at org.archive.crawler.frontier.KeyedQueue.enqueueUnqueued(KeyedQueue.java(Compiled Code))
    at org.archive.crawler.frontier.KeyedQueue.enforceMemoryLoad(KeyedQueue.java(Inlined Compiled Code))
    at org.archive.crawler.frontier.KeyedQueue.enqueue(KeyedQueue.java(Compiled Code))
    at org.archive.crawler.frontier.Frontier.enqueueToKeyedInactive(Frontier.java(Inlined Compiled Code))
    at org.archive.crawler.frontier.Frontier.enqueueToKeyed(Frontier.java(Compiled Code))
    at org.archive.crawler.frontier.Frontier.innerSchedule(Frontier.java(Compiled Code))
    at org.archive.crawler.frontier.Frontier.innerBatchFlush(Frontier.java(Inlined Compiled Code))
    at org.archive.crawler.frontier.Frontier.finished(Frontier.java(Compiled Code))
    at org.archive.crawler.framework.ToeThread.run(ToeThread.java:143)
---


Submitted: Kristinn Sigurdsson ( kristinn_sig ) - 2004-06-03 09:32
Priority: 5
Status: Closed
Resolution: Fixed
Assigned To: Gordon Mohr
Category: General
Group: 1.6.0
Visibility: Public


Comments ( 5 )

Date: 2007-03-14 00:12
Sender: karl-ia


This issue is now discussed in the new JIRA tracker at
http://webteam.archive.org/jira/browse/HER-164 -- please add further
comments at that location.


Date: 2005-09-23 18:01
Sender: gojomo (Project Admin)

Logged In: YES
user_id=144912

I'm moving this from 'out-of-date' to 'fixed' because other
changes Stack has made in the alerting system now intercept
all SEVERE log events and make them alerts.
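
For illustration only, here is a minimal sketch of the idea described in this comment: a java.util.logging Handler that turns every SEVERE record into an alert. The class name SevereAlertHandler and the in-memory alert list are assumptions made for this example; this is not Heritrix's actual alerting code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

// Hypothetical handler: keeps every SEVERE record so a UI could show it as an alert.
class SevereAlertHandler extends Handler {
    private final List<String> alerts = new ArrayList<>();

    @Override
    public synchronized void publish(LogRecord record) {
        // Only SEVERE (or anything ranked above it) becomes an alert.
        if (record.getLevel().intValue() >= Level.SEVERE.intValue()) {
            alerts.add(record.getLoggerName() + ": " + record.getMessage());
        }
    }

    @Override
    public void flush() {
        // Nothing is buffered, so there is nothing to flush.
    }

    @Override
    public void close() {
        // No resources to release.
    }

    // Returns a copy so callers cannot modify the handler's internal list.
    synchronized List<String> getAlerts() {
        return new ArrayList<>(alerts);
    }
}
```

Attaching one instance to the root logger (Logger.getLogger("").addHandler(handler)) would let it see records from every logger that passes records up to its parent, which is the default.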


Date: 2005-06-23 13:33
Sender: kristinn_sig (Project Admin)

Logged In: YES
user_id=892643

The exact code that generated this error has now been
deprecated. While it is still useful to always create alerts
when serious errors occur, I know of no case where that
doesn't occur properly. Closing this as out of date.


Date: 2004-10-14 18:37
Sender: stack-sf (Project Admin)

Logged In: YES
user_id=924942

This note below was copied from '[ 1000929 ] fatal
runtimeexceptions in frontier give no info in web UI' --
https://sourceforge.net/tracker/index.php?func=detail&aid=1000929&group_id=73833&atid=539099
-- which was closed as a duplicate of this issue.

"The recent CCE problem was only evident in the stdout
log -- because it involved an uncaught runtime
exception (CCE) which triggered default behavior (dump
stack, end execution of that thread).

These should be caught and noted in the UI somehow.

(Additionally, looking at the Thread report gives no
indication that the threads are completely dead, rather
than just hung, because the state consulted for that
report is all still around. We probably need to add an
isAlive() check to that report -- and the general
thread-status assessment.)"
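
A rough sketch of the two suggestions in the quoted note, assuming a modern JVM: an uncaught-exception handler that records a thread's death, and a report line that uses isAlive() to tell a dead thread from a hung one. ThreadReport, describe, demo and the thread names are hypothetical and are not taken from the Heritrix source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class ThreadReport {
    // Deaths noted by the uncaught-exception handler (stand-in for a UI alert).
    static final List<String> deaths = Collections.synchronizedList(new ArrayList<String>());

    // One report line per thread: dead threads are called out explicitly.
    static String describe(Thread t) {
        if (t.getState() == Thread.State.NEW) {
            return t.getName() + ": not started";
        }
        if (!t.isAlive()) {
            return t.getName() + ": DEAD";
        }
        return t.getName() + ": alive (" + t.getState() + ")";
    }

    // Starts one thread that dies from a runtime exception and one that blocks.
    static List<String> demo() {
        CountDownLatch release = new CountDownLatch(1);
        Thread dead = new Thread(() -> {
            throw new ClassCastException("simulated CCE");
        }, "toe-1");
        dead.setUncaughtExceptionHandler(
            (t, e) -> deaths.add(t.getName() + " died: " + e));
        Thread hung = new Thread(() -> {
            try {
                release.await(); // blocks until the demo releases it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "toe-2");
        dead.start();
        hung.start();
        List<String> lines = new ArrayList<>();
        try {
            dead.join(); // wait until toe-1 has actually terminated
            lines.add(describe(dead));
            lines.add(describe(hung));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown(); // let toe-2 finish
        }
        return lines;
    }
}
```

Calling ThreadReport.demo() yields one line for the dead thread and one for the blocked thread, and ThreadReport.deaths holds the recorded exception.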


Date: 2004-06-03 22:11
Sender: gojomo (Project Admin)

Logged In: YES
user_id=144912

Generally, we have too many uses of e.printStackTrace(), or
local logger dumps of errors, without proper error handling,
risking unnoticed errors which pile up.

As a general rule that is incrementally better than "just
printing", we may wish to wrap caught declared exceptions in
a RuntimeException, and throw that. Then, at least, our
catchall handler will receive it, log it (and generate a UI
alert), and restart the current thread from a well-defined
point.
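
The rule above could look roughly like the following sketch. All names here (WrapAndRethrowSketch, enqueueToDisk, schedule, processAll, the alerts list) are made up for illustration, and the simulated failure condition is an assumption; the real frontier and ToeThread code is not shown in this report.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

class WrapAndRethrowSketch {
    static final Logger LOG = Logger.getLogger(WrapAndRethrowSketch.class.getName());
    // Stand-in for UI alerts.
    static final List<String> alerts = new ArrayList<>();

    // Stand-in for a disk operation; fails for URIs ending in "/fail".
    static void enqueueToDisk(String uri) throws IOException {
        if (uri.endsWith("/fail")) {
            throw new IOException("Interrupted system call");
        }
    }

    // Instead of e.printStackTrace(), wrap the declared exception and rethrow it
    // so it cannot be dropped silently.
    static void schedule(String uri) {
        try {
            enqueueToDisk(uri);
        } catch (IOException e) {
            throw new RuntimeException("enqueue failed for " + uri, e);
        }
    }

    // Catch-all at the top of the worker loop: log, raise an alert, and carry on
    // with the next URI from a well-defined point.
    static void processAll(List<String> uris) {
        for (String uri : uris) {
            try {
                schedule(uri);
            } catch (RuntimeException e) {
                LOG.log(Level.SEVERE, e.getMessage(), e);
                alerts.add(e.getMessage());
            }
        }
    }
}
```

The original IOException is kept as the cause, so the full stack trace still reaches the log.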


Changes ( 7 )

Field              Old Value    Date              By
artifact_group_id  None         2005-09-23 18:02  gojomo
resolution_id      Out of Date  2005-09-23 18:01  gojomo
close_date         -            2005-06-23 13:33  kristinn_sig
resolution_id      None         2005-06-23 13:33  kristinn_sig
status_id          Open         2005-06-23 13:33  kristinn_sig
priority           6            2004-07-07 21:13  gojomo
assigned_to        nobody       2004-06-17 00:23  gojomo