Heritrix: Internet Archive Web Crawler

Tracker: Bugs

ARCWriter alerts if Content-Type is null - ID: 1123906
Last Update: Comment added ( karl-ia )

ARCWriter raises a "severe" alert if a document's
Content-Type is null.
However, this is not uncommon (e.g. redirects without a
content body).

Please feel free to use the attached patch to correct
this (documents without a Content-Type are skipped, and
only a warning is written to the log file).
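
A minimal sketch of the behavior described above, not the attached patch itself: a null Content-Type leads to a skip plus a WARNING log entry rather than a severe alert. The class name `ContentTypeGuard`, the method `shouldWrite`, and its parameters are hypothetical and are not taken from the Heritrix source.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; the names here are illustrative, not Heritrix API.
final class ContentTypeGuard {
    private static final Logger LOGGER =
        Logger.getLogger(ContentTypeGuard.class.getName());

    // Returns false (and logs a warning) when the document has no
    // Content-Type, so the caller can skip writing it.
    static boolean shouldWrite(String uri, String contentType) {
        if (contentType == null) {
            // WARNING rather than SEVERE: a missing Content-Type is
            // expected for some responses, e.g. body-less redirects.
            LOGGER.log(Level.WARNING, "Skipping {0}: no Content-Type", uri);
            return false;
        }
        return true;
    }
}
```

A caller would check `ContentTypeGuard.shouldWrite(uri, contentType)` before handing the document to the writer and simply move on when it returns false.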


Christian Kohlschütter


Submitted: Christian Kohlschütter ( ck-heritrix ) - 2005-02-16 12:38
Priority: 5
Status: Closed
Resolution: Fixed
Assigned: Nobody/Anonymous
Category: General
Group: None
Visibility: Public


Comments ( 2 )

Date: 2007-03-14 00:21
Sender: karl-ia


This issue is now discussed in the new JIRA tracker at
http://webteam.archive.org/jira/browse/HER-364 -- please add further
comments at that location.


Date: 2005-02-17 23:11
Sender: stack-sf (Project Admin)

Thanks for the report.

I removed the test for spaces in Content-Type because
another test downstream already takes care of this case
(and the downstream method sets the mimetype to 'no-type' if
none is supplied).
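
The downstream fallback mentioned above could look roughly like the following sketch. The class `MimetypeDefaults` and the method `orNoType` are hypothetical names, and treating a whitespace-only value the same as a missing one is an assumption, not something confirmed by the Heritrix source.

```java
// Hypothetical helper; only the 'no-type' value comes from the comment above.
final class MimetypeDefaults {
    static final String NO_TYPE = "no-type";

    // Returns the trimmed content type, or 'no-type' when none is supplied.
    static String orNoType(String contentType) {
        if (contentType == null) {
            return NO_TYPE;
        }
        // Assumption: a blank value counts as "none supplied".
        String trimmed = contentType.trim();
        return trimmed.isEmpty() ? NO_TYPE : trimmed;
    }
}
```

With a fallback of this kind in place, an earlier null check does not need to reject the document, which fits the reason given for dropping the extra test.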

Closing.


Attached Files ( 2 )

Filename              Description
arcwriter-mimetype-1  Patch for ARCWriterProcessor
arcwriter-mimetype-2  Minor patch for ARCWriter

Changes ( 5 )

Field          Old Value                     Date              By
status_id      Open                          2005-02-17 23:11  stack-sf
resolution_id  None                          2005-02-17 23:11  stack-sf
close_date     -                             2005-02-17 23:11  stack-sf
File Added     120267: arcwriter-mimetype-2  2005-02-16 13:28  ck-heritrix
File Added     120264: arcwriter-mimetype-1  2005-02-16 12:38  ck-heritrix