In heritrix_out.log:
11/11/2004 14:32:42 -0800 WARNING org.archive.util.DevUtils warnHandle Failed to get class key: Illegal domain label: $library CrawlURI(dns:$library)
org.apache.commons.httpclient.URIException: Illegal domain label: $library
    at org.archive.crawler.datamodel.UURIFactory.checkDomainlabel(UURIFactory.java:480)
    at org.archive.crawler.datamodel.UURIFactory.fixup(UURIFactory.java:377)
    at org.archive.crawler.datamodel.UURIFactory.create(UURIFactory.java:254)
    at org.archive.crawler.datamodel.UURIFactory.create(UURIFactory.java:244)
    at org.archive.crawler.datamodel.UURIFactory.getInstance(UURIFactory.java:213)
    at org.archive.crawler.datamodel.CrawlURI.calculateClassKey(CrawlURI.java:396)
Here is what was in the recovery log that was responsible:
Fs http://fdab.gsfc.nasa.gov/live/$library/carpenter_russell.jpg
...
F+ dns:$library LLLLLXLLLXELLLXRLLLEP http://$library/carpenter_russell.jpg
Fe dns:$library
We should be failing way earlier on stuff like '$library'.
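To illustrate what "failing way earlier" could look like, here is a minimal sketch of a host check that rejects a label such as '$library' before any dns: URI is built from it. The class DomainLabelCheck and its methods are hypothetical and are not part of Heritrix or commons-httpclient. The rule it applies (letters, digits and hyphens only, 1 to 63 characters per label, no leading or trailing hyphen) is an assumption about what counts as a legal label and may be stricter or looser than what UURIFactory.checkDomainlabel enforces.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not Heritrix code: rejects hosts with illegal labels up front.
public class DomainLabelCheck {
    // Assumed rule: letters, digits, hyphen; 1-63 chars; no leading/trailing hyphen.
    private static final Pattern LABEL =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    public static boolean isLegalLabel(String label) {
        return label != null && LABEL.matcher(label).matches();
    }

    public static boolean isLegalHost(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // Limit -1 keeps empty labels, so "a..b" and a trailing dot are rejected.
        for (String label : host.split("\\.", -1)) {
            if (!isLegalLabel(label)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegalHost("fdab.gsfc.nasa.gov")); // true
        System.out.println(isLegalHost("$library"));           // false
    }
}
```

A check like this would have to run when a discovered link is first turned into a candidate URI, so that http://$library/carpenter_russell.jpg is dropped before a dns:$library prerequisite is ever scheduled. Where exactly that hook belongs in the crawler is not something this sketch settles.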
Gordon Mohr
uri
1.6.0
Public
Date: 2007-03-14 00:18
Date: 2005-09-21 23:00 Logged In: YES
Date: 2005-09-21 22:47 Logged In: YES
| Field | Old Value | Date | By |
|---|---|---|---|
| artifact_group_id | None | 2005-09-23 18:01 | gojomo |
| status_id | Open | 2005-09-21 23:00 | gojomo |
| resolution_id | None | 2005-09-21 23:00 | gojomo |
| close_date | - | 2005-09-21 23:00 | gojomo |
| assigned_to | stack-sf | 2005-09-21 22:47 | gojomo |
| summary | [uuri] '$' in path does us in. | 2004-12-02 21:23 | gojomo |