From: Oskar G. <Osk...@kb...> - 2008-02-15 11:03:42
Hi!

I've downloaded WB 1.2.0 (which is excellent, by the way) and got it working right away with arc-files. But when I later turned my attention to warc-files (downloaded with Heritrix 1.12.1), I couldn't get it to work at first. When I used "warc-indexer" to create cdx-files, it just spat out a NullPointerException and crashed.

After some browsing through the code and testing, I found that the cause was that the method getUrl() in ArchiveRecordHeader in the jar-file commons-2.0.1-SNAPSHOT.jar ALWAYS returned null. I haven't looked into why that is, but it caused line 300 in WARCRecordToSearchResultAdapter.java ( String origHost = uriStr.substring(WaybackConstants.DNS_URL_PREFIX.length()); ) to throw the NPE.

The solution for me was to replace "commons-2.0.1-SNAPSHOT.jar" with "heritrix.1.12.1.jar", and then it worked fine.

Best regards,
Oskar
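
For anyone hitting the same NPE: the failure mode described above (calling substring() on a URL string that is null) can be reproduced and guarded against in isolation. The following is only a minimal sketch, not the actual Wayback code. The class name DnsHostSketch, the method extractDnsHost, and the prefix value "dns:" are assumptions made for illustration; the real constant is WaybackConstants.DNS_URL_PREFIX.

```java
// Hypothetical stand-in for the failing line; NOT the actual Wayback source.
final class DnsHostSketch {
    // Assumed value for illustration; the real code uses WaybackConstants.DNS_URL_PREFIX.
    static final String DNS_URL_PREFIX = "dns:";

    // Returns the host part of a "dns:" URL, or null when the record
    // header supplied no URL (or a URL that is not a dns: URL).
    static String extractDnsHost(String uriStr) {
        if (uriStr == null || !uriStr.startsWith(DNS_URL_PREFIX)) {
            return null; // lets the caller skip the record instead of crashing
        }
        return uriStr.substring(DNS_URL_PREFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(extractDnsHost("dns:example.org")); // prints "example.org"
        System.out.println(extractDnsHost(null));              // prints "null"
    }
}
```

A guard like this only avoids the crash. If getUrl() returns null for every record, the resulting index would still be missing its URLs, so swapping the jar as described above remains the actual fix.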