From: Gérard D. <ger...@gm...> - 2011-05-13 13:55:48
Hi all,

I'm facing a new problem while setting up Wayback on a set of WARCs created by Heritrix. Basically, everything goes fine: the webapp deploys correctly and it picks up the new WARCs as they are closed. However, something goes wrong during the "copy/merge" step: the files I can see in the Wayback temp folder are only ~222 bytes, whereas the original ones were about 1 MB.

The only warning I can see in the logs is the following:

INFO: Renamed merged file /tmp/wayback/index-data/incoming/WEB-20110420084401772-00000-3811~weblab9~8888.warc.gz to /tmp/wayback/index-data/merged/WEB-20110420084401772-00000-3811~weblab9~8888.warc.gz
May 13, 2011 2:52:15 PM org.archive.wayback.resourceindex.bdb.SearchResultToBDBRecordAdapter adapt
WARNING: FAILED canonicalize(http://filedesc:WEB-20110502133909048-00000-5501 ~weblab9~8888.warc.gz:WEB-20110502133909048-00000-5501~weblab9~8888.warc.gz)

If anyone has already faced this problem, help is welcome.

Cheers,

--
Gérard Dupont
Information Processing Control and Cognition (IPCC)
CASSIDIAN - an EADS company
Document & Learning team - LITIS Laboratory