From: Gerhardt, M. <Mat...@sb...> - 2007-12-06 15:56:32
Hi @all,

I'd like to run NutchWAX together with WERA to search through my ARC files, which were created with Heritrix.

The crawl that produced the ARC files looks fine, in my opinion. The indexing with Hadoop and nutchwax.war looks fine, too: the indexes and the segments for Nutch are created. A first search with NutchWAX also works, and I get the first results from the index. But when I try to go further and open the full text of a result, I am rewarded with an error message:

HTTP Status 404 - /blubb/20071203121503/http://crossasia.org/en/home/
type Status report
message /blubb/20071203121503/http://crossasia.org/en/home/
description The requested resource (/blubb/20071203121503/http://crossasia.org/en/home/) is not available.

You can have a look at http://ogea.crossasia.org:8080/nutchwax/ to get a feeling for it.

When I try to get information from WERA I get similar results: the result list is OK, but the archived version of a page is not shown at all. I just get a blank page. If I try to get the metadata for the chosen site at the chosen point in the timeline, I get an error: Failed to open stream.

Have a look here:
http://ogea.crossasia.org/wera/result.php?auto=on&meta=on&query=crossasia&url=http%3A%2F%2Fcrossasia.org%2Fen%2Fhome%2F&time=20071203111503&level=6&autolevel=5&manlevel=5&autocheckbox=1&metacheckbox=1

What kind of misconfiguration could I have made? Can anyone out there help me?

Kind regards,

Matthias Gerhardt
___________________________________________

Technical Director, Virtuelle Fachbibliothek Ostasien

Staatsbibliothek zu Berlin - Preußischer Kulturbesitz
10772 Berlin, Germany
Phone: +49(0)30-266-2496
E-Mail: mat...@sb...
___________________________________________