From:
<mat...@ce...> - 2005-12-13 16:34:00
Hi,

we still have a few problems with the output from NutchWAX, and I'd like to ask you for a little help.

The XML output from the NutchWAX servlet

http://war.mzk.cz:8080/nutchwax/opensearch?query=gradu%C3%A1l+louck%C3%BD&hitsPerDup=1&hitsPerPage=10

contains HTML entities, for example in "Graduál". Is that right? Should NutchWAX return XML with HTML entities instead of characters in UTF-8 (like gradu%C3%A1l)?

Also, what is the difference between these two cases?

1) http://war.mzk.cz:8080/nutchwax/opensearch?query=gradu%C3%A1l+louck%C3%BD&start=0&hitsPerDup=0&hitsPerPage=10&dedupField=exacturl
   -> the output is not valid XML (called from WERA)

2) http://war.mzk.cz:8080/nutchwax/opensearch?query=gradu%C3%A1l%20louck%C3%BD&start=0&hitsPerPage=10&hitsPerDup=1&dedupField=exacturl
   -> the output is valid XML (called from the NutchWAX search.jsp)

Another thing that confuses me :) The characters in the "title" element are displayed correctly, but the text in the "description" element consists of HTML entities (as I described above).

Thanks for any help,
lukas
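
The two requests above can be compared mechanically. The following is a minimal Python sketch that assumes nothing about NutchWAX internals: it only decodes the two query strings from the message and checks how a generic XML parser treats numeric character references versus named HTML entities. The entity spellings `&#225;` and `&aacute;` are assumptions for illustration, since the message shows only the rendered character, and the `<description>` snippets are hypothetical stand-ins for the real servlet output.

```python
from urllib.parse import parse_qs
import xml.etree.ElementTree as ET

# Query strings copied from cases 1) and 2) in the message.
q1 = ("query=gradu%C3%A1l+louck%C3%BD&start=0&hitsPerDup=0"
      "&hitsPerPage=10&dedupField=exacturl")
q2 = ("query=gradu%C3%A1l%20louck%C3%BD&start=0&hitsPerPage=10"
      "&hitsPerDup=1&dedupField=exacturl")

p1 = parse_qs(q1)
p2 = parse_qs(q2)

# '+' and '%20' both decode to a space, so the query text is identical.
same_query = p1["query"] == p2["query"]

# Names of the parameters whose values differ between the two requests.
differing = sorted(k for k in p1 if p1[k] != p2.get(k))


def is_well_formed(xml_text):
    """Return True if a generic XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments, not actual NutchWAX output.
numeric_ok = is_well_formed("<description>Gradu&#225;l</description>")
named_ok = is_well_formed("<description>Gradu&aacute;l</description>")
utf8_ok = is_well_formed("<description>Graduál</description>")

print(same_query, differing, numeric_ok, named_ok, utf8_ok)
```

Under these assumptions, the only decoded difference between the two requests is the value of hitsPerDup, and a named HTML entity such as `&aacute;` is rejected by an XML parser unless a DTD declares it, while a numeric character reference or the raw UTF-8 character is accepted. Whether either of these explains the invalid output seen from WERA is not something the sketch can confirm.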