From: Kristinn S. <kr...@ar...> - 2005-11-03 09:04:18
Looks like you managed to fix the problem on your end.

The issue of 0/0 versions is tied to the incorrect coding of characters in the XML (namely & being double escaped to &amp;amp; rather than just &amp;). This causes any URIs that contain & (or other special characters like < and >) to show up as 0/0 versions.

Any idea what fixed the problem?

- Kris

> -----Original Message-----
> From: arc...@li...
> [mailto:arc...@li...]
> On Behalf Of Lukáš Matějka
> Sent: 3 November 2005 08:30
> To: Sve...@nb...
> Cc: arc...@li...
> Subject: RE: [Archive-access-discuss] wera results
>
> ______________________________________________________________
> > From: Sve...@nb...
> > To: mat...@ce...
> > CC:
> > Date: 02.11.2005 19:41
> > Subject: RE: [Archive-access-discuss] wera results
> >
> > I tried the latest opensearch servlet myself. It messed up my Wera, lots
> > of 0/0 ...
> >
> > ;-)
>
> now, i'm using what you sent to me... and everything seems fine...
> i can't find any 0/0 :)
>
> i will test it more :)
>
> -lm
>
> >
> > Sverre
> >
> >
> > -----Original Message-----
> > From: Lukáš Matějka [mailto:mat...@ce...]
> > Sent: Wed 11/2/2005 4:43 PM
> > To: Sverre Bang
> > Cc: arc...@li...
> > Subject: RE: [Archive-access-discuss] wera results
> >
> > ______________________________________________________________
> > > From: sve...@nb...
> > > To: arc...@li...
> > > CC:
> > > Date: 02.11.2005 14:33
> > > Subject: RE: [Archive-access-discuss] wera results
> > >
> > > Hi there,
> > > Definitely something wrong in NutchWax. If i execute
> > >
> > > http://war.mzk.cz/~nwa/wera/wera/index.php?query=kniha&year_from=&year_to=
> > > and click the timeline link of the first hit showing 0/0 hits i get
> >
> > where did you find hit showing 0/0?
> > it works fine for me (i've just explored 150 urls.. and no 0/0 hits)
> > did you remember number of total hits? (if it's same - i experimented with
> > previous version of nutchwax, starting tomcat on various instances)
> >
> > i had for word "kniha"
> > Total number of versions found : 49087. Displaying URL's 1-10
> >
> > -lm
> >
> > > 'Sorry, no documents with the given uri were found'. The url displayed
> > > seems fine, but if you look in the source of the uppermost frame you
> > > will see that the url sent to the script was
> > > http://full.nkp.cz/nkdb/rejstriky/rejstrik.asp?irj=12&amp;start=V.
> > > The & separating the parameters irj and start has been replaced by its
> > > html character entity reference.
> > >
> > > If i press the go button now the url submitted to the script will be ok.
> > >
> > > If i look in the NutchWax result set of the initial search (add &debug=1
> > > to the search url to bring out the NutchWax search urls) i see that the
> > > url (link element) returned is wrong already here.
> > >
> > > Conclusion : NutchWax mangles the url returned by introducing html
> > > entities instead of keeping the url in its original form.
> > >
> > > What version of NutchWax are you using?
> > >
> > > Sverre
> > >
> > > On Wed, 2005-11-02 at 12:41 +0000, Kristinn Sigurdsson wrote:
> > > > This looks like the same (or very similar) problem as I've got. I've
> > > > been discussing it (offlist) with Stack and Sverre Bang, so I know it is
> > > > being looked into.
> > > >
> > > > I notice in your search results (as in mine) that URIs with & in them
> > > > are showing up as 0/0 versions. I believe that both problems are due to
> > > > the escaping (or unescaping) of HTML characters in the NutchWAX XML that
> > > > is used to pass the results to WERA.
> > > >
> > > > Possibly this is a misconfiguration of either Tomcat or Apache...?
> > > >
> > > > - Kris
> > > >
> > > > > -----Original Message-----
> > > > > From: arc...@li...
> > > > > [mailto:arc...@li...]
> > > > > On Behalf Of Lukáš Matějka
> > > > > Sent: 2 November 2005 11:21
> > > > > To: arc...@li...
> > > > > Subject: [Archive-access-discuss] wera results
> > > > >
> > > > > Hi,
> > > > >
> > > > > for example
> > > > > http://war.mzk.cz/~nwa/wera/wera/index.php?query=kniha&year_from=&year_to=
> > > > >
> > > > > description of each record is not well-displayed
> > > > >
> > > > > 1. SKIP, Moje kniha (http://skip.nkp.cz/akcMojekn.htm)
> > > > > (<b> ... </b>prístupu k internetu v knihovnách propagovat vyuzití
> > > > > internetu pri zjistování názoru obyvatel 2. Anketa Pomocí krátké
> > > > > ankety bude zjistována nejoblíbenejsí <b>kniha</b> obyvatel Ceské
> > > > > republiky. Pojem nejoblíbenejsí <b>kniha</b> je specifikován dalsími
> > > > > výklady, jako "<b>kniha</b>, která me nejvíce ovlivnila",
> > > > > "<b>kniha</b>, ke které se casto vracím", "<b>kniha</b>, kterou bych
> > > > > doporucil/a dobrým prátelum", "<b>kniha</b>, která zmenila muj
> > > > > zivot", "<b>kniha</b> na kterou nemohu zapomenout", "<b>kniha</b>,
> > > > > která mne uvedla do jiného sveta", "<b>kniha</b>, kterou bych si s
> > > > > sebou vzal/a jako jedinou<b> ... </b>)
> > > > > Versions (matching query/total) 3/3
> > > > > Timeline | Overview
> > > > >
> > > > > "prístupu" should be "přístupu" (without diacritics "pristupu")
> > > > >
> > > > > does anybody have same problem?
> > > > >
> > > > > -lm
> > > > >
> > > > > -------------------------------------------------------
> > > > > SF.Net email is sponsored by:
> > > > > Tame your development challenges with Apache's Geronimo App Server.
> > > > > Download it for free - and be entered to win a 42" plasma tv or your
> > > > > very own Sony(tm)PSP. Click here to play: http://sourceforge.net/geronimo.php
> > > > > _______________________________________________
> > > > > Archive-access-discuss mailing list
> > > > > Arc...@li...
> > > > > https://lists.sourceforge.net/lists/listinfo/archive-access-discuss
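
For anyone who wants to reproduce the symptom discussed in this thread, here is a minimal sketch in Python of how escaping & twice leaves a stray entity in the URL after the XML is parsed. It is not NutchWAX or WERA code: the function name parse_link and the bare <link> wrapper are made up for illustration, and only the example URL and the "link element" wording come from the messages above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# URL taken from the thread; everything else here is illustrative.
RAW_URL = "http://full.nkp.cz/nkdb/rejstriky/rejstrik.asp?irj=12&start=V"


def parse_link(escaped_url: str) -> str:
    """Wrap already-escaped text in a hypothetical <link> element and
    return what an XML parser hands back to the consumer."""
    return ET.fromstring(f"<link>{escaped_url}</link>").text


# Escaping once is what XML requires: '&' becomes '&amp;'.
escaped_once = escape(RAW_URL)

# Escaping the already-escaped text again turns '&amp;' into '&amp;amp;'.
escaped_twice = escape(escaped_once)

# The parser undoes exactly one level of escaping.
url_from_single = parse_link(escaped_once)
url_from_double = parse_link(escaped_twice)

print(url_from_single)  # the original URL, usable as a lookup key
print(url_from_double)  # still contains '&amp;', so a lookup would not match
```

If the consumer looks up versions by exact URL, the second value would not match the stored one, which would be consistent with the 0/0 counts reported here; whether that is what actually happened in the servlet is an assumption.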