From: Sverre B. <sve...@nb...> - 2006-01-19 09:02:15
Hi,

What version of WERA+NutchWAX are you using?

Sverre

On Wed, 2006-01-18 at 15:56 -0800, SourceForge.net wrote:
> Bugs item #1409045, was opened at 2006-01-18 14:55
> Message generated for change (Comment added) made by dachs_remy
> You can respond by visiting:
> https://sourceforge.net/tracker/?func=detail&atid=681137&aid=1409045&group_id=118427
>
> Please note that this message will contain a full copy of the comment thread,
> including the initial issue submission, for this request,
> not just the latest update.
> Category: wera
> Group: None
> Status: Open
> Resolution: None
> Priority: 5
> Submitted By: Sverre Bang (sverreb)
> Assigned to: Sverre Bang (sverreb)
> Summary: [wera] Fragment identifiers in URI's not handled correctly
>
> Initial Comment:
> Some URIs refer to a location within a resource. This
> kind of URI ends with "#" followed by an anchor
> identifier (fragment identifier). Wera does not handle
> this correctly. See
> http://nwa.nb.no/wera/result.php?time=&url=http%3A%2F%2Fwww.lib.helsinki.fi%2Fsibelius%2F
> for an example (try one of the links in the displayed page)
>
> ----------------------------------------------------------------------
>
> Comment By: Remy Cristini (dachs_remy)
> Date: 2006-01-19 00:56
>
> Message:
> Logged In: YES
> user_id=1430678
>
> I just got WERA running today (great!!) and encountered
> (what seems to be) the same problem: when browsing through
> different pages within an archived website, I get "Sorry,
> no documents with the given uri were found" when the URI
> ends with an identifier containing "&". It seems that the
> exacturl request from WERA to NutchWAX loses the last part
> of the uri when browsing (not when searching for the exact
> uri!).
>
> Example:
>
> I click on a link and the URL in the address bar shows:
>
> http://localhost/wera/result.php?time=20060116091645&mode=standalone&url=http://www.someserver.org/show.aspx?id=126&cid=5
>
> However, the uri in the search field on the "sorry..." page
> just shows:
>
> http://www.someserver.org/show.aspx?id=126
>
> Of course when I add the &cid=5 part at the end of the
> search string, the requested page shows without a problem.
> Now the address bar shows:
>
> http://localhost/wera/result.php?url=http%3A%2F%2Fwww.someserver.org%2Fshow.aspx%3Fid%3D126%26cid%3D5&level=6&time=20060116091645
>
> So there seems to be a difference in how a uri is requested
> from the NutchWAX index, depending on whether it's typed
> directly into the search field or when you browse from one
> page to the other: it will then lose the last identifier.
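
The two address-bar URLs in the report differ in one respect: the working one carries the target URI percent-encoded, the failing one carries it raw. That would be consistent with the rewritten links embedding the target URI in the result.php query string without encoding it, so that "&" starts a new parameter and "#" starts a fragment. This is only a guess at the cause, not a reading of the WERA source. A minimal Python sketch of the effect follows; the helper names `received_url_param` and `build_link` and the anchor `#section` are made up for illustration and are not part of WERA or NutchWAX.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://localhost/wera/result.php"  # address taken from the report


def received_url_param(request_url: str) -> str:
    """Hypothetical helper: the 'url' parameter as a query-string parser sees it."""
    return parse_qs(urlsplit(request_url).query)["url"][0]


def build_link(target: str, timestamp: str, encode: bool) -> str:
    """Hypothetical link builder; encode=False mimics the suspected bug."""
    if encode:
        # urlencode percent-encodes '&', '?', '=', '#', ':' and '/' in the value.
        return BASE + "?" + urlencode({"time": timestamp, "url": target})
    return f"{BASE}?time={timestamp}&url={target}"


# Case from the comment: the target URI has a second query parameter.
target = "http://www.someserver.org/show.aspx?id=126&cid=5"
raw_link = build_link(target, "20060116091645", encode=False)
safe_link = build_link(target, "20060116091645", encode=True)

print(received_url_param(raw_link))   # '&cid=5' is parsed as a separate parameter
print(received_url_param(safe_link))  # full target URI survives

# Case from the initial report: the target URI ends in a fragment identifier.
anchored = "http://www.lib.helsinki.fi/sibelius/#section"  # '#section' is invented
raw_anchor = build_link(anchored, "", encode=False)
safe_anchor = build_link(anchored, "", encode=True)

print(received_url_param(raw_anchor))   # fragment is split off the request URL
print(received_url_param(safe_anchor))  # fragment stays inside the parameter
```

If this is indeed what happens, the first case reproduces the truncated "show.aspx?id=126" from the "sorry..." page, and encoding the target before building the link would address both the "&" and the "#" variants.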