From: Gilles D. <gr...@sc...> - 2003-10-28 00:06:18
I've pretty much run out of time to help out with this release, but before
I leave you, I thought I'd submit the following patch for your testing and
approval. It should fix the duplicate URL problem with htsearch collections
reported in bug #504087. I'm not sure what sort of performance impact it
will have on large databases with lots of potential matches, though, so
that is worth testing before release.
Cheers,
Gilles
--- htsearch/Display.cc.orig 2003-10-25 07:40:23.000000000 -0500
+++ htsearch/Display.cc 2003-10-27 17:55:52.000000000 -0600
@@ -1895,6 +1895,27 @@ Display::sort(List *matches)
     qsort((char *) array, numberOfMatches, sizeof(ResultMatch *),
           array[0]->getSortFun());
 
+    // In case there are duplicate URLs across collections, keep "best" ones
+    // after sorting them.
+    Dictionary goturl;
+    String url;
+    int j = 0;
+    for (i = 0; i < numberOfMatches; i++)
+    {
+        Collection *collection = array[i]->getCollection();
+        DocumentRef *ref = collection->getDocumentRef(array[i]->getID());
+        url = ref->DocURL();
+        HtURLRewriter::instance()->replace(url);
+        if (goturl.Exists(url))
+            delete array[i];
+        else
+        {
+            array[j++] = array[i];
+            goturl.Add(url, 0);
+        }
+    }
+    numberOfMatches = j;
+
     const String st = config->Find("sort");
     if (!st.empty() && mystrncasecmp("rev", st, 3) == 0)
     {
--
Gilles R. Detillieux E-mail: <gr...@sc...>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba Winnipeg, MB R3E 3J7 (Canada)
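
For anyone who wants to experiment with the idea outside of htsearch, here
is a minimal standalone sketch of the same technique: once the matches are
sorted, walk them in order, keep the first occurrence of each URL, and
compact the survivors toward the front. It is not ht://Dig code. The Match
struct and dropDuplicateUrls() are hypothetical names made up for this
illustration, and std::unordered_set stands in for the Dictionary used in
the patch.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a search result; not ht://Dig's ResultMatch.
struct Match {
    std::string url;
    double score;
};

// Assumes `matches` is already sorted so that the preferred entry for any
// URL comes first. Keeps the first occurrence of each URL, preserves the
// relative order of the survivors, and returns how many entries were dropped.
std::size_t dropDuplicateUrls(std::vector<Match> &matches)
{
    std::unordered_set<std::string> seen;
    std::size_t kept = 0;

    for (std::size_t i = 0; i < matches.size(); ++i) {
        // insert() reports via .second whether the URL was new.
        if (seen.insert(matches[i].url).second) {
            if (kept != i)
                matches[kept] = std::move(matches[i]);
            ++kept;
        }
    }

    const std::size_t removed = matches.size() - kept;
    matches.erase(matches.begin() + kept, matches.end());
    return removed;
}
```

Calling dropDuplicateUrls() on a vector sorted by descending score leaves
one entry per URL, namely the highest-scoring one. The sketch costs one
hash lookup per match, but it does not model the per-match document lookup
or the URL rewriting that the patch performs, so it says nothing about how
the patch behaves on large databases.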