#246 Suggestion for display of excerpts

open
htsearch (60)
5
2004-11-26
2004-11-22
Anonymous
No

We have the max_excerpts configuration variable set
to 2. We are using ht://Dig 3.2.0b5 on a Mac OS X
Server system.

Often when an excerpt is displayed for a search result,
it is followed by another excerpt for the same page
that is essentially the same text. The second excerpt
starts a word or two later than the first excerpt
and ends a word or two beyond the text of the first.
The result is that the same information is displayed in
both excerpts. One note is that the excerpts in this
situation seem to have more than one occurrence of a
highlighted search keyword.

Suggestion:
To find a second excerpt, begin the search for new text
beyond the last excerpt's displayed text.

To see an example of what is happening, browse to this url:
http://www.ipm.ucdavis.edu/GENERAL/search.html
and enter the search keywords: russian thistle

Thanks,

Buz Dreyer, Statewide IPM Program, University of California
wfdreyer@ucdavis.edu

Discussion

    • labels: --> htsearch
    • assigned_to: nobody --> grdetil
     
  • Logged In: YES
    user_id=149687

    I haven't tested this yet, but the patch below should fix
    the problem you report. It starts the search for the next
    word after the current excerpt, rather than after the
    current word. There may still be some overlap in the
    excerpts, but likely a lot less than before. Please give it
    a try and let us know.

    --- htsearch/Display.cc.orig 2004-06-19 07:40:42.000000000 -0500
    +++ htsearch/Display.cc 2004-11-26 15:26:29.000000000 -0600
    @@ -1897,7 +1897,7 @@ Display::buildExcerpts( StringMatch *all
    }

    // No more words left to examine in head
    - if ( (lastPos = curPos + termLength) > headLength )
    + if ( (lastPos = end+1 - head) > headLength )
    break;
    }
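
To illustrate the difference the patch is aiming for, here is a minimal standalone sketch. It is not ht://Dig code: the function buildExcerptsSketch, its parameters, and the fixed-width character window are all assumptions made for illustration, and it matches a single literal term rather than using StringMatch. With resumeAfterExcerpt set to false the next search resumes right after the matched word (the behaviour described in the report); with true it resumes just past the end of the excerpt, as the patched line does with end+1 - head.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of ht://Dig): collect up to maxExcerpts
// windows of `context` characters on each side of a match of `term`.
std::vector<std::string> buildExcerptsSketch(const std::string &head,
                                             const std::string &term,
                                             std::size_t context,
                                             std::size_t maxExcerpts,
                                             bool resumeAfterExcerpt)
{
    std::vector<std::string> excerpts;
    std::size_t lastPos = 0;

    while (excerpts.size() < maxExcerpts && lastPos < head.size()) {
        std::size_t curPos = head.find(term, lastPos);
        if (curPos == std::string::npos)
            break;                      // no more matches in head

        // Excerpt window [start, end) around the match, clamped to head.
        std::size_t start = curPos > context ? curPos - context : 0;
        std::size_t end = std::min(head.size(),
                                   curPos + term.size() + context);
        excerpts.push_back(head.substr(start, end - start));

        // Old behaviour: resume after the matched word, so a nearby second
        // match yields a nearly identical excerpt.
        // Patched behaviour: resume after the excerpt just emitted.
        lastPos = resumeAfterExcerpt ? end : curPos + term.size();
    }
    return excerpts;
}
```

Note that even when resuming after the excerpt, the next window still reaches back `context` characters before its match, so some overlap remains possible when the next match falls close to the end of the previous excerpt. That is consistent with the caveat in the comment above.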