#246 Suggestion for display of excerpts

htsearch (60)

We have the max_excerpts configuration variable set
to 2. We are using ht://Dig 3.2.0b5 on a Mac OS X
Server system.

Often when an excerpt is displayed for a search result,
it is followed by another excerpt for the same page
that is essentially the same text. The second excerpt
starts a word or two further than the first excerpt,
and ends a word or two beyond the text of the first.
The result is that the same information is displayed in
both excerpts. One note is that the excerpts in this
situation seem to have more than one occurrence of a
highlighted search keyword.

To find a second excerpt, begin the search for new text
beyond the last excerpt's displayed text.
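
The idea can be sketched in a few lines of C++. This is only an illustration of the suggestion, not ht://Dig code: the Excerpt struct, the findExcerpts function, and the character-based context parameter are all hypothetical names made up for this example, and it matches a single literal term rather than using htsearch's own word matching.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type for this sketch (not from ht://Dig):
// a half-open character range [begin, end) within the document text.
struct Excerpt {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper: collect up to maxExcerpts windows around
// occurrences of `term`, with `context` characters on each side.
// Each new search starts at the end of the previous excerpt, so a
// keyword already shown inside an excerpt cannot seed another one.
std::vector<Excerpt> findExcerpts(const std::string &head,
                                  const std::string &term,
                                  std::size_t context,
                                  std::size_t maxExcerpts)
{
    std::vector<Excerpt> result;
    if (term.empty())
        return result;

    std::size_t searchFrom = 0;
    while (result.size() < maxExcerpts) {
        std::size_t pos = head.find(term, searchFrom);
        if (pos == std::string::npos)
            break;  // no more matches left in the text

        // Leading context, clamped so it never reaches back into the
        // previous excerpt (an assumption of this sketch).
        std::size_t begin =
            std::max(searchFrom, pos > context ? pos - context : 0);
        std::size_t end =
            std::min(head.size(), pos + term.size() + context);
        result.push_back({begin, end});

        // Resume after the excerpt, not just after the matched word.
        searchFrom = end;
    }
    return result;
}
```

If searchFrom were advanced only to pos + term.size(), a second keyword close to the first would produce a second window covering nearly the same text, which is the behavior described above.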

To see an example of what is happening, browse to this url:
and enter the search keywords: russian thistle


Buz Dreyer, Statewide IPM Program, University of California


  • Gilles Detillieux

    • labels: --> htsearch
    • assigned_to: nobody --> grdetil
  • Gilles Detillieux

    I haven't tested this yet, but the patch below should fix
    the problem you report. It starts the search for the next
    word after the current excerpt, rather than after the
    current word. There may still be some overlap in the
    excerpts, but likely a lot less than before. Please give it
    a try and let us know.

    --- htsearch/Display.cc.orig 2004-06-19 07:40:42.000000000 -0500
    +++ htsearch/Display.cc 2004-11-26 15:26:29.000000000 -0600
    @@ -1897,7 +1897,7 @@ Display::buildExcerpts( StringMatch *all

    // No more words left to examine in head
    - if ( (lastPos = curPos + termLength) > headLength )
    + if ( (lastPos = end+1 - head) > headLength )

