#191 no word break inserted when noindex_start/end text removed

Milestone: Include_in_3.2
Status: closed-wont-fix
Owner: nobody
Labels: htdig (103)
Priority: 5
Updated: 2003-10-26
Created: 2003-10-24
Private: No

Subject: Re: [htdig] Masking out template content with [noindex_start] and [noindex_end]

On Fri, Jul 4, 2003, Andreas.Mueller@bbw.admin.ch wrote:
> We use ht://Dig as search engine. To make sure that this (to searches
> irrelevant) template content is not indexed we have made heavy use of
> [noindex_start] and [noindex_end] -- in our case we used
> '<!-- htdig_noindex_end -->' and '<!-- htdig_noindex_start -->'.
>
> Now I found out that htdig does not seem to consider these tags as white
> spaces forming separate words.

The fix is to patch htdig/HTML.cc to add a space after
stripping out the noindex_start ... noindex_end section. See
ftp://ftp.ccsf.org/htdig-patches/3.1.6/masking_noindex.0 for
the 3.1.6 fix. A similar approach would do the trick in 3.2.

This is a minor problem, reported only once, because these
tags are usually each placed on their own line, so the
whitespace is already there most of the time. However, it's
an easy fix. The question is whether some users will ever
expect or count on the opposite behaviour, i.e. that no space
be inserted.
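
To make the two behaviours concrete, here is a minimal Python sketch. It is
not the htdig/HTML.cc code and not the 3.1.6 patch: the helper names
strip_noindex and words are hypothetical, the word splitting is a simplified
stand-in for htdig's own, and the marker strings are the ones Andreas
reported using.

```python
import re

# Marker strings as reported in the original mail; sites configure their own
# via the noindex_start and noindex_end attributes.
NOINDEX_START = "<!-- htdig_noindex_start -->"
NOINDEX_END = "<!-- htdig_noindex_end -->"


def strip_noindex(text, replacement=""):
    """Remove every noindex_start ... noindex_end section (hypothetical helper).

    replacement="" models the behaviour described in this ticket;
    replacement=" " models the proposed fix of adding a space.
    """
    pattern = re.escape(NOINDEX_START) + r".*?" + re.escape(NOINDEX_END)
    return re.sub(pattern, replacement, text, flags=re.DOTALL)


def words(text):
    """Simplified word splitting, assumed here only for illustration."""
    return re.findall(r"\w+", text)


# Template content sits directly between two words, with no whitespace.
page = "alpha" + NOINDEX_START + "menu footer" + NOINDEX_END + "beta"

# Same page, but with an explicit space after the end marker.
workaround = "alpha" + NOINDEX_START + "menu footer" + NOINDEX_END + " beta"

print(words(strip_noindex(page)))        # neighbouring words run together
print(words(strip_noindex(page, " ")))   # a space is inserted on removal
print(words(strip_noindex(workaround)))  # explicit space in the document
```

With an empty replacement, the words on either side of the removed section
are joined into one. Inserting a space, or writing one explicitly after the
end marker in the document, keeps them separate.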

Discussion

  • Lachlan Andrew
    2003-10-25

    I think that the current behaviour gives users flexibility.
    It should just be documented that they don't count as
    whitespace, and that an explicit space should be added after
    the [noindex_end] if that is intended (assuming that a space
    works as well as a newline).

    Lachlan

     
  • Lachlan Andrew
    2003-10-26

    I've updated the documentation to describe the current
    behaviour, and how to achieve the behaviour that Andreas wanted.

     
  • Lachlan Andrew
    2003-10-26

    • status: open --> closed-wont-fix