#10 HTML parser does not parse text in tags

htdig (103)

Date: Wed, 27 Oct 1999 23:41:44 -0500
To: htdig3-bugs@htdig.org
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Subject: Fwd: Re: [htdig] Leading reasons for htdig not finding known

>From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
>To: htdig@htdig.org
>Delivered-To: htdig@htdig.org
>Date: Wed, 27 Oct 1999 09:11:21 -0500 (CDT)
>Cc: grdetil@scrc.umanitoba.ca, htdig@htdig.org
>Subject: Re: [htdig] Leading reasons for htdig not finding known matches?
>Sender: htdig@htdig.org
>According to David J. Adams:
> > I have checked back on our previous correspondence. The
> > problem was indeed for meta descriptions, and may not have
> > included the entire head section, but it was
> > htdig version 3.1.2.
>Yes, the problem with punctuation in meta descriptions is still there.
>It hasn't yet been fixed in either the 3.1.x or 3.2.x development source
>trees. Someone will need to take the time to fix htdig/HTML.cc to do
>proper parsing of words in meta descriptions. The current, simplistic
>approach using strtok() just doesn't cut it. I think the same problem
>exists with img alt text handling in 3.2 as well, so a general and
>reusable fix is needed. Any takers?
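
To illustrate the kind of failure described above, here is a minimal sketch. It is not the actual htdig/HTML.cc code. It assumes the simplistic approach splits the description on whitespace only, and the function names split_with_strtok and split_into_words are hypothetical. The second function shows one possible reusable alternative that treats a word as a run of alphanumeric characters, so the same routine could serve meta descriptions and img alt text alike.

```cpp
#include <cctype>
#include <cstring>
#include <string>
#include <vector>

// Assumed "simplistic" approach: split on whitespace only with strtok().
// Punctuation stays glued to the words, so "Fast," never matches "fast".
std::vector<std::string> split_with_strtok(const std::string &text) {
    std::vector<std::string> words;
    // strtok() modifies its input, so work on a writable, NUL-terminated copy.
    std::vector<char> buf(text.begin(), text.end());
    buf.push_back('\0');
    for (char *w = std::strtok(buf.data(), " \t\r\n"); w != nullptr;
         w = std::strtok(nullptr, " \t\r\n")) {
        words.push_back(w);
    }
    return words;
}

// Hypothetical reusable alternative: a word is a maximal run of
// alphanumeric characters; everything else acts as a separator.
// Words are lowercased so that lookups are case-insensitive.
std::vector<std::string> split_into_words(const std::string &text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

For the input "Fast, reliable (search)." the first function yields "Fast,", "reliable" and "(search).", while the second yields "fast", "reliable" and "search". A real fix would presumably also honour htdig's own configuration of which characters count as part of a word, which this sketch ignores.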


  • Gilles Detillieux

    This was fixed in Feb. 2000, and is in the current releases.

  • Gilles Detillieux

    • milestone: --> resolved
    • assigned_to: nobody --> grdetil
    • status: open --> closed-fixed
