#10 HTML parser does not parse text in tags

resolved
closed-fixed
htdig (103)
5
2001-09-14
2001-03-01
Geoff Hutchison
No

Date: Wed, 27 Oct 1999 23:41:44 -0500
To: htdig3-bugs@htdig.org
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Subject: Fwd: Re: [htdig] Leading reasons for htdig not finding known

>From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
>To: htdig@htdig.org
>Delivered-To: htdig@htdig.org
>Date: Wed, 27 Oct 1999 09:11:21 -0500 (CDT)
>Cc: grdetil@scrc.umanitoba.ca, htdig@htdig.org
>Subject: Re: [htdig] Leading reasons for htdig not finding known matches?
>Sender: htdig@htdig.org
>
>
>According to David J. Adams:
> > I have checked back on our previous correspondence. The
> > problem was indeed for meta descriptions, and may not have
> > included the entire head section, but it was definitely
> > htdig version 3.1.2.
>
>Yes, the problem with punctuation in meta descriptions is still there.
>It hasn't yet been fixed in either the 3.1.x or 3.2.x development source
>trees. Someone will need to take the time to fix htdig/HTML.cc to do
>proper parsing of words in meta descriptions. The current, simplistic
>approach using strtok() just doesn't cut it. I think the same problem
>exists with img alt text handling in 3.2 as well, so a general and
>reusable fix is needed. Any takers?
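
To illustrate the kind of failure described above, here is a minimal sketch, not the code from htdig/HTML.cc and not the fix that was eventually applied. It assumes the simplistic approach splits a meta description on whitespace only, which leaves punctuation glued to the words, and contrasts it with a reusable helper that collects runs of alphanumeric characters. The function names split_with_strtok and split_into_words and the delimiter set are hypothetical.

```cpp
#include <cctype>
#include <cstring>
#include <string>
#include <vector>

// Naive approach (assumed for illustration): split on whitespace with
// strtok(). Punctuation stays attached, so "search." never matches "search".
std::vector<std::string> split_with_strtok(const std::string &text) {
    std::vector<std::string> words;
    std::string buffer = text;  // strtok() modifies its input, so use a copy
    for (char *tok = std::strtok(&buffer[0], " \t\r\n"); tok != nullptr;
         tok = std::strtok(nullptr, " \t\r\n")) {
        words.push_back(tok);
    }
    return words;
}

// Word-aware approach: collect runs of alphanumeric characters, treat every
// other character as a separator, and lowercase each word as it is built.
// The same helper could serve meta descriptions and img alt text alike.
std::vector<std::string> split_into_words(const std::string &text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

For the description "Fast, reliable search.", the strtok() version yields "Fast,", "reliable" and "search.", while the word-aware version yields "fast", "reliable" and "search". A real indexer would likely also need rules for characters that are legitimately part of a word (hyphens, apostrophes, non-ASCII letters), which this sketch deliberately ignores.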

Discussion

  • Logged In: YES
    user_id=149687

    This was fixed in Feb. 2000, and is in the current releases.

     
    • milestone: --> resolved
    • assigned_to: nobody --> grdetil
    • status: open --> closed-fixed