#14 fix for <a> et al parsing

Status: closed
Owner: nobody
Labels: None
Priority: 5
Updated: 2003-08-28
Created: 2003-07-03
Creator: Pavel Kankovsky
Private: No

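The lexer ignores the contents of <a>, <img> and <font> tags
when the tag name is followed by a newline (or a tab) instead
of a space. In the session below, `xyz` is tokenized when it
follows `<a` on the same line, but not when it starts the next
line: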

$ ./bogolexer -P t
normal mode.
Content-Type: text/html
get_token: 2 'Content-Type`
get_token: 2 'text`
get_token: 2 'html`

<a xyz>
get_token: 2 'xyz`
<a
xyz>
4 tokens read.

The patch makes it accept any kind of whitespace after
the tag's name.
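
The difference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two patterns and the `tag_contents` helper are hypothetical and are not bogofilter's actual lexer rules or the contents of the patch. The first pattern requires a literal space after the tag name, and the second accepts any whitespace.

```python
import re

# Hypothetical patterns for illustration; not bogofilter's real lexer rules.
# Before: the tag name must be followed by a literal space.
SPACE_ONLY = re.compile(r"<(?:a|img|font) ([^>]*)>", re.IGNORECASE)
# After: any whitespace (space, tab, newline) may follow the tag name.
ANY_WHITESPACE = re.compile(r"<(?:a|img|font)\s+([^>]*)>", re.IGNORECASE)


def tag_contents(pattern, text):
    """Return the whitespace-separated words found inside matching tags."""
    words = []
    for match in pattern.finditer(text):
        words.extend(match.group(1).split())
    return words


if __name__ == "__main__":
    for sample in ("<a xyz>", "<a\nxyz>", "<a\txyz>"):
        print(repr(sample),
              tag_contents(SPACE_ONLY, sample),
              tag_contents(ANY_WHITESPACE, sample))
```

Requiring at least one whitespace character after the name keeps longer tag names that merely start with `a` from matching.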

Discussion

  • David Relson
    2003-07-05

    Logged In: YES
    user_id=30510

    Pavel,

    This change has been added to the bogofilter source and has
    been committed to cvs. It will be in the next release of
    bogofilter.

    Thanks for finding the problem and providing a patch.

    David

     
  • David Relson
    2003-08-28

    • status: open --> closed