#63 BF should decode ul</x>timate

Status: closed
Owner: nobody
Labels: None
Priority: 5
Updated: 2003-11-01
Created: 2003-11-01
Creator: Tim Freeman
Private: No

If I feed this text into bogofilter -vvv:

From: innocent@victim.invalid

ul</x>timate

then the word "timate" appears on the word list, but the word
"ultimate" does not. I think HTML end tags should be ignored
while parsing.

I've attached the entire original email.

Discussion

  • Tim Freeman

    Tim Freeman - 2003-11-01

    Original spam

     
  • David Relson

    David Relson - 2003-11-01
    • status: open --> closed
     
  • David Relson

    David Relson - 2003-11-01


    Tim,

    HTML tags are ignored when processing text/html and the
    result is what you're asking for.

    If the message is identified as text/plain, then the special
    characters (angle brackets and slash) separate tokens. It
    would be wrong for bogofilter to process plain text as
    though it were HTML.

    As your example fits the second category above and
    bogofilter is working as intended, there's nothing to change.

    David
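
The difference between the two cases can be illustrated with a minimal sketch. This is not bogofilter's lexer; the `tokenize` helper, the regular expressions, and the minimum token length of 3 are assumptions chosen only to reproduce the behaviour described in this ticket.

```python
import re

# Hypothetical, simplified tokenizer; NOT bogofilter's actual lexer.
TAG_RE = re.compile(r"</?[A-Za-z][^>]*>")   # crude match for an HTML start or end tag
WORD_RE = re.compile(r"[A-Za-z0-9]+")       # runs of letters and digits form tokens
MIN_LEN = 3                                 # assumed minimum token length


def tokenize(body, content_type):
    """Return the word list for a message body of the given content type."""
    if content_type == "text/html":
        # In HTML, tags are dropped first, so the text on both sides of a tag is joined.
        body = TAG_RE.sub("", body)
    # In plain text nothing is stripped: '<', '/' and '>' act as token separators.
    return [w.lower() for w in WORD_RE.findall(body) if len(w) >= MIN_LEN]


print(tokenize("ul</x>timate", "text/html"))   # ['ultimate']
print(tokenize("ul</x>timate", "text/plain"))  # ['timate']
```

Under these assumptions, the plain-text case splits the input into "ul", "x" and "timate", and only "timate" is long enough to be kept, which matches the word list in the report.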

     
