Bayespam

Problem with HTML mails

  1. br7

    2002-09-17 13:22:18 UTC
    I have two mails from Paypal that I wanted to
    receive, but they were classified as spam. Even
    after rebuilding the rating file with these two
    mails included as "good" mail, they are still
    classified as spam.

    I analyzed this problem and I think I found the
    reason:
    Most of my spam mails are HTML mails. Most of my
    good mails are non-HTML, plain mails. The Paypal
    mails are - unlike most good mails - HTML mails.

    In the text there would be more than enough
    tokens to distinguish this mail from spam - BUT
    Bayespam rates several HTML tokens as
    "interesting", with a high "spamminess" (because
    they are so common in spam), so that other tokens
    from the text with low spamminess are crowded out,
    and thus the whole mail is rated as spam.
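
To illustrate the effect, here is a minimal sketch. It assumes Bayespam combines only the few tokens whose spamminess lies furthest from a neutral 0.5, in the style of Paul Graham's "A Plan for Spam"; I have not checked this against the Bayespam source. The function name, the token names, the probabilities, and the cutoff of 4 are all made up for the example.

```python
from math import prod

def combined_spam_probability(token_probs, max_interesting=15):
    """Combine per-token spam probabilities, Graham style (assumed scheme).

    Only the `max_interesting` tokens furthest from 0.5 take part.
    """
    interesting = sorted(
        token_probs.values(),
        key=lambda p: abs(p - 0.5),
        reverse=True,
    )[:max_interesting]
    spam = prod(interesting)
    ham = prod(1.0 - p for p in interesting)
    return spam / (spam + ham)

# Hypothetical per-token probabilities, not taken from a real rating file.
text_tokens = {"paypal": 0.20, "payment": 0.30, "account": 0.35}
html_tokens = {"table": 0.99, "font": 0.99, "bgcolor": 0.99, "href": 0.98}

# With a small cutoff, the four HTML tokens fill every "interesting" slot,
# so the low-spamminess text tokens never enter the calculation.
with_html = combined_spam_probability({**text_tokens, **html_tokens},
                                      max_interesting=4)
text_only = combined_spam_probability(text_tokens, max_interesting=4)

print(f"with HTML tokens: {with_html:.4f}")
print(f"text tokens only: {text_only:.4f}")
```

With these made-up numbers the mail scores close to 1.0 when the HTML tokens are counted and well below 0.5 when only the text tokens are used, which matches what I see with the Paypal mails.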

    Bottom line: As long as Bayespam treats the HTML
    tokens of this mail the same way as the text
    tokens themselves, it will classify these Paypal
    mails as spam, at least until I receive a lot
    more good HTML mails.

    Does Bayespam need a mode where it rips out
    everything between <> like it does now with
    HTML comments?
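
Such a mode could look roughly like the sketch below. It is written in Python for illustration only and is not Bayespam code; `strip_html`, `tokenize`, and the token pattern are hypothetical names and choices. A regex is a crude way to remove markup (it does not handle a `>` inside an attribute value), so treat it as a starting point.

```python
import re

# Remove comments first, since a comment may itself contain ">".
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
# Anything between "<" and the next ">" is treated as a tag.
TAG_RE = re.compile(r"<[^>]*>")
# Assumed token pattern: runs of letters, digits, and a few symbols.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")

def strip_html(body):
    """Return the mail body with HTML comments and tags removed."""
    without_comments = COMMENT_RE.sub("", body)
    # Replace tags with a space so adjacent words do not run together.
    return TAG_RE.sub(" ", without_comments)

def tokenize(body):
    """Split the stripped body into tokens for rating."""
    return TOKEN_RE.findall(strip_html(body))

sample = (
    '<table bgcolor="#ffffff"><tr><td><!-- tracking -->'
    '<font face="Arial">You received a payment</font></td></tr></table>'
)
tokens = tokenize(sample)
print(tokens)
```

One trade-off to keep in mind: stripping every tag also discards the URLs inside `href` attributes, which may be useful evidence in real spam, so an optional mode seems safer than making this the default.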