Re: Reusable PYZOR Components?

Jesus Cea Avion, on 2003-09-10, wrote:

> Any idea about the proportion of spam caught by pyzor? In my case,
> from 720 messages received, 220 were spam. Of these, pyzor catches...
> only 53 :-(.

When you run pyzor over the message makes a difference.  If you do it
at the SMTP level, you'll likely be processing the message before anyone
has had the opportunity to report it, since spammers tend to send in
bursts.
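
To illustrate the timing point, here is a minimal sketch of holding
messages for a while before checking them, so that other recipients have
had time to report the burst.  Nothing here is Pyzor's own API: the
DeferredChecker class, the `is_reported` callable (a stand-in for
whatever performs the actual lookup), and the delay are all assumptions
made for the example.

```python
import heapq
from typing import Callable, List, Tuple


class DeferredChecker:
    """Hold each message for `delay` seconds before checking it.

    `is_reported` is a hypothetical callable standing in for the real
    digest lookup; it is not part of Pyzor.
    """

    def __init__(self, is_reported: Callable[[str], bool], delay: float):
        self.is_reported = is_reported
        self.delay = delay
        self._queue: List[Tuple[float, int, str]] = []
        self._counter = 0  # tie-breaker so equal due times stay in order

    def accept(self, message: str, now: float) -> None:
        """Queue a message; it becomes due at now + delay."""
        heapq.heappush(self._queue, (now + self.delay, self._counter, message))
        self._counter += 1

    def release(self, now: float) -> List[Tuple[str, bool]]:
        """Check and return every message whose hold time has elapsed."""
        results = []
        while self._queue and self._queue[0][0] <= now:
            _, _, message = heapq.heappop(self._queue)
            results.append((message, self.is_reported(message)))
        return results
```

A message accepted at time 0 with a 600-second delay is only looked up
once release() is called with now >= 600, by which point a report made
in the meantime will be seen.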

> I get a lot of spam in Spanish, of course.

That the spam is in Spanish makes little difference, I would suspect.
We all get a lot of spam in all languages.

> Soon, I'll use old abandoned email addresses targeted by spam to feed
> pyzor network.

Great.  I know that a few others do the same, as do I.  It helps to
post such addresses around on newsgroups, where address harvesters will
pick them up.

> How does pyzor currently behave with malicious users sending bogus
> hashes to contaminate the database?

Not very well.  Fortunately, it hasn't been an issue.  I must point out
that there are three general types of false positives:

1) a legitimate bulk email (e.g., mailing list).
   Recommended solution: this is expected mail, so it can and should be
                         whitelisted, as its nature can be predicted.
2) a legitimate form mail (e.g., order receipt).
   This is the hardest to handle sanely, since you might not know how to
   whitelist this in advance.  Additionally, it is likely very similar to
   messages that others received, with only a couple of differing
   characters, which might not make it into Pyzor's digest.
   I can't recommend a good specific solution for this.
3) a legitimate, unexpected private email.
   Pyzor tries to get enough data from a message to get a unique digest,
   so this should be a rare occurrence.

Pyzor does support whitelisting, but on the public server only
authorized users can whitelist.  Nothing fancy.

-- 
Frank Tobin			http://www.neverending.org/~ftobin/