> I set up spamprobe four or five years ago, using BDB, arranging things so
> that it put detected spam in the spam mailbox, cleaned up sp_words once a
> week or so, and dealt daily with anything I had filed as false positive
> (once in a blue moon) or false negative (one a day, maybe). I don't remember
> how to set it up - high praise for a piece of software :-)
I don't even remember when I set up spamprobe: I moved to Linux [from
FreeBSD] about ten years ago, so I have perhaps been using spamprobe for
six to eight years.
I am *not a fan* of BDB [I use either CDBs or PostgreSQL if I am given the
choice]: I believe, though I am not sure, that I started with spamprobe and
BDB ... if so, I moved to PBL early in any case.
I have no complaints about PBL, except that it seems to be no longer
maintained. Hence I switched to the hash format a year ago, and I am presently
so satisfied with spamprobe and the hash format that I will not move away from
that combination.
I can't remember having seen a false positive with spamprobe, which is VERY
GOOD to me: I would any time rather have a couple of extra false negatives
than a single false positive.
It's easier and much faster for the eye to pick out *bad apples* among nearly
all *good apples* than the opposite: looking for one or two *good apples*
among mostly bad ones. [See also: cognitive dissonance.]
It takes just 2-4 seconds to glance at a list of subjects from a dir [I use
Maildir] of '... spam.sanitized' ... provided you are virtually certain that
there are no false positives in it. I never need to look at a body from such
a dir; merely looking at the titles and the senders confirms that it is
indeed spam. (And jobs in the O( hours ) [which moreover keep the list slim]
do the rest.)
By the way, I have not seen any false negatives for the last three to five
days [at above 200 emails a day]. My false negatives nearly always belong to
these categories:
** charset=koi8-r, iso-2022-jp and the like.
** *seasonal* [periodic] one-liners [series of "watch your mom naked,"
"Shakira video" and similar], with just "download", a URL and one, at
most two, fuzzy words, concatenated to 'download' or not, in the
body.
** nasty tricks: such as fake email failure notifications, or adoption of
mailing-list style [mainly in the subjects], so as to make you think
it's some stray white-listed email.
The Martian charsets are filtered trivially [I don't expect any emails in
such charsets] ... so far; and I don't let those emails pollute/waste my
DB/hash table. Maybe there could be a *native/direct option* for spamprobe:
--martians= ... or equivalent.
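
For anyone wanting to try the same idea, here is a rough sketch of such a
pre-filter in Python. It is not how my own setup does it, and it is not part
of spamprobe: the names MARTIAN_CHARSETS, declared_charsets and is_martian
are made up for illustration, and the charset list is just the two examples
above. It only looks at the charsets a message declares [MIME parts and an
encoded Subject], so that such mail can be diverted before it ever reaches
the classifier.

```python
import email
from email.header import decode_header

# Hypothetical list: charsets this mailbox never expects to receive.
MARTIAN_CHARSETS = {"koi8-r", "iso-2022-jp"}


def declared_charsets(raw_message: bytes) -> set:
    """Collect every charset declared in MIME parts and in the Subject."""
    msg = email.message_from_bytes(raw_message)
    found = set()
    # Content-Type charset parameter of each part (None if not declared).
    for part in msg.walk():
        charset = part.get_content_charset()
        if charset:
            found.add(charset.lower())
    # RFC 2047 encoded words in the Subject carry their own charset.
    for _text, charset in decode_header(msg.get("Subject", "")):
        if charset:
            found.add(charset.lower())
    return found


def is_martian(raw_message: bytes, martians=MARTIAN_CHARSETS) -> bool:
    """True if the message declares at least one unwanted charset."""
    return bool(declared_charsets(raw_message) & set(martians))
```

A delivery script could call is_martian() on the raw message and file it
straight into the spam dir when it returns True, without training on it, so
the DB/hash table stays free of those tokens.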
Cheers,
/Roy
--
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS kalau tak ingin terlimbur pasang,
SSSSS . s l a c k w a r e SSSSSS jangan berumah di tepi laut
SSSSS +------------ linux SSSSSS if you don't want to get flooded,
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS don't build a house next to the sea