I know Frank has plans to introduce Bayes-rule based spam filtering (BSF) into
Right now I am playing with a Bayesian spam filter myself (in my case,
spamoracle). I must say that it is much more powerful than SpamAssassin.
However, it has some drawbacks. For example, I do not receive legitimate HTML
mail often enough for the filter to learn that it is OK. (I guess the filter
has correctly learned that mail with fancy HTML is very likely to be spam.)
I would argue that Pyzor doesn't need Bayes rules, at least not in the form in
which such filters are implemented today, because it is easy to chain two
filters and get the same effect.
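To make the chaining idea concrete, here is a minimal Python sketch. It is
only an illustration under stated assumptions: digest_hit() is a hypothetical
stand-in for a digest lookup (it is not Pyzor's real client API or digest
algorithm), and bayes_score() is a toy naive-Bayes combination, not what
spamoracle actually computes. All names and the 0.9 threshold are made up.

```python
import hashlib


def digest_hit(message, known_spam_digests):
    """Hypothetical stand-in for a digest lookup (not Pyzor's real API)."""
    digest = hashlib.sha1(message.encode("utf-8")).hexdigest()
    return digest in known_spam_digests


def bayes_score(message, spam_word_probs):
    """Toy naive-Bayes combination of per-word spam probabilities.

    Assumes every probability is strictly between 0 and 1.
    Words missing from the table are ignored; with no known
    words at all the score is a neutral 0.5.
    """
    spam, ham = 1.0, 1.0
    for word in set(message.lower().split()):
        if word in spam_word_probs:
            p = spam_word_probs[word]
            spam *= p
            ham *= 1.0 - p
    return spam / (spam + ham)


def make_chain(known_spam_digests, spam_word_probs, threshold=0.9):
    """Chain the two filters: a message is spam if either one flags it."""
    def is_spam(message):
        if digest_hit(message, known_spam_digests):
            return True
        return bayes_score(message, spam_word_probs) > threshold
    return is_spam


# Illustrative data only.
known = {hashlib.sha1(b"buy now").hexdigest()}
probs = {"viagra": 0.99, "meeting": 0.05}
is_spam = make_chain(known, probs)
```

With this made-up data, is_spam("buy now") is caught by the digest stage
alone, is_spam("cheap viagra") is caught by the Bayes stage, and
is_spam("meeting tomorrow") passes both.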
What I think is suitable for Pyzor is the approach BSFs use to pick
specific words. I think such characteristic words could somehow be included
in the digest. The problem is that the word database is a moving target:
it changes slightly with each spam/nonspam message added.
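A rough sketch of what a digest built from characteristic words might look
like, again in Python and purely hypothetical: this is not how Pyzor computes
digests. It assumes a table of per-word spam probabilities, picks the n words
whose probability lies farthest from a neutral 0.5, and hashes them. The
function names, the choice of n and the sample data are all assumptions.

```python
import hashlib


def characteristic_words(message, spam_probs, n=3):
    """Pick the n known words whose spam probability is farthest from 0.5."""
    tokens = set(message.lower().split())
    known = [t for t in tokens if t in spam_probs]
    # Rank by distance from neutral; break ties alphabetically so the
    # result is deterministic.
    known.sort(key=lambda t: (-abs(spam_probs[t] - 0.5), t))
    # Sort the selection so word order in the message does not matter.
    return sorted(known[:n])


def word_digest(message, spam_probs, n=3):
    """Hash only the characteristic words (hypothetical digest scheme)."""
    words = characteristic_words(message, spam_probs, n)
    return hashlib.sha1(" ".join(words).encode("utf-8")).hexdigest()


# Illustrative word database.
probs = {"viagra": 0.99, "click": 0.95, "cheap": 0.9, "hello": 0.5}

a = "hello cheap viagra click here"
b = "Cheap VIAGRA friends click today"

# Different filler text, same characteristic words, same digest.
same = word_digest(a, probs) == word_digest(b, probs)

# The moving-target problem: after the database shifts, the same
# message yields a different digest.
probs_later = {"viagra": 0.99, "click": 0.95, "cheap": 0.55,
               "hello": 0.5, "here": 0.97}
drifted = word_digest(a, probs) != word_digest(a, probs_later)
```

The last two lines show the difficulty: two clients whose word databases
have drifted apart would compute different digests for the same message,
so some way of sharing or freezing the word list would be needed.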
These things need further research (MSc/PhD thesis, anyone?),
but if implemented they would give Pyzor more adaptivity
than the static digest approach used today.
Sincerely yours, Roman Suzi
rnd@... =\= My AI powered by Linux RedHat 7.3