Re: [Dspam-user] Mysql connections in daemon mode
From: <to...@st...> - 2010-01-29 07:27:01
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 29/01/2010 02:05, Stevan Bajić wrote:
> On Thu, 28 Jan 2010 14:07:31 +0100
> "to...@st..." <to...@st...> wrote:
>
>> Yes, if the email's total score is above a fixed threshold AND DSPAM
>> doesn't agree with this, then the email is retrained into DSPAM.
>> It's the same mechanism as SA autolearn and, AFAIK, the crm114 plugin
>> uses this to retrain CRM.
>>
>> I know that this will introduce some possible mistakes, but I think
>> the balance between good and right will be OK.
>> What's your opinion on this? Should I try to do this, or is it better
>> to let DSPAM learn by itself and manually retrain it on error?
>>
> This all depends on your needs. Do your users train DSPAM, CRM114 and/or SA?
>
> Or to put it the other way around: if you trust SA so much, then why use
> CRM114 and DSPAM at all? What is the point? SA has a Bayes engine itself,
> so there is not much benefit in using DSPAM and/or CRM114 (with its
> default setup/configuration).
>
> Why do you use 3 anti-spam engines when each of them depends on the others
> and one of them can drag the accuracy of all the others down? What is the
> reason that you use a heuristic engine like SA and two statistical ones
> like CRM114 and DSPAM?
>
> Could you write a little bit about how your users are using the anti-spam
> system? Do they train the engine? Are they able/allowed to train? Does
> every user have his own data set or do they all share the same data for
> ham/spam? What MTA do you use?
>

Our users are able to train DSPAM, CRM114 and SA. They share the same
dataset. We use Postfix as the global MTA, but we don't use it for
retraining (no special alias).

To retrain false positives, our customers can move email into two IMAP
folders in their mailbox, one for spam learning, the other for ham
learning. These feed two special folders on one centralized server, to
which we can apply learning scripts.
The script runs sa-learn for SA. For DSPAM, it checks the email headers
and, if DSPAM does not agree with the classification, the email is
retrained with the command:

    /usr/bin/dspam --client --user amavis --class=spam --source=error

(or class=ham of course)

This retraining greatly increases the accuracy of the three engines.

Autolearning is trickier because it relies heavily on the heuristic
engine (the main scoring) to adjust the statistical engines (SA Bayes,
CRM) on the fly.

But I agree with you: what's the point of using the three statistical
engines this way? For SA it's OK, but for CRM114 and DSPAM I wonder if
it's really clever.

So I think I will let DSPAM do its job and continue to use its scoring
to balance the others. That is how it works currently, and I'm really
satisfied: accuracy is great and false positives are very low.
Maybe I will do the same with CRM114.

So I will give the DSPAM plugin at
http://eric.lubow.org/projects/dspam-spamassassin-plugin/ a try because,
if I understand correctly, it can be used to balance scoring more
precisely.

Thanks for your help on this.

Regards,

Tonio
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktijbMACgkQ8FtMlUNHQIOSOwCaAqfbx+fcmBAUy7mCFFzjb4Ys
wdcAn2433ELBLnRGYiuSQnLjCy8LFz7z
=05gq
-----END PGP SIGNATURE-----
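
For readers who want to reproduce the DSPAM part of the learning script described in the message, here is a minimal sketch in Python. It is not the poster's script. It assumes that DSPAM records its verdict in an `X-DSPAM-Result` header with values such as `Spam`, `Innocent` or `Whitelisted`, and that the message is fed to `dspam` on standard input. The function names and the two mapping tables are hypothetical. The message writes `class=ham`; as far as I know DSPAM's own documentation spells the non-spam class `innocent`, so the value is kept in one table that can be changed to match your installation.

```python
import email
import subprocess
from typing import List, Optional

# Assumption: DSPAM stamps its verdict in this header (not stated in the message).
RESULT_HEADER = "X-DSPAM-Result"

# Assumption: header values mapped to the class of the learning folder.
VERDICT_TO_FOLDER = {"spam": "spam", "innocent": "ham", "whitelisted": "ham"}

# The message says "class=ham"; DSPAM docs appear to use "innocent". Adjust here.
FOLDER_TO_DSPAM_CLASS = {"spam": "spam", "ham": "innocent"}

# Base command taken from the message.
DSPAM_CMD = ["/usr/bin/dspam", "--client", "--user", "amavis"]


def dspam_verdict(raw_message: str) -> Optional[str]:
    """Return 'spam' or 'ham' from the DSPAM header, or None if unknown."""
    msg = email.message_from_string(raw_message)
    value = msg.get(RESULT_HEADER)
    if value is None:
        return None
    return VERDICT_TO_FOLDER.get(value.strip().lower())


def retrain_command(raw_message: str, folder_class: str) -> Optional[List[str]]:
    """Build the retrain command only when DSPAM disagrees with the folder."""
    if folder_class not in FOLDER_TO_DSPAM_CLASS:
        raise ValueError("folder_class must be 'spam' or 'ham'")
    verdict = dspam_verdict(raw_message)
    if verdict is None or verdict == folder_class:
        return None  # no verdict to contradict, or DSPAM already agrees
    dspam_class = FOLDER_TO_DSPAM_CLASS[folder_class]
    return DSPAM_CMD + ["--class=" + dspam_class, "--source=error"]


def retrain(raw_message: str, folder_class: str) -> bool:
    """Run the retrain command, feeding the message on stdin. True if it ran."""
    cmd = retrain_command(raw_message, folder_class)
    if cmd is None:
        return False
    subprocess.run(cmd, input=raw_message.encode("utf-8"), check=True)
    return True
```

A wrapper would call `retrain(raw, "spam")` for each message in the spam-learning folder and `retrain(raw, "ham")` for each message in the ham-learning folder. Messages without a recognizable verdict are skipped instead of being retrained blindly, which is a design choice of this sketch and not something the message specifies.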