From: Guido <lis...@gu...> - 2009-09-07 10:21:32
On (09-09-04 21:57), Alpana Weaver wrote:
> What types of emails should be whitelisted? I assume that there is no
> value in me whitelisting for example a personal email from my sister of
> which I am the only recipient because no one else will receive that same
> email? If that is correct, do we only whitelist hams that have many
> recipients e.g. the weekly emails sent from
> Mar...@mo... where recipients have elected to
> subscribe to the emails?
> I suppose my broader question is: how should spam and ham be defined for
> the use of Pyzor?

From my point of view there are two reasons why someone would whitelist
a message.

The first (and maybe the more important one) applies when a spam mail
has been correctly reported as spam. Unfortunately, it can happen that
the digest of the spam mail is not meaningful and therefore also matches
the digest of ham mails. Check this thread [1] for an example. Here I
assume that a low false positive rate is more important than a high true
positive rate.

The second case is that someone reported (by accident or not) a ham
message as spam. As it is not easy to remove reported messages from the
server, someone has to whitelist the message so that it does not hit
pyzor any more. This may be the case for bulk mails, for example.
Unfortunately, it is very likely that these messages to your users have
already hit pyzor by the time you whitelist them.

As the messages from your sister should be unique, there is no need to
whitelist them. That's true. Doing so would just generate futile entries
in pyzord's database.

From my point of view, whitelisting messages from bulk senders that have
not been reported yet does not make sense either. By the time you find
such a message in your mailbox, it has very likely already been
delivered to all recipients. So a whitelist entry in the database would
be useless too.
Hopefully next month's message will differ from the current one ;-)

If you do get recurring messages with the same digest, it may be a good
idea to whitelist those, just to guard against accidental reporting.
(Reporting would not have any effect after that.) Anyhow, I can't see
any use case for this.

There may be another reason for whitelisting messages. Assuming that
pyzor checks are done _before_ the mail goes into SpamAssassin, it could
save a lot of system resources if a whitelisted message bypasses
SpamAssassin. But I am not sure whether this actually makes sense in the
real world.

Good luck with your paper. I would like to read it once it's finished.

[1] http://sourceforge.net/mailarchive/message.php?msg_name=1248706984.24454.17.camel%40werner
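
The whitelisting behaviour described in this message can be sketched as
a small toy model. This is only an illustration under stated
assumptions: ToyDigestServer, toy_digest and is_hit are hypothetical
names, the digest is a plain SHA-1 over the whitespace-normalised body
(not pyzor's real digest algorithm), and the rule "any whitelist entry
suppresses the hit" simply follows the description above rather than
the actual client or server code.

```python
import hashlib


class ToyDigestServer:
    """In-memory stand-in for a digest server (hypothetical, not pyzord's API)."""

    def __init__(self):
        self.reports = {}     # digest -> number of spam reports
        self.whitelists = {}  # digest -> number of whitelist entries

    def report(self, digest):
        self.reports[digest] = self.reports.get(digest, 0) + 1

    def whitelist(self, digest):
        self.whitelists[digest] = self.whitelists.get(digest, 0) + 1

    def check(self, digest):
        # Returns (report count, whitelist count) without creating entries.
        return self.reports.get(digest, 0), self.whitelists.get(digest, 0)


def toy_digest(body):
    # Assumption: lower-case, collapse whitespace, then hash the body.
    normalised = " ".join(body.split()).lower()
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


def is_hit(server, body):
    # A message hits only if it was reported and nobody whitelisted it.
    reports, whitelisted = server.check(toy_digest(body))
    return reports > 0 and whitelisted == 0


server = ToyDigestServer()
newsletter = "Weekly offers for our subscribers"
personal = "Hi, are you coming over on Sunday?"

server.report(toy_digest(newsletter))     # ham reported by accident
hit_before = is_hit(server, newsletter)   # reported, not yet whitelisted
server.whitelist(toy_digest(newsletter))  # corrective whitelisting
hit_after = is_hit(server, newsletter)    # whitelist entry suppresses the hit
personal_hit = is_hit(server, personal)   # unique mail, never reported
```

In this model the unique personal mail never hits, so whitelisting it
would only add an entry nobody ever looks up, while the bulk mail stops
hitting only from the moment the whitelist entry exists.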