From: John H. <jh...@cp...> - 2003-01-16 21:43:03
The question came up of how ASSP would function with a homogeneous user base, for a small ISP for example.

One immediate result is that the whitelist would become quite large. This in itself is a good thing. Even if spammers inadvertently get added, very little spam is ever sent from the same From address twice, so the whitelist alone might be an effective anti-spam tool. Imagine if Hotmail collected every address mailed to from Hotmail accounts for a month. That would probably be 10 to 100 million addresses. Any email that comes from an address not on the whitelist is immediately suspect, though that in itself wouldn't make it spam.

However, the auto-adjusting Bayesian filter would probably not work well in that environment. One of the main reasons is that Hotmail delivers as much spam as it receives. So in Hotmail's case one would probably want to work out another method for choosing the spam and non-spam collections used to train the Bayesian filter.

Of course the whitelist would be on the order of 4 gigs...

Anyway, this line of thought needs more development. What do you think?

John
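
To make the whitelist idea above concrete, here is a minimal Python sketch. It is not ASSP's actual code: the `Whitelist` class, its method names, and the lowercase normalization are all hypothetical, chosen only for illustration. The 40 bytes per address in the size estimate is also an assumption, picked because it makes 100 million addresses come out at roughly the 4 gigs mentioned above.

```python
def normalize(address: str) -> str:
    """Case-fold and trim an address so lookups are consistent (assumed policy)."""
    return address.strip().lower()


class Whitelist:
    """Hypothetical whitelist built from the recipients of outgoing mail."""

    def __init__(self) -> None:
        self._addresses: set[str] = set()

    def record_outgoing(self, recipients: list[str]) -> None:
        # Every address a local user mails to is added to the whitelist.
        for recipient in recipients:
            self._addresses.add(normalize(recipient))

    def is_suspect(self, sender: str) -> bool:
        # Not being whitelisted only marks mail as suspect, not as spam.
        return normalize(sender) not in self._addresses

    def __len__(self) -> int:
        return len(self._addresses)


def estimated_size_bytes(n_addresses: int, bytes_per_address: int = 40) -> int:
    # bytes_per_address is an assumed average, not a measured figure.
    return n_addresses * bytes_per_address
```

Mail from a sender that `is_suspect` flags would still go on to the Bayesian filter; the whitelist only decides who gets to skip it.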