From: Ken A. <kan...@bb...> - 2002-10-24 20:02:45
I implemented Paul Graham's spam filter (http://www.paulgraham.com/spam.html). I'm impressed by how simple (< 160 lines) and effective it is.

For training I used 290 spams and the first 500 messages in my inbox (which includes spam, though only 2 in the first 500). I then ran it on the 3200 messages in my inbox and it found 313 spams. There was one borderline false positive: a Java Development bulk mailing. Not bad for 2-3 hours' work!

The code is Eudora-specific, but you should be able to tailor it to your mail reader easily. It can classify about 10 emails/sec on my machine, which is plenty fast for playing around.

Thanks, Paul!

k
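For anyone who hasn't read the essay, here is a rough Python sketch of the scoring scheme it describes. It is not the Eudora-specific code mentioned above; the function names (`tokenize`, `train`, `spam_probability`), the token regex, and the `min_count` parameter are my own assumptions for illustration. The constants (good counts doubled, probabilities clamped to 0.01-0.99, 0.4 for unseen tokens, the 15 most interesting tokens) follow the essay as I understand it.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters, digits, dashes, apostrophes, dollar signs.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(text):
    """Split a message into lowercase tokens (a simplification)."""
    return TOKEN_RE.findall(text.lower())


def train(spam_msgs, good_msgs, min_count=5):
    """Build a token -> spam probability table from two lists of message strings."""
    bad = Counter(t for m in spam_msgs for t in tokenize(m))
    good = Counter(t for m in good_msgs for t in tokenize(m))
    nbad = max(1, len(spam_msgs))
    ngood = max(1, len(good_msgs))
    probs = {}
    for tok in set(bad) | set(good):
        g = 2 * good[tok]  # good counts are doubled to bias against false positives
        b = bad[tok]
        if g + b < min_count:
            continue  # too rare to say anything about
        bf = min(1.0, b / nbad)
        gf = min(1.0, g / ngood)
        probs[tok] = max(0.01, min(0.99, bf / (gf + bf)))
    return probs


def spam_probability(text, probs, unknown=0.4, n=15):
    """Combine the n token probabilities farthest from 0.5 into one score."""
    ps = [probs.get(t, unknown) for t in set(tokenize(text))]
    ps.sort(key=lambda p: abs(p - 0.5), reverse=True)
    prod, inv = 1.0, 1.0
    for p in ps[:n]:
        prod *= p
        inv *= 1.0 - p
    return prod / (prod + inv)
```

A message would then be treated as spam when `spam_probability(...)` exceeds some threshold; the essay uses 0.9.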