Re: [Jocr-devels] Image Spam detection using gocr
From: Stephen T. <st...@ne...> - 2006-11-16 22:00:05
On 2006-11-16, Todd Lyons wrote:
> On Thu, Nov 16, 2006 at 05:46:27PM +1000, Stephen Thorne wrote:
>
> >> > I'm working on some techniques to detect image spam by using gocr on
> >> > images attached to emails. You may or may not be familiar with this
> >> > style of spam, they often advertise stockscams or medications.
> >> Before you go any further and reinvent the wheel, did you look at
> >> FuzzyOCR ? http://fuzzyocr.own-hero.net/
> >Yep, I used quite a few of the ideas in the FuzzyOCR spamassassin
> >plugin, but because of constraints of the environment where my code must
> >run, I cannot use FuzzyOCR directly.
>
> What are those constraints?

Linux mail server environment with a python based mail scanner. I'm
extending the email scanner to do the ocr based filtering.

> >I read the code thoroughly, but I'm not a perl programmer, I don't
> >believe it did any cleaning of the image.
>
> What do you mean when you say "clean" the image?

Despeckling and removal of colour noise.

-- 
Regards,
Stephen Thorne
Development Engineer
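
For readers wondering what "despeckling and removal of colour noise" might look like in a Python based scanner, here is a minimal sketch. It is not the code discussed in this thread and it is not taken from gocr or FuzzyOCR. The function names (`to_binary`, `despeckle`, `clean`), the luma threshold of 128, and the "fewer than one dark neighbour" rule are all assumptions chosen for illustration. The image is assumed to be already decoded into rows of (r, g, b) tuples.

```python
# Hypothetical helpers for illustration only; thresholds are assumed values.

def to_binary(pixels, threshold=128):
    """Drop colour information: 1 = dark (ink), 0 = light (background)."""
    out = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            # Integer approximation of the ITU-R BT.601 luma weights.
            luma = (299 * r + 587 * g + 114 * b) // 1000
            out_row.append(1 if luma < threshold else 0)
        out.append(out_row)
    return out


def despeckle(binary, min_neighbours=1):
    """Clear dark pixels with fewer than min_neighbours dark 8-neighbours."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    result = [row[:] for row in binary]  # copy so counts use the original
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            neighbours = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbours += binary[ny][nx]
            if neighbours < min_neighbours:
                result[y][x] = 0
    return result


def clean(pixels, threshold=128, min_neighbours=1):
    """Colour-noise removal followed by despeckling."""
    return despeckle(to_binary(pixels, threshold), min_neighbours)
```

The result uses 1 for dark pixels, which matches the PBM convention, so one option would be to write it out as a PBM file before handing it to gocr. Real image spam would likely need tuned thresholds and a larger speck size than a single pixel.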