[Jocr-devels] urgent OCR project idea with potential cash support
From: Matt W. <mc_...@ya...> - 2006-10-31 20:09:15
OCR has a role to play in the current US election cycle. There is serious concern that at least one Senatorial candidate (Joe Lieberman) has campaign expenditures that *might* be fraudulent. The data showing that the spending is, at minimum, unusual are publicly available and filed on standard forms, but only as scanned .pdf or .gif images. (An example of the type of form I would like to scan and parse is here: http://images.nictusa.com/showimg/14509.gif)

Given that there are thousands of pages, processing these forms and extracting the data into an analyzable database in an unbiased fashion is all but impossible to do manually at this point. Thus my hope for OCR. I would like to be able to pull out data from the form fields and compile it into a database. (The end result would be to compare line-level expenditures of Senate candidates. At least one candidate has what looks to be a very unusual pattern.)

Does anyone have any recommendations? Any advice on using GOCR or any other product, commercial or otherwise, would be greatly appreciated. I'm sure I could arrange for significant financial support for the project if this job could be done in a couple of days. Tall order, I know, but it would definitely create giant media buzz in the US and around the world for GOCR.

Sincerely,
Matt Williamson