[Jocr-devels] Request for an overview
From: Ian Carr-de A. <av...@ca...> - 2006-11-15 12:03:13
I've thought OCR would be interesting for a long time, and I now also have a need to read some scanned documents, so it seems a good idea to put the motivation to use. My aims would be:

a) to link gocr to my favorite language (pike, http://pike.roxen.com/), both for easy passing of images to gocr and for writing modules for image preprocessing and character recognition;

b) to get it to deal better with the texts I have.

Some combinations of letters are not recognised, e.g. "Th", which is probably due to the gap being too small and suggests to me that these combinations need to be added as if they were a single character. I have quite a lot of texts with blank spaces marked by a line for people to write answers on, and it would be nice to get a series of underscores for these. Recognition of underlined text would also be useful.

The recognition technique described works from the topology of the various letters. I wonder whether a cache of recognised letter images could simply be XORed against a new image to find a match. Obviously, in the extreme case of the bitmaps being identical, the result must logically be the same as it was previously. This should certainly work well with texts grabbed from the screen, which I'm also interested in. With a connection to a scripting language which can control the scanner, I could scan or rescan at high definition to find a predictable origin for the image of a letter and then resample down, which may reduce the differences between images of the same letter caused by movements of less than one pixel.

A few small, unclustered differences would lower the confidence only slightly, but a cluster of differences may indicate letters with accents etc. If I get no exact match, but it looks like "h" with something at the bottom, "o" with something at the bottom, then "w" with something at the bottom, it is probably "how" underlined. So I may get underlined text like this and not need to add underlined versions of each character.
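
As a rough sketch of the XOR cache idea (in Python rather than pike, and not based on anything in gocr or libgocr), the matching could look something like this. Everything here is an assumption made for illustration: the bitmap representation (one integer per pixel row), the names `GlyphCache`, `learn`, `match`, `diff_by_row` and `differs_only_at_bottom`, and the `max_diff` threshold are all hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: a glyph is a tuple of rows, each row an int whose set bits
# are black pixels. All glyphs compared here have the same width and height.
Bitmap = Tuple[int, ...]


def diff_by_row(a: Bitmap, b: Bitmap) -> List[int]:
    """XOR two bitmaps row by row and count the differing pixels per row."""
    return [bin(ra ^ rb).count("1") for ra, rb in zip(a, b)]


def differs_only_at_bottom(a: Bitmap, b: Bitmap, rows: int = 1) -> bool:
    """True if the bitmaps differ, and only within the last `rows` rows.

    A cluster of differences at the bottom hints at underlining.
    """
    per_row = diff_by_row(a, b)
    upper = per_row[: len(per_row) - rows]
    return sum(per_row) > 0 and sum(upper) == 0


class GlyphCache:
    """Hypothetical cache of bitmaps already recognised by other techniques."""

    def __init__(self) -> None:
        self._entries: List[Tuple[Bitmap, str]] = []

    def learn(self, bitmap: Bitmap, char: str) -> None:
        """Store a bitmap together with the character finally decided upon."""
        self._entries.append((bitmap, char))

    def match(self, bitmap: Bitmap, max_diff: int = 2) -> Optional[Tuple[str, int]]:
        """Return (char, differing pixel count) for the closest cached glyph,
        or None if no cached glyph is within max_diff pixels."""
        best: Optional[Tuple[str, int]] = None
        for known, char in self._entries:
            if len(known) != len(bitmap):
                continue  # different height, not comparable in this sketch
            distance = sum(diff_by_row(known, bitmap))
            if distance <= max_diff and (best is None or distance < best[1]):
                best = (char, distance)
        return best


# Tiny made-up 5x6 glyphs: an "h", the same "h" with one stray pixel,
# and the same "h" with a full line of pixels in the bottom row.
H = (0b10000, 0b10000, 0b11110, 0b10010, 0b10010, 0b00000)
H_NOISY = (0b10000, 0b10000, 0b11110, 0b10010, 0b10011, 0b00000)
H_UNDERLINED = (0b10000, 0b10000, 0b11110, 0b10010, 0b10010, 0b11111)

cache = GlyphCache()
cache.learn(H, "h")
print(cache.match(H))             # identical bitmap
print(cache.match(H_NOISY))       # one scattered pixel differs
print(cache.match(H_UNDERLINED))  # too many differences for a direct match
print(differs_only_at_bottom(H, H_UNDERLINED))
```

In this sketch a scattered single-pixel difference still matches with a slightly worse score, while the underlined glyph fails the direct match but is flagged as differing only in its bottom row, which is the "h with something at the bottom" case.
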
Maybe there is a big problem with this technique in practice; I'd be pleased if anyone who knows of one could tell me. Clearly, to do this I need a module which can learn by receiving both the image, to make an attempt at it itself, and the final character decided upon by other techniques.

I see that the declared aim is to move to a gocr built around libgocr, but I also see that new gocr versions are released quite often and there is no new libgocr. I'm not sure whether I should look at the separately downloadable libgocr or the files in gocr-0.41 api. Also, I see references in gocr-0.41/api/doc/api.txt to an MDK, but can't see one to download on sourceforge. Could someone please point me in the direction of the source code and documentation I should start with?

Yours
Ian