Question regarding layout detection

A graphical frontend to tesseract-ocr

Brought to you by: sandromani


Forum: Help

Creator: naqib quarishi

Created: 2016-03-31

Updated: 2016-04-05

naqib quarishi - 2016-03-31

I have a question regarding layout detection. First I tried image 1 (below):

Rotated it 180 degrees.

Increased contrast to 100.

Set resolution to 600.

Ran layout detection, and it detected the following.

After recognition, the result was 3080 SO:
3080 SO

|O'—||ll

“I
Z
_I
I]

OneDrive link to a file with the images: http://1drv.ms/1q6qrWy

The question is: how can I train it so that it reads SC instead of SO?

Next I tried the same steps with another image (the text is at a 45 degree angle).

I rotated it, then ran autodetect layout. Unfortunately, it detected the entire image as one region, and the recognition is bad as well:
//
9,,

3/,

4 0 /
2......3. \

3/
‘ , 4‘ v
[9

My question is, what am I doing wrong? Or what is the right way? Can you please help / suggest?

Thank you again for this wonderful application.

Best regards,
Naqib

Sandro Mani - 2016-04-01

Hi

gImageReader is merely a front-end to the OCR engine, tesseract, so for training and accuracy issues you will have to look at the help resources of tesseract itself. The best document to start with is probably:

https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract

Looking at the images you are trying to recognize, I'd say it is pretty much expected that the default language definitions of tesseract won't be able to give you an accurate result, so yes, you will have to look at training tesseract appropriately by following the steps described in the wiki page above.

Hope this helps
Sandro


naqib quarishi - 2016-04-05

Dear Mr. Sandro,

First let me thank you for your response and for this excellent tool that you have built.

I have a few questions that I would like to ask here.

Q1. Does the language definition help layout detection in any way? You mentioned that the default language definitions of tesseract won’t give an accurate result, but will training help layout detection?

Q2. The problem is that even when I rotate the image first and then load it for layout detection (link below), it still detects the entire image as a single region.

Here is a sample of what I am talking about.

https://onedrive.live.com/redir?resid=135ACA0E27D2C575!45719&authkey=!APQQQFDlTRLCbV8&ithint=file%2cdocx

Q3. When I create the training images, do you think I need to feed in each digit in each specific font, as well as these digits at various angles?

I really appreciate all your help in this matter.

Naqib.


Sandro Mani - 2016-04-05

Hi Naqib

I'm not really familiar with the tesseract internals, but I assume that the language definitions at least help tesseract guess the writing order of a block of text (i.e. left-to-right, right-to-left, etc.).

Without specific training, I'd say that the many fine lines in the drawing can make it pretty hard for tesseract to figure out what is text and what is not.

Sorry, I have never actually gone through the training procedure, so I can't do much more than refer you to the help resources available on the internet.
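As a possible workaround that does not involve training, one could crop the label out of the drawing and call tesseract directly with a fixed page segmentation mode, so that it does not have to guess the layout. The following is only a rough sketch: the file name label_crop.png, the whitelist characters and the helper names are made up for illustration, `--psm` is the spelling used by tesseract 4 and later (older 3.x releases use `-psm`), and whether the character whitelist is honoured depends on the tesseract version and engine.

```python
import shutil
import subprocess

# Page segmentation modes as listed by `tesseract --help-psm`.
PSM_SINGLE_BLOCK = 6   # assume a single uniform block of text
PSM_SINGLE_LINE = 7    # treat the image as a single text line
PSM_SPARSE_TEXT = 11   # find as much text as possible, in no particular order


def build_tesseract_cmd(image_path, output_base="stdout", psm=PSM_SINGLE_LINE,
                        whitelist=None, lang="eng"):
    """Return the argument list for a tesseract run with a fixed layout mode."""
    cmd = ["tesseract", image_path, output_base, "-l", lang, "--psm", str(psm)]
    if whitelist:
        # Restrict recognition to the characters expected in the labels.
        cmd += ["-c", "tessedit_char_whitelist=" + whitelist]
    return cmd


def run_ocr(image_path, **kwargs):
    """Run tesseract if it is installed; return its text output, else None."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(build_tesseract_cmd(image_path, **kwargs),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical cropped image that contains only the "3080 SC" label.
    print(build_tesseract_cmd("label_crop.png", whitelist="0123456789SC"))
```

If a single line is too strict for a given crop, PSM_SINGLE_BLOCK or PSM_SPARSE_TEXT may be worth trying instead; which one works best for these drawings is untested here.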

Best
Sandro