Question regarding layout detection

Help
2016-03-31
2016-04-05
  • naqib quarishi

    naqib quarishi - 2016-03-31

    I have a question regarding layout detection. First I tried with image 1 (below):

    1. Rotated it 180 degrees.
    2. Increased contrast (100).
    3. Set resolution to 600.
    4. Ran layout detection, and it detected the following.
    5. When recognition ran, it recognized 3080 SO
      3080 SO
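
For readers who want to see what the first two steps do to the pixels, here is a minimal pure-Python sketch on a tiny grayscale grid. The function names and the contrast formula (scaling each value's distance from mid-gray) are illustrative assumptions, not what gImageReader or tesseract actually do internally, and the resolution step is not covered.

```python
# Tiny grayscale "image": a list of rows with values in 0-255.
# Function names and the contrast formula are illustrative assumptions,
# not gImageReader's or tesseract's actual implementation.

def rotate_180(pixels):
    """Rotate a 2D pixel grid by 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in pixels[::-1]]


def increase_contrast(pixels, factor):
    """Scale each value's distance from mid-gray (128) by factor, clamped to 0-255."""
    return [
        [max(0, min(255, round(128 + (value - 128) * factor))) for value in row]
        for row in pixels
    ]


image = [
    [10, 100, 200],
    [120, 128, 140],
]

rotated = rotate_180(image)
contrasted = increase_contrast(rotated, 2.0)

print(rotated)     # [[140, 128, 120], [200, 100, 10]]
print(contrasted)  # [[152, 128, 112], [255, 72, 0]]
```

Note how a strong contrast factor pushes light and dark values to the extremes (255 and 0), which can help separate text from background but can also thicken fine drawing lines.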

    |O'—||ll

    “I
    Z
    _I
    I]

    OneDrive link to a file with the images: http://1drv.ms/1q6qrWy

    My question is: how can I train it so that it recognizes SC instead of SO?

    Next I tried the same steps with another image (the text is at a 45-degree angle).

    Then I rotated it and ran autodetect layout. Unfortunately, it detected the entire image as one block, and the recognition is bad as well.
    //
    9,,

    3/,

    4 0 /
    2......3. \

    3/
    ‘ , 4‘ v
    [9

    My question is, what am I doing wrong? Or what is the right way? Can you please help / suggest?

    Thank you again for this wonderful application.

    Best regards,
    Naqib

     
  • Sandro Mani

    Sandro Mani - 2016-04-01

    Hi

    gImageReader is merely a front-end to the OCR-Engine, tesseract, so for training and accuracy issues you will have to look at the help resources of tesseract itself. The best document to start with is probably

    https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract

    Looking at the images you are trying to recognize, I'd say it is pretty much expected that the default language definitions of tesseract won't be able to give you an accurate result, so yes, you will have to look at training tesseract appropriately by following the steps described in the wiki page above.

    Hope this helps
    Sandro

     
  • naqib quarishi

    naqib quarishi - 2016-04-05

    Dear Mr. Sandro,

    First let me thank you for your response and for this excellent tool that you have built.

    I have a few questions that I would like to ask here.

    Q1. Does the language definition facilitate the Layout detection somehow? You mentioned default language definitions of tesseract won’t give you an accurate result, but will that help Layout detection?

    Q2. The problem is that when I rotate the image first and then load it for layout detection (link below), it still detects the entire image as a single block.

    Here is a sample of what I am talking about.

    https://onedrive.live.com/redir?resid=135ACA0E27D2C575!45719&authkey=!APQQQFDlTRLCbV8&ithint=file%2cdocx

    Q3. When I create the training images, do you think I need to feed in each digit in each specific font, as well as these digits at various angles?

    I really appreciate all your help in this matter.

    Naqib.

     
  • Sandro Mani

    Sandro Mani - 2016-04-05

    Hi Naqib

    1. I'm not really familiar with the tesseract internals, but I assume that the language definitions at least help tesseract guess the writing order of a block of text (i.e. left-to-right, right-to-left, etc.).
    2. Without specific training, I'd say that the many fine lines in the drawing can make it pretty hard for tesseract to figure out what is text and what is not.
    3. Sorry, I have actually never gone through the training procedure, so I can't do much more than refer you to the help resources available on the internet.
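
To give a rough sense of scale for Q3, the sketch below counts how many training samples "each digit in each font at various angles" would imply. The font names and the 45-degree step are hypothetical placeholders, and whether tesseract training actually needs rotated samples is not established in this thread.

```python
from itertools import product

digits = "0123456789"
fonts = ["font_a", "font_b"]   # hypothetical placeholder font names
angles = range(0, 360, 45)     # hypothetical 45-degree steps: 0, 45, ..., 315

# One planned training sample per (digit, font, angle) combination.
samples = list(product(digits, fonts, angles))

print(len(samples))  # 10 digits * 2 fonts * 8 angles = 160
print(samples[0])    # ('0', 'font_a', 0)
```

The count grows multiplicatively, so it may be worth checking the tesseract training documentation before generating rotated samples for every font.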

    Best
    Sandro

     
