
VietOCR

Help
Miao
2014-09-05
2014-09-10
  • Miao

    Miao - 2014-09-05

    Hi,
    There is an image that uses a Unicode font and is italicized, so I cannot convert it to text. Is there any way I can add a new font to VietOCR?

     
  • Quan Nguyen

    Quan Nguyen - 2014-09-05

    Generally, all the standard .traineddata files include italic font style. If need be, you can train Tesseract and add the generated .traineddata to VietOCR's tessdata folder.
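
As a minimal sketch of that last step: adding a generated .traineddata file to the tessdata folder amounts to a file copy. The function name `install_traineddata` and any paths passed to it are hypothetical; the actual location of the tessdata folder depends on where VietOCR is installed.

```python
import shutil
from pathlib import Path


def install_traineddata(traineddata: Path, tessdata_dir: Path) -> Path:
    """Copy a .traineddata file into a tessdata folder and return the new path."""
    if traineddata.suffix != ".traineddata":
        raise ValueError(f"expected a .traineddata file, got {traineddata.name}")
    if not tessdata_dir.is_dir():
        raise FileNotFoundError(f"tessdata folder not found: {tessdata_dir}")
    target = tessdata_dir / traineddata.name
    # copy2 keeps the file's timestamps along with its contents
    shutil.copy2(traineddata, target)
    return target
```

After the copy, the new language data should be available the next time VietOCR starts, assuming it reads the folder at launch.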

     
  • Miao

    Miao - 2014-09-06

    Thank you but I'm confused and really do not understand much about it. Can you give me more details?

     
  • Quan Nguyen

    Quan Nguyen - 2014-09-06

    Please be more specific. What are you confused about? Please attach your image, if possible.

    You can use jTessBoxEditor to assist you with the training.

     
  • Miao

    Miao - 2014-09-09

    I tried to convert this image, but VietOCR does not work.

     
  • Quan Nguyen

    Quan Nguyen - 2014-09-09

    You need to scan your image better: 300 DPI, TIFF or PNG image format, not JPEG.

    And that font is not supported. You'll need to train Tesseract for it. What's the name of the font?
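
The resolution advice above can be checked before running OCR. The following is a rough sketch, using only the Python standard library, that reads the resolution a PNG file declares in its optional pHYs chunk. The helper names `png_dpi` and `meets_scan_advice` are made up for illustration, and the value only reflects what the scanner wrote into the file, not the real quality of the scan.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dpi(data: bytes):
    """Return (x_dpi, y_dpi) from a PNG's pHYs chunk, or None if not declared."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"pHYs" and len(body) == 9:
            x_ppu, y_ppu, unit = struct.unpack(">IIB", body)
            if unit != 1:  # unit 1 = pixels per metre; otherwise only an aspect ratio
                return None
            # 1 inch = 0.0254 m, so pixels/metre * 0.0254 = pixels/inch
            return round(x_ppu * 0.0254), round(y_ppu * 0.0254)
        if chunk_type in (b"IDAT", b"IEND"):
            break  # pHYs, when present, comes before the image data
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
    return None


def meets_scan_advice(data: bytes, minimum_dpi: int = 300) -> bool:
    """True if the PNG declares at least minimum_dpi on both axes."""
    dpi = png_dpi(data)
    return dpi is not None and min(dpi) >= minimum_dpi
```

For example, `meets_scan_advice(open("scan.png", "rb").read())` would report whether a file named scan.png (a placeholder name) declares at least 300 DPI.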

     
  • Miao

    Miao - 2014-09-10

    I'm not sure which font it is. Can you teach me how to train Tesseract for a font?

     
  • Quan Nguyen

    Quan Nguyen - 2014-09-10

    Put in the effort to look around and find out what font it is. It will help you create the TIFF/Box files used in training.

    The training procedure was already mentioned in previous posts -- you'll need to read through it.
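
To illustrate why the font name matters: the Tesseract 3.x training files follow the naming pattern `[lang].[fontname].exp[num].tif` with a matching `.box` file, and the `font_properties` file lists each font name followed by five 0/1 flags (italic, bold, fixed, serif, fraktur). The sketch below only builds those strings. It assumes that 3.x convention, and the font name `timesitalic` is a placeholder, not the font in the attached image.

```python
def training_file_names(lang: str, font: str, page: int = 0):
    """Return the (TIFF, box) file names for one training page."""
    if " " in font:
        raise ValueError("font name must not contain spaces")
    stem = f"{lang}.{font}.exp{page}"
    return f"{stem}.tif", f"{stem}.box"


def font_properties_line(font: str, italic=False, bold=False,
                         fixed=False, serif=False, fraktur=False) -> str:
    """Return one font_properties entry: name, then five 0/1 style flags."""
    if " " in font:
        raise ValueError("font name must not contain spaces")
    flags = (italic, bold, fixed, serif, fraktur)
    return " ".join([font] + [str(int(flag)) for flag in flags])


# "vie" is the Tesseract language code for Vietnamese; the font name is hypothetical.
print(training_file_names("vie", "timesitalic"))
print(font_properties_line("timesitalic", italic=True, serif=True))
```

For an italic font like the one discussed in this thread, the italic flag in `font_properties` would be set to 1.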

     
