Feature request: pass custom options to tesseract

gerlos
2013-05-04
2013-06-20
  • gerlos

    gerlos - 2013-05-04

    Hello,
    Many thanks for your great work on gImageReader! You saved my life!

    I am using your software to digitize historical scientific data that we have in print and now need in spreadsheets. gImageReader is helping me a lot, but it would be even more helpful if we could manually specify (e.g. in the config window) some additional options to pass to tesseract.

    Since I'm scanning tables of digits, I'd like to add the option "-psm 5", as well as "outputbase digits".
    Is it possible to add such a feature?

    So far I have looked in the source and manually added them to the subprocess.Popen command in main.py, but it would be nice if we could do it without such a hack.

    thanks
    gerlos
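
For readers who want to try the same workaround, here is a minimal sketch of what such a command-line call might look like. It is not the actual code from gImageReader's main.py. The function names and the file names are hypothetical, and it assumes a tesseract 3.x binary on the PATH that accepts "-psm" (later releases spell it "--psm") and ships the stock "digits" config file.

```python
import subprocess


def build_tesseract_command(image_path, output_base, psm=5, configs=("digits",)):
    """Build the argument list for a tesseract 3.x command-line call.

    Argument order follows the CLI usage:
    tesseract imagename outputbase [options...] [configfile...]
    """
    cmd = ["tesseract", image_path, output_base, "-psm", str(psm)]
    # Config file names (such as the stock "digits" config) go last.
    cmd.extend(configs)
    return cmd


def run_tesseract(image_path, output_base):
    """Run tesseract and return its exit code (hypothetical helper)."""
    cmd = build_tesseract_command(image_path, output_base)
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    process.communicate()  # wait for tesseract to finish
    return process.returncode


if __name__ == "__main__":
    # "table.png" and "table" are placeholder names for illustration only.
    print(" ".join(build_tesseract_command("table.png", "table")))
```

Keeping the command construction separate from the Popen call makes it easy to check which options are passed without actually running tesseract.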

     
  • Sandro Mani

    Sandro Mani - 2013-06-20

    Hello, and sorry for the very late answer; I must have missed the mail in my mailbox. Since I have rewritten gImageReader to use the tesseract C++ API directly (you can find the current version in the git repo here on the SourceForge page), the program no longer calls tesseract via the command line. However, one alternative I see is to support a config file, as described here [1]. Would that work?

    Best,
    Sandro

    [1] http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version
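
To make the config-file idea concrete, here is a small sketch that writes a tesseract config file roughly equivalent to the options requested above. It assumes the plain "name value" per-line format that tesseract config files use, and the parameter names tessedit_pageseg_mode and tessedit_char_whitelist. The helper names and the output file name are hypothetical, and how gImageReader would load such a file is not settled in this thread.

```python
def render_tesseract_config(params):
    """Render a dict of tesseract parameters as config-file text.

    Each parameter becomes one "name value" line.
    """
    lines = ["{} {}".format(name, value) for name, value in params.items()]
    return "\n".join(lines) + "\n"


def write_tesseract_config(path, params):
    """Write the rendered parameters to the given path (hypothetical helper)."""
    with open(path, "w") as handle:
        handle.write(render_tesseract_config(params))


# Assumed settings for tables of digits: page segmentation mode 5
# and a whitelist restricted to the ten digits.
DIGIT_TABLE_PARAMS = {
    "tessedit_pageseg_mode": 5,
    "tessedit_char_whitelist": "0123456789",
}

if __name__ == "__main__":
    # "digit_tables.cfg" is a placeholder file name.
    write_tesseract_config("digit_tables.cfg", DIGIT_TABLE_PARAMS)
```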

     
