

gocr performance

  • runholen

    I have a question about gocr performance.
    I'm trying to use gocr to automatically obtain invoice information.
    I have been informed that the invoices will all be in Courier12, which
    should closely resemble the OCR-B standard.

    However, when I tried gocr on a test image, I did not get good results.
    I created a test image in Courier12 with the following text:
    This is a test!!!! ??? 1234567890 1 2 3 4 5 6 7 8 9 0
    55.60 57,10 kr abcdefghijklmnopqrstuvwxyz

    gocr gave the following output:
    Thls ls a testl I I I  111 1234567890 1 2 3 4 5 6 7 8 9 O
    55.60 57,10 kr abcdefghl_klmnopqrstu_xyx

    I am quite satisfied with the output for numbers, but with a total of
    5 characters misinterpreted, plus the wrong punctuation symbols, it
    will be hard to do reliable text recognition.
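
    To make the mismatches easier to see, here is a minimal sketch that compares the expected text with the gocr output using Python's standard difflib module. The function name ocr_diffs is only illustrative, and the two strings are the second line of the test text and of the gocr output quoted above.

```python
from difflib import SequenceMatcher


def ocr_diffs(expected, actual):
    """Return (operation, expected_fragment, actual_fragment) for each mismatch."""
    # autojunk=False keeps difflib from ignoring frequently repeated characters.
    matcher = SequenceMatcher(None, expected, actual, autojunk=False)
    return [
        (tag, expected[i1:i2], actual[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Second line of the test text and the corresponding gocr output.
expected = "55.60 57,10 kr abcdefghijklmnopqrstuvwxyz"
actual = "55.60 57,10 kr abcdefghl_klmnopqrstu_xyx"

for tag, want, got in ocr_diffs(expected, actual):
    print(f"{tag}: {want!r} -> {got!r}")
```

    Each printed line shows one stretch where the output differs from the input, which should make it easier to tell which characters are being confused.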

    Can I configure gocr to recognize Courier12 more accurately, and if so, how?

    Runar Holen