OCR Accuracy

2023-05-26
2024-04-09
  • Cliff Naylor

    Cliff Naylor - 2023-05-26

    NAPS2 works great but the OCR could be more accurate. NAPS2 6.1.2 comes with Tesseract 4.0.0b4. Would upgrading to Tesseract 5.3.1 improve OCR accuracy? If yes, how do I get NAPS2 to work with Tesseract 5?

     
  • Ben Olden-Cooligan

    If you download the beta version, it uses Tesseract 5.x, but the improvements are mostly in speed, not accuracy. I'm assuming you've tried both "Fast" and "Best" modes?

     
  • Cliff Naylor

    Cliff Naylor - 2023-05-27

    I have tried both Fast and Best modes, and the results seem comparable. Both modes run words together ("if_anyone" becomes "ifanyone").
    The beta version of Tesseract still uses an old version of eng.traineddata from 2018. Would a newer version of eng.traineddata help with accuracy?

     
  • shortski

    shortski - 2024-03-22

    I regularly use NAPS2 for OCR and it usually meets my needs well. When it recently did poorly on a screenshot, I compared it with ABBYY's OCR, which comes with a free PDF editor I use. ABBYY's OCR gave me much better results. The image was a screenshot of some programming code, so not your usual scanned document, but the difference in quality was pretty obvious. Is this kind of difference to be expected? And could ABBYY's OCR be a good fit for NAPS2?

    UPDATE
    I initially thought NAPS2 was missing the first characters on each line because the text selector overlay looked off. But when I copied the text, it turned out all characters were captured. In fact, the OCR results from NAPS2 actually had fewer errors compared to ABBYY's, showing NAPS2 does a solid job after all. 🙂
    Running OCR on programming code is always a challenge btw, since it is full of special characters that make no sense to the engine, or most people for that matter.

     

    Last edit: shortski 2024-03-22
  • Ben Olden-Cooligan

    NAPS2 7.4.1 should have improved text alignment, which will help with words running together.
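
The comparisons in this thread (Fast vs. Best, NAPS2 vs. ABBYY, old vs. new eng.traineddata) were all judged by eye. For anyone who wants a number instead, here is a minimal sketch in plain Python. It assumes you have a hand-typed reference transcription of the page and the text copied out of the OCR'd PDF. The function names are made up for illustration and are not part of NAPS2 or Tesseract.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr: str) -> float:
    """Edit distance divided by reference length (lower is better)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, ocr) / len(reference)


def find_merged_words(reference: str, ocr: str, max_run: int = 3):
    """List runs of 2..max_run adjacent reference words that show up
    in the OCR text as a single token with the spaces dropped."""
    ref = reference.split()
    ocr_tokens = set(ocr.split())
    merged = []
    for i in range(len(ref)):
        for n in range(2, max_run + 1):
            if i + n > len(ref):
                break
            joined = "".join(ref[i:i + n])
            if joined in ocr_tokens:
                merged.append((" ".join(ref[i:i + n]), joined))
    return merged


if __name__ == "__main__":
    reference = "ask if anyone knows"
    ocr = "ask ifanyone knows"
    print(character_error_rate(reference, ocr))
    print(find_merged_words(reference, ocr))
```

Run it once per engine or mode against the same reference text and compare the error rates. The merged-word check is a rough heuristic: it can report a false positive when the joined form is itself a real word in the text (for example "in to" and "into").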

     
