Ability to edit/correct OCR'ed text before saving PDF would be awesome!

2018-02-01
2018-02-15
  • Mark Besonen

    Mark Besonen - 2018-02-01

    I've been using the OCR portion of NAPS2 recently for some of my scanned docs, and in general, it works great. But once in a while--usually due to a poor original--the OCR'ed text contains quite a few errors. What would be wonderful is if there were a way to edit/correct the OCR'ed text before it gets merged into the final PDF.

    I recognize that NAPS2 is meant to be as simple as possible so even if it were possible to add a step for correction of OCR'ed text, maybe it does not fit into the grand scheme of things? But I won't know unless I ask so here's to hoping that it can be done!

     
  • Alan Jones

    Alan Jones - 2018-02-02

    I don't think you are going to get PDF editing out of NAPS2. The author has done a great job with features, but that seems out of scope. I personally have struggled to find good open source software to edit PDFs. LibreOffice can do it sometimes for short PDFs, but I still hit walls, and it will choke on long ones.

    Scribus 1.6 looks like it will have many improvements in that area, but I'm not sure when it will come out. Development on it has really stalled.

    Anyone with any other good PDF Editing tools?

     
  • Mark Besonen

    Mark Besonen - 2018-02-11

    Thanks, Alan, for your comments. I did not realize that LibreOffice could edit the OCR'ed text within a PDF file--thanks for mentioning this! I'll give it a shot, and see how things go. But like you indicated, I can imagine this is good for mostly short documents, and not large ones. For certain, a nice, open-source PDF editing tool would be wonderful!

     
  • Alan Jones

    Alan Jones - 2018-02-15

    Mark, in my own recent tests of a few odd documents, there may be gaps in the OCR, and that may lead to some interesting "chunks" of text and no text (just an image of the text). LibreOffice is OK for PDFs, but often struggles with large ones. If you or someone else finds a better tool, please share.

     
