PDF manuals become extremely large after using NAPS2console with OCR (Tesseract) enabled
Hi,
after running OCR (Tesseract) with NAPS2, some PDF manuals that are only a few megabytes as input come out extremely large, at 1-2 gigabytes!
This problem does not occur with PDF24, which also uses Tesseract for OCR. So I copied the Tesseract files from PDF24 into the NAPS2 components folder, replacing the Tesseract version there. The problem still exists, so it is not caused by the Tesseract version but by NAPS2 itself.
Addition: I just found out that the problem also occurs when just converting to PDF/A. So it is a general problem with input PDF files that have already been OCRed.
The problem occurs in the NAPS2 GUI as well as in NAPS2console. Is there a solution? I would prefer one that works with NAPS2console.
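In case it helps with reproducing this, here is a minimal Python sketch for putting a number on the growth. It only compares file sizes on disk and does not call NAPS2 or Tesseract; the function names and the file names in the usage note are placeholders I made up for illustration.

```python
import os


def size_ratio(input_pdf: str, output_pdf: str) -> float:
    """Return the output file size divided by the input file size."""
    in_size = os.path.getsize(input_pdf)
    out_size = os.path.getsize(output_pdf)
    if in_size == 0:
        raise ValueError("input file is empty, ratio is undefined")
    return out_size / in_size


def format_report(input_pdf: str, output_pdf: str) -> str:
    """Build a one-line summary such as 'manual.pdf: 3.2 MB -> 1400.0 MB (x437.5)'."""
    mb = 1024 * 1024  # bytes per megabyte (binary convention)
    in_mb = os.path.getsize(input_pdf) / mb
    out_mb = os.path.getsize(output_pdf) / mb
    ratio = size_ratio(input_pdf, output_pdf)
    name = os.path.basename(input_pdf)
    return f"{name}: {in_mb:.1f} MB -> {out_mb:.1f} MB (x{ratio:.1f})"
```

For example, `print(format_report("manual_in.pdf", "manual_out.pdf"))` would print the sizes before and after conversion together with the growth factor, assuming both (hypothetical) files exist.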
Thanks!
Last edit: snugg 2021-01-09