OCR for first page in multi-page PDF file

2021-12-27
2021-12-30
  • Alfonso Vizcaino

    Hello

    When using PDF files with multiple pages, is there a way to specify which page I want to OCR?

    Thanks

     

    Last edit: Alfonso Vizcaino 2021-12-27
  • Quan Nguyen

    Quan Nguyen - 2021-12-30

    No. The program will convert the input PDF to a multi-page TIFF image.

    What you can do is preprocess the PDF before the OCR step: for example, use PDFBox to extract the specified page, convert that page to an image, and send the image to the Tesseract engine.
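
A minimal sketch of that preprocessing approach, under stated assumptions: it uses the PDFBox 2.x API (`PDDocument.load`, `PDFRenderer.renderImageWithDPI`) and assumes Tess4J as the Java wrapper for the Tesseract engine. The class name `SinglePageOcr`, the helper names, the 300 DPI setting, the `eng` language, and the `input.pdf` and `tessdata` paths are all illustrative and not from the thread.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

public class SinglePageOcr {

    // Converts a 1-based page number to the 0-based index PDFBox expects,
    // rejecting page numbers outside the document.
    static int toPageIndex(int pageNumber, int pageCount) {
        if (pageNumber < 1 || pageNumber > pageCount) {
            throw new IllegalArgumentException(
                "Page " + pageNumber + " is outside 1.." + pageCount);
        }
        return pageNumber - 1;
    }

    // Renders one page of the PDF to an image and runs OCR on that image only.
    static String ocrPage(File pdf, int pageNumber, String tessdataPath)
            throws IOException, TesseractException {
        BufferedImage pageImage;
        // PDFBox 2.x loading call (assumption: PDFBox 3.x loads documents differently).
        try (PDDocument document = PDDocument.load(pdf)) {
            int pageIndex = toPageIndex(pageNumber, document.getNumberOfPages());
            PDFRenderer renderer = new PDFRenderer(document);
            // 300 DPI grayscale is an assumed setting, not a value from the thread.
            pageImage = renderer.renderImageWithDPI(pageIndex, 300, ImageType.GRAY);
        }

        Tesseract tesseract = new Tesseract();
        tesseract.setDatapath(tessdataPath); // folder holding the *.traineddata files
        tesseract.setLanguage("eng");        // assumed language
        return tesseract.doOCR(pageImage);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical paths: OCR only the first page of input.pdf.
        System.out.println(ocrPage(new File("input.pdf"), 1, "tessdata"));
    }
}
```

Rendering the chosen page directly covers both the "extract a page" and "convert to an image" steps in one call, so the remaining pages are never converted or passed to Tesseract.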

     
