OCRmyPDF

OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.

Features

Generates a searchable PDF/A file from a regular PDF
Places OCR text accurately below the image to ease copy / paste
Keeps the exact resolution of the original embedded images
When possible, inserts OCR information as a "lossless" operation without disrupting any other content
Optimizes PDF images, often producing files smaller than the input file
If requested, deskews and/or cleans the image before performing OCR
Distributes work across all available CPU cores
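
The features above map onto command-line options of the ocrmypdf tool. As a minimal sketch, assuming ocrmypdf is installed and on the PATH, the helper below assembles a command that requests PDF/A output, optional deskewing and cleaning, and a worker count. The function names build_ocrmypdf_command and run_ocr are hypothetical illustrations and not part of OCRmyPDF; the file names are placeholders.

```python
import shutil
import subprocess
from typing import List, Optional


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    deskew: bool = False,
    clean: bool = False,
    jobs: Optional[int] = None,
) -> List[str]:
    """Assemble an ocrmypdf command line (hypothetical helper, not part of OCRmyPDF)."""
    # Request PDF/A output explicitly, matching the first feature in the list.
    cmd = ["ocrmypdf", "--output-type", "pdfa"]
    if deskew:
        # Straighten crooked pages before OCR.
        cmd.append("--deskew")
    if clean:
        # Clean pages before OCR; this option relies on the external unpaper tool.
        cmd.append("--clean")
    if jobs is not None:
        # Limit the number of parallel workers; when omitted, ocrmypdf chooses.
        cmd += ["--jobs", str(jobs)]
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocr(cmd: List[str]) -> int:
    """Run the assembled command and return its exit code."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("ocrmypdf is not installed or not on PATH")
    return subprocess.run(cmd, check=False).returncode
```

Building the argument list separately from running it keeps the options easy to inspect before any file is touched. For example, build_ocrmypdf_command("scan.pdf", "searchable.pdf", deskew=True, jobs=4) yields a list that can be passed to run_ocr.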

License

Mozilla Public License 2.0 (MPL-2.0)

Additional Project Details

Programming Language

Python

Related Categories

Python PDF Software, Python OCR Software

Registered

2023-11-17
