pdf to text free download

Showing 6 open source projects for "pdf to text"

1

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files

OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.

Downloads: 115 This Week

Last Update: 2026-03-21
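
As a rough illustration of the workflow described above, the sketch below builds and optionally runs an `ocrmypdf` command line from Python. It assumes the `ocrmypdf` command-line tool is installed and on the PATH. The helper names (`build_ocrmypdf_command`, `add_text_layer`) and the file names are hypothetical and not part of the project; `-l` and `--deskew` are the tool's language and deskew options.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng",),
    deskew: bool = False,
) -> List[str]:
    """Build an argument list for the ocrmypdf CLI (hypothetical helper)."""
    command = ["ocrmypdf"]
    if languages:
        # Tesseract-style language codes are joined with "+", e.g. "eng+deu".
        command += ["-l", "+".join(languages)]
    if deskew:
        # Ask ocrmypdf to straighten skewed pages before running OCR.
        command.append("--deskew")
    command += [input_pdf, output_pdf]
    return command


def add_text_layer(input_pdf: str, output_pdf: str) -> bool:
    """Run ocrmypdf if it is installed; return True only on success."""
    if shutil.which("ocrmypdf") is None:
        return False  # assumption: treat a missing tool as a soft failure
    result = subprocess.run(build_ocrmypdf_command(input_pdf, output_pdf))
    return result.returncode == 0
```

For example, `build_ocrmypdf_command("scan.pdf", "searchable.pdf", languages=("eng", "deu"), deskew=True)` produces the arguments for an English and German OCR pass with deskewing. Keeping command construction separate from execution makes the arguments easy to inspect before anything is run.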
2

Umi-OCR

OCR software, free and offline

Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines. The software supports flexible usage patterns including screenshot capture OCR, batch processing of large sets of images or documents, PDF parsing, QR code detection, and layout-aware paragraph output. ...

Downloads: 45 This Week

Last Update: 2026-01-15
3

Papermerge

Open Source Document Management System for Digital Archives

...Each user can be assigned different permissions to perform only a specific kind of action, e.g. viewing only documents from a specific folder. OCR technology is a vital part of Papermerge. It extracts text from scanned documents and from PDF, JPEG, and TIFF files.

Downloads: 9 This Week

Last Update: 2025-07-24
4

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...

Downloads: 5 This Week

Last Update: 2026-02-03
5

Paperless-ng

A supercharged version of paperless, scan, index and archive docs

Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...

Downloads: 0 This Week

Last Update: 2022-03-04
6

Linux-Intelligent-Ocr-Solution

Easy-OCR solution and Tesseract trainer for GNU/Linux

Linux-Intelligent-Ocr-Solution (Lios) is free and open source software for converting print into text using either a scanner or a camera. It can also produce text from scanned images from other sources, such as PDFs, images, folders containing images, or screenshots. The program is fully accessible to visually impaired users. A Tesseract Trainer GUI is also shipped with this package. Forum: https://groups.google.com/forum/#!forum/lios Video Tutorial: https://www.youtube.com/playlist?...

5 Reviews

Downloads: 18 This Week

Last Update: 2020-10-19