OCR software, free and offline
OCRmyPDF adds an OCR text layer to scanned PDF files
A high-quality tool for converting PDF to Markdown and JSON
Visual Causal Flow
Contexts Optical Compression
Accurate × Fast × Comprehensive
Get your documents ready for gen AI
Awesome multilingual OCR toolkits based on PaddlePaddle
Open Source Document Management System for Digital Archives
Enhances Tesseract OCR output using LLMs (local or API)
A Repo For Document AI
Library for OCR-related tasks powered by Deep Learning
Ready-to-use OCR with 80+ supported languages
A community-supported supercharged version of paperless
Document content and metadata extraction microservice
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Multilingual Document Layout Parsing in a Single Vision-Language Model
Convert AI papers to GUI
A framework to enable multimodal models to operate a computer
OpenRecall is a fully open-source, privacy-first alternative
A Python application to add watermarks (text or image) to PDF files
OCR model for complex documents with layout-aware structured outputs
Structured data extraction and instruction calling with ML, LLM
A high-quality PDF-to-Markdown tool based on a large language model
AI-powered tool for efficient abstract and PDF screening