OCR software, free and offline
Accurate × Fast × Comprehensive
Contexts Optical Compression
PDF to Markdown with vision models
Visual Causal Flow
Formula recognition based on LaTeX-OCR and ONNXRuntime
OCRmyPDF adds an OCR text layer to scanned PDF files
Awesome multilingual OCR toolkits based on PaddlePaddle
Enhances Tesseract OCR output using LLMs (local or API)
Library for OCR-related tasks powered by Deep Learning
A high-quality tool for converting PDF to Markdown and JSON
Ready-to-use OCR with 80+ supported languages
Multilingual Document Layout Parsing in a Single Vision-Language Model
Advanced language and coding AI model
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
PDF scientific paper translation with preserved formats
Convert AI papers to GUI
Math OCR model that outputs LaTeX and markdown
Get your documents ready for gen AI
A framework to enable multimodal models to operate a computer
Open Source Document Management System for Digital Archives
OCR model for complex documents with layout-aware structured outputs
OpenRecall is a fully open-source, privacy-first alternative
A simple tool for reading in poorly redacted documents
Document content and metadata extraction microservice