OCR software, free and offline
Contexts Optical Compression
PDF to Markdown with vision models
OCRmyPDF adds an OCR text layer to scanned PDF files
Formula recognition based on LaTeX-OCR and ONNXRuntime
Awesome multilingual OCR toolkits based on PaddlePaddle
OCR expert VLM powered by Hunyuan's native multimodal architecture
Ready-to-use OCR with 80+ supported languages
Library for OCR-related tasks powered by Deep Learning
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
A high-quality tool for converting PDF to Markdown and JSON
Convert AI papers to GUI
PDF scientific paper translation with preserved formats
Open Source Document Management System for Digital Archives
A framework to enable multimodal models to operate a computer
Math OCR model that outputs LaTeX and Markdown
A Repo For Document AI
A community-supported supercharged version of paperless
Vision utilities for web interaction agents
Qwen3-Omni is a natively end-to-end, omni-modal LLM
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
A Python application to add watermarks (text or image) to PDF files
FaceOnLive Open KYC: Streamlining Identity Verification with AI
Implementation of Nougat Neural Optical Understanding