OCR software, free and offline
Contexts Optical Compression
Accurate × Fast × Comprehensive
Visual Causal Flow
OCRmyPDF adds an OCR text layer to scanned PDF files
Enhances Tesseract OCR output using LLMs (local or API)
Awesome multilingual OCR toolkits based on PaddlePaddle
OCR expert VLM powered by Hunyuan's native multimodal architecture
A high-quality tool for converting PDF to Markdown and JSON
Multilingual Document Layout Parsing in a Single Vision-Language Model
Convert AI papers to GUI
A framework to enable multimodal models to operate a computer
Get your documents ready for gen AI
Open Source Document Management System for Digital Archives
OpenRecall is a fully open-source, privacy-first alternative
A Repo For Document AI
OCR model for complex documents with layout-aware structured outputs
Document content and metadata extraction microservice
Structured data extraction and instruction calling with ML, LLM
A community-supported supercharged version of paperless
AI tool for automating desktop tasks via natural language input
Handwritten Text Recognition (HTR) system implemented with TensorFlow
In-depth tutorials on LLMs, RAGs and real-world AI agent applications
An on-premises, OCR-free unstructured data extraction
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming