A high-quality tool for converting PDF to Markdown and JSON
Get your documents ready for gen AI
Multilingual Document Layout Parsing in a Single Vision-Language Model
On-premises, OCR-free unstructured data extraction
Contexts Optical Compression
Open source semantic search and text analytics for large document sets
A Repo For Document AI
OCR software, free and offline
Canvas-based WYSIWYG rich text editor with advanced layout tools
OCR model for complex documents with layout-aware structured outputs
Map location picker component for Android
The SILE Typesetter — Simon’s Improved Layout Engine
Assist in organizing your piles of documents
Enhances Tesseract OCR output using LLMs (local or API)
Library for OCR-related tasks powered by Deep Learning
Accurate × Fast × Comprehensive
Open-source Python 3 tool for recognizing layouts, tables, and math
R Markdown Résumés and CVs
OCR expert VLM powered by Hunyuan's native multimodal architecture
Extract and convert data from any document: images, PDFs, Word docs
PDF Parser for AI-ready data. Automate PDF accessibility
Collabora Online is a collaborative online office suite
CLI tool to extract (meta)data from PDF and manipulate PDF files
Video translation and dubbing tool powered by LLMs
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning