A high-quality tool for converting PDF to Markdown and JSON
An Open-Source Toolkit for General-OCR Research and Applications
Get your documents ready for gen AI
Multilingual Document Layout Parsing in a Single Vision-Language Model
OCR software, free and offline
An on-premises, OCR-free unstructured data extraction
Contexts Optical Compression
Open source semantic search and text analytics for large document sets
A Repo For Document AI
Canvas-based WYSIWYG rich text editor with advanced layout tools
Library for OCR-related tasks powered by Deep Learning
Enhances Tesseract OCR output using LLMs (local or API)
Map location picker component for Android
OCR model for complex documents with layout-aware structured outputs
The SILE Typesetter — Simon’s Improved Layout Engine
Assist in organizing your piles of documents
Accurate × Fast × Comprehensive
Open-Source Python3 tool for recognizing layouts, tables, and math
OCR expert VLM powered by Hunyuan's native multimodal architecture
Video translation and dubbing tool powered by LLMs
Extract and convert data from any document, images, PDFs, Word doc
Collabora Online is a collaborative online office suite
CLI tool to extract (meta)data from PDF and manipulate PDF files
PDF Parser for AI-ready data. Automate PDF accessibility
R Markdown Résumés and CVs