Contexts Optical Compression
Accurate × Fast × Comprehensive
PDF to Markdown with vision models
Visual Causal Flow
Awesome multilingual OCR toolkits based on PaddlePaddle
Convert AI papers to GUI
A framework to enable multimodal models to operate a computer
In-depth tutorials on LLMs, RAGs and real-world AI agent applications
Enhances Tesseract OCR output using LLMs (local or API)
OCR expert VLM powered by Hunyuan's native multimodal architecture
PDF scientific paper translation with preserved formats
Screenshots, word marking, OCR, AI, translation software
Get your documents ready for gen AI
Use LLMs and LLM Vision (OCR) to handle paperless-ngx
PDF Parser for AI-ready data. Automate PDF accessibility
Readest is a modern, feature-rich ebook reader
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
A Repo For Document AI
OpenRecall is a fully open-source, privacy-first alternative
A simple tool for reading in poorly redacted documents
Streamline your life using PromptingTools.jl
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
Document content and metadata extraction microservice
A self-hostable bookmark-everything app
Doctor Dok is an AI-based medical data framework