A community-supported supercharged version of paperless
Open Source Document Management System for Digital Archives
Python tool for converting files and office documents to Markdown
An open-source RAG-based tool for chatting with your documents
Interact with your documents using the power of GPT
Generate audiobooks from EPUBs, PDFs and text with captions
A full spaCy pipeline and models for scientific/biomedical documents
An AI personal assistant for your digital brain
A Repo For Document AI
Library for OCR-related tasks powered by Deep Learning
ktrain is a Python library that makes deep learning and AI more accessible
The ChatGPT Retrieval Plugin lets you easily find personal documents
Python scraper based on AI
ContextGem: Effortless LLM extraction from documents
OCRmyPDF adds an OCR text layer to scanned PDF files
Open source libraries and APIs to build custom preprocessing pipelines
File Parser optimised for LLM Ingestion with no loss
Contexts Optical Compression
Haystack is an open source NLP framework to interact with your data
Qwen3 is the large language model series developed by the Qwen team
Build AI-powered semantic search applications
Neural Search
OCR expert VLM powered by Hunyuan's native multimodal architecture
A Telegram RSS bot that cares about your reading experience
Tongyi Deep Research, the Leading Open-source Deep Research Agent