AI-powered document analysis and tagging for Paperless-ngx
Document (PDF, Word, PPTX, ...) extraction and parsing API
A Repo For Document AI
A Model Context Protocol (MCP) server implementation
A high-quality tool for converting PDF to Markdown and JSON
Get your documents ready for gen AI
Text mining using tidy tools
ExtractThinker is a Document Intelligence library for LLMs
Open source semantic search and text analytics for large document sets
A system for agentic LLM-powered data processing and ETL
An open source collaborative multi-agent OS
Document content and metadata extraction microservice
Application implementation with business use cases
LongBench v2 and LongBench (ACL '25 & '24)
Unified framework for building enterprise RAG pipelines
Autonomous agents for everyone
Chat with your documents using local AI
Full-stack Open-source Self-Evolving General AI Agent
Question and Answer based on Anything
Multi-tool for semantic search
Research-oriented chatbot framework
Your fully private, open-source, on-device AI assistant
Semantic search and workflows for medical/scientific papers
Public repository for Agent Skills
Extract and convert data from any document: images, PDFs, Word docs