A Repo For Document AI
AI-powered document analysis and tagging for Paperless-ngx
Document (PDF, Word, PPTX ...) extraction and parse API
Get your documents ready for gen AI
A high-quality tool for converting PDF to Markdown and JSON
Open source semantic search and text analytics for large document sets
Text mining using tidy tools
A Model Context Protocol (MCP) server implementation
ExtractThinker is a Document Intelligence library for LLMs
PHP low-level client for Elasticsearch
Document content and metadata extraction microservice
Private chat with a local GPT over documents, images, video, etc.
RAG-Anything: All-in-One RAG Framework
LongBench v2 and LongBench (ACL '25 & '24)
A system for agentic LLM-powered data processing and ETL
Autonomous agents for everyone
Multi-tool for semantic search
Full-stack Open-source Self-Evolving General AI Agent
Chat with your documents using local AI
Open-Source Financial Large Language Models
Public repository for Agent Skills
Clean network diagrams, one-time setup, zero upkeep
Research-oriented chatbot framework
Optimized Workforce Learning for General Multi-Agent Assistance
Extract and convert data from any document: images, PDFs, Word docs