AI-powered document analysis and tagging for Paperless-ngx
Document (PDF, Word, PPTX, ...) extraction and parsing API
A Repo For Document AI
A Model Context Protocol (MCP) server implementation
A high-quality tool for converting PDF to Markdown and JSON
Get your documents ready for gen AI
ExtractThinker is a Document Intelligence library for LLMs
A system for agentic LLM-powered data processing and ETL
Document content and metadata extraction microservice
LongBench v2 and LongBench (ACL '25 & '24)
Unified framework for building enterprise RAG pipelines
Chat with your documents using local AI
Question and Answer based on Anything
Multi-tool for semantic search
Research-oriented chatbot framework
Semantic search and workflows for medical/scientific papers
Public repository for Agent Skills
Private chat with a local GPT over documents, images, video, etc.
Topic Modelling for Humans
ContextGem: Effortless LLM extraction from documents
Running large language models on a single GPU
Open-Source Financial Large Language Models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Open source healthcare AI
Open source NLP guide with models, methods, and real use cases