A system for agentic LLM-powered data processing and ETL
Multi-tool for semantic search
Automate the management and configuration of infrastructure at scale
Unified framework for building enterprise RAG pipelines
The official implementation of RAPTOR
Code repository for PDFStitcher, a utility to stitch together PDFs
Chinese version of Google open source project style guide
ID-based RAG FastAPI: Integration with LangChain and PostgreSQL
Generate audiobooks from EPUBs, PDFs and text with captions
Accurate × Fast × Comprehensive
A community-supported supercharged version of paperless
DeepCode: Open Agentic Coding
Python bindings for MuPDF's rendering library
Fully featured framework for fast, easy and documented API development
Edit PDF files with Nano Banana
Library for OCR-related tasks powered by Deep Learning
Public repository for Agent Skills
BISHENG is an open LLM DevOps platform for next-generation apps
Document Index for Vectorless, Reasoning-based RAG
LongBench v2 and LongBench (ACL '25 & '24)
A Heterogeneous Benchmark for Information Retrieval
Interact with your documents using the power of GPT
Research-oriented chatbot framework
A Python SOAP client
ExtractThinker is a Document Intelligence library for LLMs