Minimal Python framework for fast, scalable AI inference servers
High-performance inference server and API layer for text embedding models
Mastering Applied AI, One Concept at a Time
Ready-to-run cloud templates for RAG
AI-Powered Wiki Generator for GitHub/GitLab/Bitbucket Repositories
Making RAG Simpler with Small and Open-Sourced Language Models
SimpleMem: Efficient Lifelong Memory for LLM Agents
A New Axis of Sparsity for Large Language Models
Knowledge Graph Generation from Any Text
Kimi Code CLI is your next CLI agent
Build production-ready AI agents in both Python and TypeScript
Low-latency AI inference engine optimized for mobile devices
Ship AI Agents to Google Cloud in minutes, not months
AI-powered document analysis and tagging for Paperless-ngx
Local RAG engine for private multimodal knowledge search on devices
A collection of scientific methods, processes, algorithms
Learning to Reason with Search for LLMs via Reinforcement Learning
Traditional Mandarin LLMs for Taiwan
Data Infrastructure providing an approach to multimodal AI workloads
An Efficient Web-enhanced Question Answering System
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
In-depth tutorials on LLMs, RAGs and real-world AI agent applications
Advanced techniques for RAG systems
Get started with building Fullstack Agents using Gemini 2.5 & LangGraph
The collaborative spreadsheet for AI