Research project. A memory solution for users, teams, and applications
SimpleMem: Efficient Lifelong Memory for LLM Agents
ChatGLM-6B: An Open Bilingual Dialogue Language Model
MemoryOS is designed to provide a memory operating system
AirLLM: 70B inference with a single 4GB GPU
Accessible large language models via k-bit quantization for PyTorch
Neural Network architecture based on ideas of the original LSTM
A high-throughput and memory-efficient inference and serving engine
⚡ Building applications with LLMs through composability ⚡
Zep: A long-term memory store for LLM / Chatbot applications
The PHP Agentic Framework to build production-ready AI-driven apps
Run a 1-billion-parameter LLM on a $10 board with 256MB RAM
157 models, 30 providers, one command to find what runs on your hardware
A Simple and Universal Swarm Intelligence Engine
Redundancy-aware KV Cache Compression for Reasoning Models
Claude + Obsidian knowledge companion
Mooncake is the serving platform for Kimi
LangChain for Rust, the easiest way to write LLM-based programs
Real-time NVIDIA GPU dashboard
A Telegram bot for Large Language Models
State-of-the-art Parameter-Efficient Fine-Tuning
Maimaibot, a (more focused) multi-platform intelligent agent
Unified KV Cache Compression Methods for Auto-Regressive Models
A high-performance inference engine for AI models
High-speed Large Language Model Serving for Local Deployment