Mooncake is the serving platform for Kimi
An Easy-to-Use and High-Performance AI Deployment Framework
Alibaba's high-performance LLM inference engine for diverse apps
Production-ready toolkit to run AI locally
A @ClickHouse fork that supports high-performance vector search
UCCL is an efficient communication library for GPUs
Fast Multimodal LLM on Mobile Devices
High-speed Large Language Model Serving for Local Deployment
Locally run an Instruction-Tuned Chat-Style LLM
Implements a reference architecture for creating information systems