Distribute and run LLMs with a single file
Mooncake is the serving platform for Kimi
TT-NN operator library, and TT-Metalium low-level kernel programming
A @ClickHouse fork that supports high-performance vector search
UCCL is an efficient communication library for GPUs
High-speed Large Language Model Serving for Local Deployment
Production-ready toolkit to run AI locally
Implements a reference architecture for creating information systems