Research project. A memory solution for users, teams, and applications
Run a 1-billion parameter LLM on a $10 board with 256MB RAM
LLM inference in C/C++
High-speed Large Language Model Serving for Local Deployment
Open-source large language model family from Tencent Hunyuan
LLM training in simple, raw C/CUDA
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Mooncake is the serving platform for Kimi
TT-NN operator library, and TT-Metalium low-level kernel programming
Alibaba's high-performance LLM inference engine for diverse apps
An Easy-to-Use and High-Performance AI Deployment Framework
Vector database plugin for Postgres, written in Rust