The official repo of Qwen chat & pretrained large language model
A high-throughput and memory-efficient inference and serving engine
Mooncake is the serving platform for Kimi
AI-powered penetration testing assistant using a local LLM on Linux
High-speed Large Language Model Serving for Local Deployment
Real-time NVIDIA GPU dashboard
A Next-Generation Training Engine Built for Ultra-Large MoE Models
Low-code framework for building custom LLMs, neural networks
Fast and efficient unstructured data extraction
All-in-one AI companion! Desktop girlfriend + virtual streamer
MobileLLM: Optimizing Sub-billion Parameter Language Models
High-performance inference framework for large language models
Serving multiple LoRA-finetuned LLMs as one