A high-throughput and memory-efficient inference and serving engine
A lightweight vLLM implementation built from scratch
System Level Intelligent Router for Mixture-of-Models at Cloud
Personal AI, On Personal Devices
Visual Causal Flow
Private Open AI on Kubernetes
A unified library of SOTA model optimization techniques
Moonshot's most powerful AI model
Run a full local LLM stack with one command using Docker
NVIDIA plugin for secure installation of OpenClaw
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Towards Human-Sounding Speech
Accelerate local LLM inference and finetuning
From Vibe Coding to Agentic Engineering
Interface for OuteTTS models
The free, Open Source alternative to OpenAI, Claude and others
Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 600+ LLMs
Advanced language and coding AI model
Qwen3 is the large language model series developed by the Qwen team
Open source AI IDE and Cursor alternative
Accurate × Fast × Comprehensive
Ultra-Efficient LLMs on End Device
A proxy server for multiple Ollama instances with key security
Multilingual Document Layout Parsing in a Single Vision-Language Model
Open-source large language model family from Tencent Hunyuan