Fast Stable Diffusion on CPU and AI PC
Fast inference engine for Transformer models
Port of OpenAI's Whisper model in C/C++
High-speed Large Language Model Serving for Local Deployment
A system monitoring tool that exposes system metrics
Running large language models on a single GPU
Ultra-Efficient AI Assistant in Go
Real-time NVIDIA GPU dashboard
Easy-to-use deep learning framework with 3 key features
LLM inference in C/C++
A high-quality rapid TTS voice cloning model
Claude Code plugin that automatically captures everything Claude does
Python-free Rust inference server
AirLLM: 70B inference with a single 4GB GPU
Fastest, smallest, and fully autonomous AI assistant infrastructure
The best way to use Hermes Agent from the web or from your phone
A modern model graph visualizer and debugger
Building an Intelligent Agent from Scratch
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Accessible large language models via k-bit quantization for PyTorch
A TTS that fits in your CPU (and pocket)
State-of-the-art TTS model under 25MB
A personal AI assistant, easy to install
Fast and accurate AI-powered file content type detection
Official inference framework for 1-bit LLMs