Fast Stable Diffusion on CPU and AI PC
Fast inference engine for Transformer models
Port of OpenAI's Whisper model in C/C++
High-speed Large Language Model Serving for Local Deployment
A system monitoring tool that exposes system metrics
Easy-to-use deep learning framework with 3 key features
Running large language models on a single GPU
Ultra-Efficient AI Assistant in Go
Real-time NVIDIA GPU dashboard
Python-free Rust inference server
A high-quality rapid TTS voice cloning model
Claude Code plugin that automatically captures everything Claude does
AirLLM 70B inference with a single 4GB GPU
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A modern model graph visualizer and debugger
Building an Intelligent Agent from Scratch
The best way to use Hermes Agent from the web or from your phone
A TTS that fits in your CPU (and pocket)
Fastest, smallest, and fully autonomous AI assistant infrastructure
State-of-the-art TTS model under 25MB
Fast and accurate AI-powered file content type detection
Multilingual Automatic Speech Recognition with word-level timestamps
Accessible large language models via k-bit quantization for PyTorch
MemU is an open-source memory framework for AI companions
Official inference framework for 1-bit LLMs