Fast Stable Diffusion on CPU and AI PC
Fast inference engine for Transformer models
Port of OpenAI's Whisper model in C/C++
High-speed Large Language Model Serving for Local Deployment
A system monitoring tool that exposes system metrics
Running large language models on a single GPU
Easy-to-use deep learning framework with 3 key features
Ultra-Efficient AI Assistant in Go
Real-time NVIDIA GPU dashboard
LLM inference in C/C++
Claude Code plugin that automatically captures everything Claude does
A high-quality rapid TTS voice cloning model
Python-free Rust inference server
AirLLM: 70B inference with a single 4GB GPU
Fastest, smallest, and fully autonomous AI assistant infrastructure
The best way to use Hermes Agent from the web or from your phone
A modern model graph visualizer and debugger
Building an Intelligent Agent from Scratch
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A TTS that fits in your CPU (and pocket)
Accessible large language models via k-bit quantization for PyTorch
MemU is an open-source memory framework for AI companions
A personal AI assistant, easy to install
Official inference framework for 1-bit LLMs
Fast and accurate AI-powered file content type detection