Fast Stable Diffusion on CPU and AI PC
Running large language models on a single GPU
A high-quality rapid TTS voice cloning model
AirLLM 70B inference with a single 4GB GPU
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Building an Intelligent Agent from Scratch
The best way to use Hermes Agent from the web or from your phone
A TTS that fits in your CPU (and pocket)
State-of-the-art TTS model under 25MB
Fast and accurate AI-powered file content type detection
Multilingual Automatic Speech Recognition with word-level timestamps
Accessible large language models via k-bit quantization for PyTorch
MemU is an open-source memory framework for AI companions
Official inference framework for 1-bit LLMs
A personal AI assistant, easy to install
Unified KV Cache Compression Methods for Auto-Regressive Models
A lightweight, powerful framework for multi-agent workflows
Neural network architecture based on ideas from the original LSTM
Unified web UI for training and running open models locally
Open-source large language model family from Tencent Hunyuan
Redundancy-aware KV Cache Compression for Reasoning Models
Low-latency AI inference engine optimized for mobile devices
AI Agent Source Code Deep Research Report
ChatGLM2-6B: An Open Bilingual Chat LLM
Faster Whisper transcription with CTranslate2