Instant voice cloning by MIT and MyShell. Audio foundation model
A Family of Open-Source Music Foundation Models
Official inference repo for FLUX.2 models
Interface for OuteTTS models
Reference implementations of MLPerf™ training benchmarks
High-performance neural network inference framework for mobile
SOTA Open Source TTS
A lightweight text-to-speech model with zero-shot voice cloning
Taming Stable Diffusion for Lip Sync
GUI for a Vocal Remover that uses Deep Neural Networks
High-Resolution Image Synthesis with Latent Diffusion Models
Multi-lingual large voice generation model, providing inference
ONNX Runtime: cross-platform, high performance ML inferencing
A high-quality rapid TTS voice cloning model
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
As little as 1 minute of voice data can be used to train a good TTS model
Ultra-Efficient AI Assistant in Go
Free and Open Source AI Image Upscaler for Linux, macOS and Windows
MARS5 speech model (TTS) from CAMB.AI
Cross-platform .NET wrapper to the OpenCV image processing library
Towards Human-Level Text-to-Speech through Style Diffusion
A sound cloning tool with a web interface, using your voice
Flutter-based cross-platform app integrating major AI models
Free, local, open-source Cowork for Gemini CLI, Claude Code, Codex
Python inference and LoRA trainer package for the LTX-2 audio–video