Instant voice cloning by MIT and MyShell. Audio foundation model
A Family of Open-Source Music Foundation Models
Official inference repo for FLUX.2 models
Interface for OuteTTS models
SOTA Open Source TTS
A lightweight text-to-speech model with zero-shot voice cloning
Reference implementations of MLPerf™ training benchmarks
Taming Stable Diffusion for Lip Sync
GUI for a Vocal Remover that uses Deep Neural Networks
High-Resolution Image Synthesis with Latent Diffusion Models
Multi-lingual large voice generation model, providing inference
As little as 1 minute of voice data can be used to train a good TTS model
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A high-quality rapid TTS voice cloning model
MARS5 speech model (TTS) from CAMB.AI
Towards Human-Level Text-to-Speech through Style Diffusion
A sound cloning tool with a web interface, using your voice
Python inference and LoRA trainer package for the LTX-2 audio–video
Run Local LLMs on Any Device. Open-source
The official Meta Llama 3 GitHub site
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Specification and documentation for Agent Skills
Implementation of Vision Transformer, a simple way to achieve SOTA
Set of tools to assess and improve LLM security