Instant voice cloning by MIT and MyShell. Audio foundation model
Interface for OuteTTS models
MARS5 speech model (TTS) from CAMB.AI
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
A Python package to analyze and compare voices with deep learning
Dia-1.6B generates lifelike English dialogue and vocal expressions