ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Long-form streaming TTS system for multi-speaker dialogue generation
Interface for OuteTTS models
Open-source multi-speaker long-form text-to-speech model
MOSS-TTS Family: open-source speech and sound generation models
A generative speech model for daily dialogue
High-Quality Voice Cloning TTS for 600+ Languages
One-click deployment (including offline integration package)
Instant voice cloning by MIT and MyShell. Audio foundation model
MARS5 speech model (TTS) from CAMB.AI
Foundational model for human-like, expressive TTS
Towards Human-Level Text-to-Speech through Style Diffusion
Best practice TTS based on BERT and VITS
Open source implementation of Microsoft's VALL-E X zero-shot TTS model
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Conditional Variational Autoencoder with Adversarial Learning
A Python package to analyze and compare voices with deep learning
TensorFlow implementation of DC-TTS: yet another text-to-speech model
Dia-1.6B generates lifelike English dialogue and vocal expressions