Towards Human-Level Text-to-Speech through Style Diffusion
Industrial-level controllable zero-shot text-to-speech system
A text-to-speech, speech-to-text and speech-to-speech library
Interface for OuteTTS models
Real-time voice interactive digital human
One-click deployment (including offline integration package)
A TTS model capable of generating ultra-realistic dialogue
Open source implementation of Microsoft's VALL-E X zero-shot TTS model
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis