VITS2 backbone with multilingual-bert
A high-quality rapid TTS voice cloning model
Towards Human-Sounding Speech
Toolkit for conversational AI
A fast TTS architecture with conditional flow matching
Scalable generative AI framework built for researchers and developers
Toolkit for audio, music, and speech generation
SOTA discrete acoustic codec models with 40/75 tokens per second
Best-practice TTS based on BERT and VITS
Implementation of a Transformer-based neural network
Conditional Variational Autoencoder with Adversarial Learning
TensorFlow Implementation of DC-TTS: yet another text-to-speech model