SOTA Open Source TTS
State-of-the-art TTS model under 25MB
Industrial-level controllable zero-shot text-to-speech system
Towards Human-Sounding Speech
Toolkit for conversational AI
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
SOTA discrete acoustic codec models with 40/75 tokens per second
Framework for building neural networks
StreamSpeech is a seamless model for offline speech recognition
Controllable and fast Text-to-Speech for over 7000 languages
Virtual AI anchor that combines state-of-the-art technology
Offline desktop app to convert EPUB to MP3 using Kokoro-82M neural TTS
Towards Human-Level Text-to-Speech through Style Diffusion
Toolkit for audio, music, and speech generation
Unofficial Parallel WaveGAN
Real-Time State-of-the-art Speech Synthesis for TensorFlow 2