VITS2 backbone with multilingual-bert
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Controllable & emotion-expressive zero-shot TTS
Towards Human-Level Text-to-Speech through Style Diffusion
High-quality multi-lingual text-to-speech library by MyShell.ai
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Toolkit for conversational AI
Industrial-level controllable zero-shot text-to-speech system
A text-to-speech, speech-to-text and speech-to-speech library
A fast TTS architecture with conditional flow matching
One-click deployment (including offline integration package)
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Scalable generative AI framework built for researchers and developers
The official Python SDK for the ElevenLabs API
Instant voice cloning by MIT and MyShell. Audio foundation model
Multi-lingual large voice generation model, providing inference
Virtual AI anchor that combines state-of-the-art technology
Converts text to speech in real time
Framework for building neural networks
SOTA discrete acoustic codec models with 40/75 tokens per second
An Open Source text-to-speech system built by inverting Whisper
Interface for OuteTTS models
Multi-Voice and Prompt-Controlled TTS Engine
Unofficial Parallel WaveGAN
Best-practice TTS based on BERT and VITS