A sound cloning tool with a web interface, using your voice
Towards Human-Sounding Speech
Python library and CLI tool to interface with Google Translate
Build Vision Agents quickly with any model or video provider
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Qwen3-TTS is an open-source series of TTS models
A lightweight text-to-speech model with zero-shot voice cloning
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
State-of-the-art TTS model under 25MB
Generate audiobooks from e-books
StreamSpeech is a seamless model for offline speech recognition
An open-source text-to-speech system built by inverting Whisper
Multi-lingual large voice generation model, providing inference
A simple native web interface that uses ChatTTS to synthesize text into speech
Controllable and fast Text-to-Speech for over 7000 languages
Foundational model for human-like, expressive TTS
Official MiniMax Model Context Protocol (MCP) server
Toolkit for audio, music, and speech generation
Towards Human-Level Text-to-Speech through Style Diffusion
High-quality multi-lingual text-to-speech library by MyShell.ai
A Conversational Speech Generation Model
Open-source implementation of Microsoft's VALL-E X zero-shot TTS model
Unofficial Parallel WaveGAN
A web UI for different audio-related neural networks
WaveRNN Vocoder + TTS