A text-to-speech, speech-to-text and speech-to-speech library
Generate audiobooks from EPUBs, PDFs and text with captions
Toolkit for audio, music, and speech generation
Synchronized Translation for Videos
A nearly-live implementation of OpenAI's Whisper
Generate audiobooks from e-books, with voice cloning and support for 1107+ languages
Free, high-quality text-to-speech API endpoint to replace OpenAI
SOTA discrete acoustic codec models with 40/75 tokens per second
Interface for OuteTTS models
Comprehensive Gradio WebUI for audio processing
Instant voice cloning by MIT and MyShell. Audio foundation model
SOTA Open Source TTS
MARS5 speech model (TTS) from CAMB.AI
A high-quality rapid TTS voice cloning model
Towards Human-Sounding Speech
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Use Microsoft Edge's online text-to-speech service from Python
A voice cloning tool with a web interface, using your own voice
The official Python SDK for the ElevenLabs API
High-quality multi-lingual text-to-speech library by MyShell.ai
One-click deployment (including offline integration package)
A TTS model capable of generating ultra-realistic dialogue
Automatically translates the text of a video based on a subtitle file
A lightweight text-to-speech model with zero-shot voice cloning
Controllable & emotion-expressive zero-shot TTS