A text-to-speech, speech-to-text and speech-to-speech library
Generate audiobooks from EPUBs, PDFs and text with captions
Generate audiobooks from e-books, voice cloning & 1107+ languages
Tokenizer-Free TTS for Multilingual Speech Generation
A nearly live implementation of OpenAI's Whisper
Synchronized Translation for Videos
Comprehensive Gradio WebUI for audio processing
Instant voice cloning by MIT and MyShell. Audio foundation model
SOTA Open Source TTS
SOTA discrete acoustic codec models with 40/75 tokens per second
Free, high-quality text-to-speech API endpoint to replace OpenAI
Interface for OuteTTS models
The official Python SDK for the ElevenLabs API
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
EPUB to audiobook converter, optimized for Audiobookshelf
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Use Microsoft Edge's online text-to-speech service from Python
Offline text-to-speech synthesis for Python
Converts text to speech in real time
One-click deployment (including offline integration package)
Controllable & emotion-expressive zero-shot TTS
A fast TTS architecture with conditional flow matching
MARS5 speech model (TTS) from CAMB.AI
Towards Human-Sounding Speech
Automatically translates the text of a video based on a subtitle file