Build Vision Agents quickly with any model or video provider
TTS with Kokoro and ONNX Runtime
Qwen3-TTS is an open-source series of TTS models
A simple, high-quality voice conversion tool focused on ease of use
Speech-AI-Forge is a project developed around TTS generation models
A TTS that fits in your CPU (and pocket)
Synchronized Translation for Videos
Comprehensive Gradio WebUI for audio processing
Offline text-to-speech synthesis for Python
SOTA Open Source TTS
Readest is a modern, feature-rich ebook reader
Instant voice cloning by MIT and MyShell. Audio foundation model
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
High-Quality Voice Cloning TTS for 600+ Languages
The open-source voice synthesis studio powered by Qwen3-TTS
A nearly-live implementation of OpenAI's Whisper
Spark-TTS Inference Code
Offline inference engine for art, real-time voice conversations
One-click deployment (including offline integration package)
Free, high-quality text-to-speech API endpoint to replace OpenAI
An open-source text-to-speech system built by inverting Whisper
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
A text-to-speech, speech-to-text and speech-to-speech library
Like the macOS say command, but with a modern voice
A single Gradio + React WebUI with extensions for ACE-Step