A sound cloning tool with a web interface, using your voice
A nearly-live implementation of OpenAI's Whisper
A simple native web interface that uses ChatTTS to synthesize text
Offline Text To Speech synthesis for Python
VITS2 backbone with multilingual-bert
Offline inference engine for art, real-time voice conversations
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Toolkit for conversational AI
A TTS that fits in your CPU (and pocket)
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Converts text to speech in real time
An Open Source text-to-speech system built by inverting Whisper
The official Python SDK for the ElevenLabs API
Generate audiobooks from e-books
State-of-the-art TTS model under 25MB
Build Vision Agents quickly with any model or video provider
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Management of Yandex Station and other smart home devices
A fast TTS architecture with conditional flow matching
Towards Human-Sounding Speech
Foundational model for human-like, expressive TTS
StreamSpeech is a seamless model for offline speech recognition
End-to-end speech processing toolkit
Instant voice cloning by MIT and MyShell. Audio foundation model
Multi-lingual large voice generation model, providing inference