Generate audiobooks from e-books, voice cloning & 1107+ languages
A high-quality rapid TTS voice cloning model
Synchronized Translation for Videos
Generate audiobooks from e-books
Qwen3-TTS is an open-source series of TTS models
State-of-the-art TTS model under 25MB
SOTA Open Source TTS
TTS with Kokoro and ONNX Runtime
Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model
Build Vision Agents quickly with any model or video provider
End-to-end speech processing toolkit
Foundational model for human-like, expressive TTS
Scalable generative AI framework built for researchers and developers
An open-source text-to-speech system built by inverting Whisper
Towards Human-Sounding Speech
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Multi-lingual large voice generation model, providing inference
Long-form streaming TTS system for multi-speaker dialogue generation
Spark-TTS Inference Code
A lightweight text-to-speech model with zero-shot voice cloning
Framework for building neural networks
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Controllable & emotion-expressive zero-shot TTS
Controllable and fast Text-to-Speech for over 7000 languages
A TTS model capable of generating ultra-realistic dialogue