A text-to-speech, speech-to-text and speech-to-speech library
Open-source framework for intelligent speech interaction
LLM-based reinforcement learning audio editing model
The official Python library for the Fish Audio API
Tokenizer-Free TTS for Multilingual Speech Generation
Instant voice cloning by MIT and MyShell. Audio foundation model
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
TTS model capable of streaming conversational audio in realtime
MOSS-TTS-Nano is an open-source, multilingual, tiny speech generation model
A nearly live implementation of OpenAI's Whisper
Open-source multi-speaker long-form text-to-speech model
SOTA Open Source TTS
Capable of understanding text, audio, vision, and video
MARS5 speech model (TTS) from CAMB.AI
Generate audiobooks from e-books
MOSS-TTS Family: open-source speech and sound generation models
High-Quality Voice Cloning TTS for 600+ Languages
Interface for OuteTTS models
One-click deployment (including offline integration package)
A TTS model capable of generating ultra-realistic dialogue
Towards Human-Sounding Speech
An open-source text-to-speech system built by inverting Whisper
Controllable & emotion-expressive zero-shot TTS
A voice cloning tool with a web interface that uses your own voice
Industrial-level controllable zero-shot text-to-speech system