Speech-AI-Forge is a project developed around TTS generation models
MOSS-TTS-Nano is an open-source multilingual tiny speech generation model
State-of-the-art TTS model under 25MB
Qwen3-TTS is an open-source series of TTS models
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
GLM-4-Voice | End-to-End Chinese-English Conversational Model
FAIR Sequence Modeling Toolkit 2
Industrial-level controllable zero-shot text-to-speech system
Open-source multi-speaker long-form text-to-speech model
MOSS-TTS Family: open-source speech and sound generation models
SoTA open-source TTS
Long-form streaming TTS system for multi-speaker dialogue generation
Controllable & emotion-expressive zero-shot TTS
LLM-based reinforcement-learning audio editing model
Open-source framework for intelligent speech interaction
Capable of understanding text, audio, vision, and video
A Conversational Speech Generation Model
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Dia-1.6B generates lifelike English dialogue and vocal expressions