Speech-AI-Forge is a project developed around TTS generation models
Qwen3-TTS is an open-source series of TTS models
MOSS-TTS-Nano is an open-source multilingual tiny speech generation
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Industrial-level controllable zero-shot text-to-speech system
State-of-the-art TTS model under 25MB
Open-source multi-speaker long-form text-to-speech model
FAIR Sequence Modeling Toolkit 2
SoTA open-source TTS
LLM-based Reinforcement Learning audio edit model
MOSS-TTS Family: open-source speech and sound generation models
Long-form streaming TTS system for multi-speaker dialogue generation
Controllable & emotion-expressive zero-shot TTS
NeuTTS model built from small LLM backbones
On-device TTS model by Neuphonic
Open-source framework for intelligent speech interaction
Capable of understanding text, audio, vision, and video
A Conversational Speech Generation Model
AI powered speech denoising and enhancement
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Dia-1.6B generates lifelike English dialogue and vocal expressions