Multi-lingual large voice generation model, providing inference
SOTA Open Source TTS
TTS model capable of streaming conversational audio in real time
Instant voice cloning by MIT and MyShell. Audio foundation model
An Open Source text-to-speech system built by inverting Whisper
MOSS-TTS Family open-source speech and sound generation model
Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model
Long-form streaming TTS system for multi-speaker dialogue generation
A TTS model capable of generating ultra-realistic dialogue
LLM-based reinforcement learning audio editing model
Multimodal AI storyteller, built with Stable Diffusion, GPT, etc.
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Conditional Variational Autoencoder with Adversarial Learning