Industrial-level controllable zero-shot text-to-speech system
Open-source framework for intelligent speech interaction
Controllable & emotion-expressive zero-shot TTS
A GPT-4o-Level MLLM for Vision, Speech and Multimodal Live Streaming
Long-form streaming TTS system for multi-speaker dialogue generation
Dia-1.6B generates lifelike English dialogue and vocal expressions