Minimalistic audiobook player
Comprehensive Gradio WebUI for audio processing
A voice cloning tool with a web interface that uses your own voice
Fast and accurate automatic speech recognition (ASR) for edge devices
MEGA Android App
Speech to Text to Speech, sends text as OSC messages
GLM-4-Voice | End-to-End Chinese-English Conversational Model
The open-source voice synthesis studio powered by Qwen3-TTS
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Even 1 minute of voice data can be used to train a good TTS model
A simple, high-quality voice conversion tool focused on ease of use
Instant voice cloning by MIT and MyShell. Audio foundation model
In-App assistant SDK to build a multimodal conversational UX for websites
A fan port of Cave Story for the Sega Mega Drive
Conversational voice AI agents
Telegram Desktop messaging app
Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP
Build voice-based LLM agents. Modular + open source
High-Quality Voice Cloning TTS for 600+ Languages
Real-time voice interactive digital human
In-App assistant SDK to build a multimodal conversational UX for iOS
Assistant SDK to build a multimodal conversational UX for Android
On-device wake word detection powered by deep learning
The behavior guidance framework for customer-facing LLM agents
Qwen3-TTS is an open-source series of TTS models