Minimalistic audiobook player
Comprehensive Gradio WebUI for audio processing
A voice cloning tool with a web interface that uses your own voice
Fast and accurate automatic speech recognition (ASR) for edge devices
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Speech-to-text-to-speech tool that sends text as OSC messages
The open-source voice synthesis studio powered by Qwen3-TTS
Clone a voice in 5 seconds to generate arbitrary speech in real-time
A simple, high-quality voice conversion tool focused on ease of use
Just 1 minute of voice data can be used to train a good TTS model
In-App assistant SDK to build a multimodal conversational UX for websites
Instant voice cloning by MIT and MyShell. Audio foundation model
Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP
Telegram Desktop messaging app
Build voice-based LLM agents. Modular + open source
High-Quality Voice Cloning TTS for 600+ Languages
Real-time voice interactive digital human
In-App assistant SDK to build a multimodal conversational UX for iOS
On-device wake word detection powered by deep learning
Conversational voice AI agents
Assistant SDK to build a multimodal conversational UX for Android
The behavior guidance framework for customer-facing LLM agents
Qwen3-TTS is an open-source series of TTS models
A high-quality rapid TTS voice cloning model
Adds support for Yandex Smart Home (Alice voice assistant)