Minimalistic audiobook player
Comprehensive Gradio WebUI for audio processing
A sound cloning tool with a web interface, using your voice
Fast and accurate automatic speech recognition (ASR) for edge devices
Speech to Text to Speech, sends text as OSC messages
GLM-4-Voice | End-to-End Chinese-English Conversational Model
The open-source voice synthesis studio powered by Qwen3-TTS
Clone a voice in 5 seconds to generate arbitrary speech in real-time
A simple, high-quality voice conversion tool focused on ease of use
1 min of voice data can also be used to train a good TTS model
In-App assistant SDK to build a multimodal conversational UX for websites
Instant voice cloning by MIT and MyShell. Audio foundation model
Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP
Build voice-based LLM agents. Modular + open source
Telegram Desktop messaging app
High-Quality Voice Cloning TTS for 600+ Languages
Qwen3-TTS is an open-source series of TTS models
In-App assistant SDK to build a multimodal conversational UX for iOS
Assistant SDK to build a multimodal conversational UX for Android
Conversational voice AI agents
On-device wake word detection powered by deep learning
Open source AI VTuber platform with voice chat and Live2D avatars
Adds support for Yandex Smart Home (Alice voice assistant)
Framework for building real-time voice and multimodal AI agents
A high-quality rapid TTS voice cloning model