Qwen3-ASR is an open-source series of ASR models
Omnilingual ASR: Open-Source Multilingual Speech Recognition
kaldi-asr/kaldi is the official location of the Kaldi project
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Bailing is a voice dialogue robot similar to GPT-4o
StreamSpeech is a seamless model for offline speech recognition
Audio foundation model excelling in audio understanding
Real-time voice interactive digital human
Video translation and dubbing tool powered by LLMs
Port of OpenAI's Whisper model in C/C++
Easy-to-use Speech Toolkit including Self-Supervised Learning model
End-to-end speech processing toolkit
Open-source framework for intelligent speech interaction
Speech-AI-Forge is a project developed around TTS generation model
Scalable generative AI framework built for researchers and developers
Repo of Qwen2-Audio chat & pretrained large audio language model
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
Conversational voice AI agents
Fast and accurate automatic speech recognition (ASR) for edge devices
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Open source AI VTuber platform with voice chat and Live2D avatars
Speech Note Linux app. Note taking, reading and translating
A library for audio and music analysis, feature extraction
In-app assistant SDK to build a multimodal conversational UX for websites
Framework for building AI-powered interactive digital humans and agents