Multilingual speech recognition and audio understanding model
Audio foundation model excelling in audio understanding
Offline speech recognition API for Android, iOS, Raspberry Pi
SOTA Open Source TTS
Repository for the Qwen2-Audio chat and pretrained large audio-language models
Open speech-to-speech models and pipelines by Hugging Face
Robust Speech Recognition via Large-Scale Weak Supervision
Speech-to-text, text-to-speech, and speaker recognition
Open-source framework for intelligent speech interaction
Speech recognition module for Python
A lightweight face recognition and facial attribute analysis library
LLM-based audio editing model trained with reinforcement learning
Open-source industrial-grade ASR models
kaldi-asr/kaldi is the official location of the Kaldi project
Speech recognition for your site
Captcha solver extension for humans
Fast and accurate automatic speech recognition (ASR) for edge devices
Towards Human-Sounding Speech
On-device Speech Recognition for Apple Silicon
A free, open source, and extensible speech-to-text application
Port of OpenAI's Whisper model in C/C++
Multilingual Automatic Speech Recognition with word-level timestamps
A PyTorch-based Speech Toolkit
Cross-platform AI language practice app
Tokenizer-Free TTS for Multilingual Speech Generation