Robust Speech Recognition via Large-Scale Weak Supervision
Speech-to-text, text-to-speech, and speaker recognition
kaldi-asr/kaldi is the official location of the Kaldi project
Audio foundation model excelling in audio understanding
A PyTorch-based Speech Toolkit
Captcha solver extension for humans
Fast and accurate automatic speech recognition (ASR) for edge devices
A free, open source, and extensible speech-to-text application
Port of OpenAI's Whisper model in C/C++
Cross-platform AI language practice app
Multilingual Automatic Speech Recognition with word-level timestamps
StreamSpeech is a seamless model for offline speech recognition
Voice Recognition to Text Tool
Toolkit for conversational AI
OpenVINO™ Toolkit repository
Underthesea - Vietnamese NLP Toolkit
Run local LLMs like llama, deepseek, kokoro, etc. inside your browser
A cross-platform software for text translation and recognition
Repo of Qwen2-Audio chat & pretrained large audio language model
Speech to Text to Speech, sends text as OSC messages
Training data (data labeling, annotation, workflow) for all data types
Translate the video from one language to another and embed dubbing
Capable of understanding text, audio, vision, video
The behavior guidance framework for customer-facing LLM agents
Omnilingual ASR: Open-Source Multilingual Speech Recognition