Open speech-to-speech models and pipelines by Hugging Face
Robust Speech Recognition via Large-Scale Weak Supervision
Speech recognition module for Python
Multilingual speech recognition and audio understanding model
Open-source industrial-grade ASR models
kaldi-asr/kaldi is the official location of the Kaldi project
Audio foundation model excelling in audio understanding
A PyTorch-based Speech Toolkit
Multilingual Automatic Speech Recognition with word-level timestamps
StreamSpeech is a seamless model for offline speech recognition
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Toolkit for conversational AI
Repo of Qwen2-Audio chat & pretrained large audio language model
Fast multimodal LLM for real-time voice interaction and AI apps
Voice Recognition to Text Tool
Underthesea - Vietnamese NLP Toolkit
Training data (data labeling, annotation, workflow) for all data types
Translate a video from one language to another and embed the dubbing
Replace OpenAI GPT with another LLM in your app
End-to-end speech processing toolkit
Persian NLP Toolkit
Omnilingual ASR: Open-Source Multilingual Speech Recognition
Capable of understanding text, audio, vision, video
The behavior guidance framework for customer-facing LLM agents
Open source AI VTuber platform with voice chat and Live2D avatars