Open speech-to-speech models and pipelines by Hugging Face
Speech Note Linux app. Note taking, reading and translating
Robust Speech Recognition via Large-Scale Weak Supervision
End-to-end speech processing toolkit
Toolkit for conversational AI
A free, open source, and extensible speech-to-text application
Automatic Speech Recognition with Word-level Timestamps
Comprehensive Gradio WebUI for audio processing
Underthesea - Vietnamese NLP Toolkit
OpenVINO™ Toolkit repository
Persian NLP Toolkit
Stanford CoreNLP, a Java suite of core NLP tools
Generate audiobooks from EPUBs, PDFs and text with captions
Faster Whisper transcription with CTranslate2
Han Language Processing
Open Source Speech Language Model
Fast multimodal LLM for real-time voice interaction and AI apps
Use Microsoft Edge's online text-to-speech service from Python
Open-source multi-speaker long-form text-to-speech model
Audio foundation model excelling in audio understanding
AI-powered tool for generating, optimizing, and translating subtitles
Training data (data labeling, annotation, workflow) for all data types
Voice Recognition to Text Tool
Video translation and dubbing tool powered by LLMs
Run local models like Llama, DeepSeek, Kokoro, etc. inside your browser