Audio Plugin for Audio to MIDI transcription using deep learning
A text-to-speech, speech-to-text and speech-to-speech library
Automatic Speech Recognition with Word-level Timestamps
Self-hosted AI audio transcription
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Fast and accurate automatic speech recognition (ASR) for edge devices
A private, local meeting notes assistant
A free, open-source, and extensible speech-to-text application
Generate blog articles from video or audio
An opinionated CLI to transcribe audio files with Whisper on-device
A Family of Open Sourced Music Foundation Models
Open Source AI Dictation App
A Web UI for easy subtitle generation using the Whisper model
A lightweight audio-to-MIDI converter with pitch bend detection
Comprehensive Gradio WebUI for audio processing
Synchronized Translation for Videos
A nearly-live implementation of OpenAI's Whisper
Qwen3-ASR is an open-source series of ASR models
Multilingual speech recognition and audio understanding model
Give Claude the ability to watch and understand videos
The official .NET library for the OpenAI API
Convert files and web content into clean, usable Markdown easily
Voice Recognition to Text Tool
Privacy-first AI meeting assistant with 4x faster Parakeet/Whisper
AI tool converting video/audio into structured documents instantly