Audio foundation model excelling in audio understanding
Large Audio Language Model built for natural interactions
Multi-modal large language model designed for audio understanding
GUI for a Vocal Remover that uses Deep Neural Networks
Python library for audio and music analysis
Audio Normalization for Python/ffmpeg
Python Audio Analysis Library: Feature Extraction, Classification
Audiocraft is a library for audio processing and generation
A lightning fast audio upsampler
Robust Speech Recognition via Large-Scale Weak Supervision
Automatic subtitle synchronization tool
AI tool converting video/audio into structured documents instantly
A Python module to download Twitter Spaces
Data manipulation and transformation for audio signal processing
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Open speech-to-speech models and pipelines by Hugging Face
Download videos from almost any website
Comprehensive Gradio WebUI for audio processing
Decentralized, self-hosted cloud gaming/application service
Open-source multi-speaker long-form text-to-speech model
AudioMuse-AI is an Open Source Dockerized environment
Videomass is a free, open source and cross-platform GUI for FFmpeg
Translate the video from one language to another and embed dubbing
Automatic Speech Recognition with Word-level Timestamps
An opinionated CLI to transcribe audio files with Whisper on-device