Search Results for "audio source separation"

Showing 8171 open source projects for "audio source separation"

1

Ultimate Vocal Remover (UVR5)

GUI for a Vocal Remover that uses Deep Neural Networks

This application uses state-of-the-art source separation models to remove vocals from audio files. UVR's core developers trained all of the models provided in this package (except for the Demucs v3 and v4 4-stem models).

Downloads: 846 This Week

Last Update: 2025-01-20
See Project
2

OpenVINO AI Plugins for Audacity

A set of AI-enabled effects, generators, and analyzers for Audacity

A set of AI-enabled effects, generators, and analyzers for Audacity. These AI features run 100% locally on your PC; no internet connection is necessary. OpenVINO™ is used to run AI models on supported accelerators found on the user's system, such as CPU, GPU, and NPU.

Downloads: 123 This Week

Last Update: 2024-12-20
See Project
3

Step-Audio

Open-source framework for intelligent speech interaction

Step-Audio is a unified, open-source framework aimed at building intelligent speech systems that combine both comprehension and generation: it integrates large language models (LLMs) with speech input/output to handle not only semantic understanding but also rich vocal characteristics like tone, style, dialect, emotion, and prosody. The design moves beyond traditional separate-component pipelines (ASR → text model → TTS), instead offering a multimodal model that ingests speech or audio and produces speech accordingly, enabling natural dialogue, voice cloning, and expressive speech synthesis. ...

Downloads: 4 This Week

Last Update: 2026-03-16
See Project
4

MLX-Audio

A text-to-speech, speech-to-text and speech-to-speech library

MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI...

Downloads: 4 This Week

Last Update: 2026-03-30
See Project
5

Librosa

Python library for audio and music analysis

Librosa is a powerful Python library for analyzing and processing audio and music signals. Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.

Downloads: 5 This Week

Last Update: 2025-07-03
See Project
6

Kimi-Audio

Audio foundation model excelling in audio understanding

Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. ...

Downloads: 1 This Week

Last Update: 2026-01-27
See Project
7

Qwen2-Audio

Repo of Qwen2-Audio chat & pretrained large audio language model

Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice-only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...

Downloads: 0 This Week

Last Update: 2025-09-23
See Project
8

Qwen-Audio

Chat & pretrained large audio language model proposed by Alibaba Cloud

Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and output text. There is also an instruction-tuned version called Qwen-Audio-Chat, which supports multi-round conversational interaction, audio + text input, creative tasks, and reasoning over audio. It uses multi-task training over more than 30 different audio tasks, and achieves strong multi-benchmark performance...

Downloads: 3 This Week

Last Update: 2025-09-23
See Project
9

Fun Audio Chat

Large Audio Language Model built for natural interactions

Fun Audio Chat is an interactive voice-first conversational AI platform designed to let users engage in natural spoken dialogue with large language models in real time, turning speech into context-aware responses while maintaining a smooth back-and-forth experience. It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. The system...

Downloads: 1 This Week

Last Update: 2026-02-27
See Project
10

Step-Audio 2

Multi-modal large language model designed for audio understanding

Step-Audio2 is an advanced, end-to-end multimodal large language model designed for high-fidelity audio understanding and natural speech conversation: unlike many pipelines that separate speech recognition, processing, and synthesis, Step-Audio2 processes raw audio, reasons about semantic and paralinguistic content (like emotion, speaker characteristics, non-verbal cues), and can generate contextually appropriate responses — including potentially generating or transforming audio output. It...

Downloads: 0 This Week

Last Update: 2026-03-16
See Project
11

Audio Priority Bar

A native macOS menu bar app for managing audio device priorities

Audio Priority Bar is a lightweight macOS utility that gives users precise control over how audio output is prioritized across different apps and devices, filling a gap in the system audio stack that Apple doesn’t natively expose. Once installed, it places an always-accessible control in the menu bar that lets you assign priority levels to individual audio sources so that more important sounds (like alerts, calls, or music) can override or duck less important ones (like background noise or...

Downloads: 0 This Week

Last Update: 2026-02-03
See Project
12

Step-Audio-EditX

LLM-based Reinforcement Learning audio edit model

Step-Audio-EditX is an open-source, 3 billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer) — combining a linguistic token stream and a semantic (prosody/emotion/style) token stream — thereby abstracting audio editing into high-level token operations. ...

Downloads: 0 This Week

Last Update: 2026-04-09
See Project
13

MusicFreePlugins

MusicFreePlayPlugin

The MusicFreePlugins project is a collection and framework for plugins that extend the functionality of the MusicFree ecosystem by providing access to various music sources and features. It defines a standardized interface for plugin development, allowing contributors to implement features such as search, playback, and metadata retrieval. The system is designed to be modular, enabling users to install, update, and manage plugins independently of the core application. It supports multiple...

Downloads: 10 This Week

Last Update: 6 days ago
See Project
14

audioFlux

A library for audio and music analysis, feature extraction

...It can be provided to deep learning networks for training and is used to study various tasks in the audio field such as Classification, Separation, Music Information Retrieval (MIR), ASR, etc.

Downloads: 0 This Week

Last Update: 2024-08-09
See Project
15

Voice-Pro

Comprehensive Gradio WebUI for audio processing

Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It can be installed with one click. It uses a virtual environment created with Miniconda that runs completely separately from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.

1 Review

Downloads: 30 This Week

Last Update: 2025-12-05
See Project
16

OBS Studio

Open source software for live streaming and recording

OBS Studio, also known as Open Broadcaster Software, is a free and open source software program for live streaming and video recording. Features of the software include device/source capture, recording, encoding and broadcasting. Stream on Windows, Mac or Linux. This software is commonly used by video game streamers on the popular streaming platform Twitch.

11 Reviews

Downloads: 254 This Week

Last Update: 1 day ago
See Project
17

Whisper-WebUI

A Web UI for easy subtitle generation using the Whisper model

Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools.

Downloads: 3 This Week

Last Update: 2026-03-18
See Project
18

react-native-audio-recorder-player

react-native native module for audio recorder and player

This is a React Native link module for audio recording and playback. It is not a playlist audio module; it provides simple recorder and player functionality for both the Android and iOS platforms. It only supports the default file extension for each platform. This module can also handle files from URLs.

Downloads: 0 This Week

Last Update: 2025-09-06
See Project
19

eqMac

macOS System-wide audio equalizer & volume mixer

System audio equalizer for macOS. Professional-grade parametric EQ and volume mixer. If you feel like your audio device (headphones or speaker) does not have enough bass (low frequency) punch, or vice versa, you can adjust that using eqMac. macOS does not have a direct way to access the system audio stream, so we use the eqMac audio driver to divert the system audio to the driver's input stream. Then eqMac captures that input audio stream, processes it, and sends it directly to the output...

Downloads: 76 This Week

Last Update: 7 days ago
See Project
20

BlackHole

BlackHole is a modern macOS audio loopback driver

...The driver integrates directly with macOS Core Audio and appears in Audio MIDI Setup and supported audio applications. Designed with performance and stability in mind, BlackHole works on both Intel and Apple Silicon Macs without requiring kernel extensions or system security modifications. As an open-source project, it offers transparency, customization options, and active community-driven development.

Downloads: 79 This Week

Last Update: 2025-02-06
See Project
21

LosslessCut

The Swiss Army knife of lossless video/audio editing

LosslessCut aims to be the ultimate cross platform FFmpeg GUI for extremely fast and lossless operations on video, audio, subtitle and other related media files. The main feature is lossless trimming and cutting of video and audio files, which is great for saving space by rough-cutting your large video files taken from a video camera, GoPro, drone, etc. It lets you quickly extract the good parts from your videos and discard many gigabytes of data without doing a slow re-encode and thereby...

6 Reviews

Downloads: 134 This Week

Last Update: 2026-01-29
See Project
22

Cider App

A new cross-platform Apple Music experience based on Electron and Vue

An open-source, community-oriented Apple Music client for Windows, Linux, macOS, and more. Whether it be Discord, Last.fm, or even equalizers, we've got you covered. Discord & Last.fm Integration. Quickly share and show others what you're listening to, right out of the box. Audio Enhancements. Audio Spatialization, Adrenaline Processor™, and Equalizers are all available and actively engineered by our Audio Engineer, Maikiwi.

Downloads: 107 This Week

Last Update: 2024-05-18
See Project
23

Seal

Video/Audio Downloader for Android, based on yt-dlp

Video/Audio Downloader for Android. Download videos and audio files from video platforms supported by yt-dlp (formerly youtube-dl). UI and logic written with pure Kotlin. Single activity, no fragments, only composable destinations.

Downloads: 174 This Week

Last Update: 2024-10-16
See Project
24

NeuralNote

Audio Plugin for Audio to MIDI transcription using deep learning

NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. ...

Downloads: 62 This Week

Last Update: 2026-03-12
See Project
25

TTS WebUI

A single Gradio + React WebUI with extensions for ACE-Step

TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all...

Downloads: 1 This Week

Last Update: 15 hours ago
See Project