Search Results for "audio source separation" - Page 5

Showing 5991 open source projects for "audio source separation"

1

JellyBox

Native desktop and mobile music client for Jellyfin

A native desktop client for the Jellyfin media library. It provides easy access to your Jellyfin library.

Downloads: 1 This Week

Last Update: 10 hours ago
See Project
2

PolarDB for PostgreSQL

A cloud-native database based on PostgreSQL developed by Alibaba Cloud

PolarDB for PostgreSQL is Alibaba Cloud's cloud-native, distributed version of PostgreSQL designed for high availability, scalability, and performance. It enhances standard PostgreSQL with features like shared storage, compute-storage separation, and parallel processing. PolarDB supports cloud-native workloads, offering enterprise-grade capabilities while maintaining PostgreSQL compatibility.

Downloads: 1 This Week

Last Update: 2026-03-24
See Project
3

Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM

...It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it hits open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. To reduce latency, especially in audio/video streaming, Talker predicts discrete speech codecs via a multi-codebook scheme and replaces heavier diffusion approaches.

Downloads: 2 This Week

Last Update: 2026-01-08
See Project
4

WhisperLive

A nearly-live implementation of OpenAI's Whisper

WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...

Downloads: 15 This Week

Last Update: 2026-03-17
See Project
5

LiveKit

End-to-end stack for WebRTC. SFU media server and SDKs

LiveKit is an open-source project that provides a scalable, multi-user conferencing system based on WebRTC, designed to offer real-time video, audio, and data capabilities for developers.

Downloads: 15 This Week

Last Update: 4 days ago
See Project
6

ebook2audiobook

Generate audiobooks from e-books, voice cloning & 1107+ languages

ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes. The tool supports a wide array of...

Downloads: 29 This Week

Last Update: 18 hours ago
See Project
7

Anki

Anki is a smart spaced repetition flashcard program

Anki is a free, open-source spaced repetition flashcard application designed for efficient long‑term memorization. It supports a wide variety of media types (text, images, audio, LaTeX), advanced scheduling algorithms (SM‑2, FSRS), and extensibility via add‑ons. It’s widely used for education, language learning, medical training, and more.

Downloads: 34 This Week

Last Update: 2025-09-17
See Project
8

abogen

Generate audiobooks from EPUBs, PDFs and text with captions

abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on...

Downloads: 7 This Week

Last Update: 2026-02-06
See Project
9

idonthavespotify

Effortlessly convert Spotify links to your preferred streaming service

Copy a link from your favorite streaming service, paste it into the search bar, and voilà! Links to the track on all other supported platforms are displayed. If the original source is Spotify, you'll even get a quick audio preview to ensure it's the right track.

Downloads: 2 This Week

Last Update: 2026-04-08
See Project
10

PolarDB-X

PolarDB-X is a cloud native distributed SQL Database

PolarDB-X is a cloud-native distributed SQL database designed to handle high concurrency, massive storage, and complex querying scenarios. It features a shared-nothing architecture that decouples computing from storage, providing scalability and flexibility for various applications.

Downloads: 0 This Week

Last Update: 2025-08-22
See Project
11

Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Real-Time Voice Cloning is an influential deep-learning repository that demonstrates how to clone a voice from just a few seconds of audio and then generate arbitrary speech in that voice in near real time. It implements the SV2TTS pipeline (“Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis”) in three stages: a speaker encoder, a synthesizer, and a vocoder. In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that...

Downloads: 15 This Week

Last Update: 2026-03-09
See Project
12

GIN-VUE-ADMIN

Gin-vue-admin is a development platform based on vue and gin

A basic development platform built on vite+vue3+gin (supporting mixed use of TS and JS). It integrates JWT authentication, authority management, dynamic routing, components with controllable visibility, pagination encapsulation, multi-login interception, resource permissions, upload and download, a code generator, a form generator, and other functions needed for development. Gin-vue-admin is a full-stack development platform based on Vue and Gin, with separate front end and back end. It integrates JWT authentication, dynamic routing, dynamic menus, Casbin authorization, a form generator, a code generator, and other functions, and provides multiple sample files so that you can focus more time on business development. It is a set of open-source frameworks with a front-end/back-end separation architecture for rapid development, designed to quickly build small and medium-sized projects.

Downloads: 0 This Week

Last Update: 2026-03-27
See Project
13

Phoniebox

A Raspberry Pi jukebox, playing local music, podcasts, web radio

Phoniebox is a contactless jukebox for the Raspberry Pi that plays audio files, playlists, podcasts, web streams, and Spotify, all triggered by RFID cards. Everything is plug and play via USB, with no soldering iron needed. It also features GPIO button control support.

Downloads: 4 This Week

Last Update: 2026-03-04
See Project
14

Crosvm

The Chrome OS Virtual Machine Monitor

crosvm (ChromeOS Virtual Machine Monitor) is a secure, lightweight virtual machine monitor built on top of the Linux KVM hypervisor. Developed for ChromeOS, it is designed to isolate and execute Linux and Android guests efficiently while maintaining strong security boundaries. Unlike general-purpose emulators like QEMU, crosvm avoids full hardware emulation and focuses on modern paravirtualized I/O using the virtio standard, reducing complexity and attack surface. Written in Rust, it...

Downloads: 16 This Week

Last Update: 2 days ago
See Project
15

Riffusion App

Stable diffusion for real-time music generation (web app)

Riffusion App Hobby is an open-source interactive web application that enables real-time music generation using stable diffusion models adapted for audio synthesis. Unlike traditional music generation tools, it treats audio as spectrogram images and applies diffusion techniques to generate continuous sound transitions, allowing users to create evolving musical loops and compositions.

Downloads: 1 This Week

Last Update: 2026-03-18
See Project
16

MusicFree

A plugin-based, customizable, ad-free, free music player

The MusicFree project is an open-source, plugin-based music player designed for mobile platforms such as Android and HarmonyOS, emphasizing flexibility, customization, and privacy. Unlike traditional music apps, it does not include built-in audio sources but instead relies entirely on plugins to fetch and manage music content. This modular architecture allows users to integrate multiple sources and extend functionality without modifying the core application.

Downloads: 6 This Week

Last Update: 4 days ago
See Project
17

ReClip

Download videos from almost any website

ReClip is a lightweight, self-hosted media downloader that provides a simple web-based interface for downloading videos and audio from a wide range of online platforms. Built around the yt-dlp engine, it supports over a thousand websites, including major platforms like YouTube, TikTok, and Instagram, allowing users to retrieve media content in various formats. The application emphasizes simplicity and minimalism, featuring a clean interface built with plain HTML, CSS, and JavaScript without...

Downloads: 83 This Week

Last Update: 2026-04-09
See Project
18

bfxr

Flash + AIR sound effects generator. Based on Sfxr.

The bfxr project by increpare is a sound-effects generator tool originally built using Flash + AIR, based on the earlier Sfxr project. Its purpose is to enable users, especially game developers and sound designers, to quickly generate retro, 8-bit/“chiptune” style sound effects (“bleeps”, “booms”, “zaps”, etc.) without deep knowledge of audio signal processing. It offers an interactive GUI through which you can tweak many parameters (oscillators, envelopes, filters, etc.) to sculpt custom...

Downloads: 16 This Week

Last Update: 5 days ago
See Project
19

Umami

A simple, fast, website analytics alternative to Google Analytics

Umami is a simple, easy-to-use, self-hosted web analytics solution. The goal is to provide you with a friendlier, privacy-focused alternative to Google Analytics and a free, open-source alternative to paid solutions. Umami collects only the metrics you care about, and everything fits on a single page. Umami measures just the important metrics: pageviews, devices used, and where your visitors are coming from. Everything is displayed on a...

Downloads: 4 This Week

Last Update: 5 days ago
See Project
20

sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.

Downloads: 155 This Week

Last Update: 4 days ago
See Project
21

SoniTranslate

Synchronized Translation for Videos

SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets the generated dub track stay in sync with the original video structure. ...

Downloads: 18 This Week

Last Update: 2025-11-28
See Project
22

Moshi

A speech-text foundation model for real time dialogue

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi compresses 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps in a fully streaming manner (latency of 80 ms, the frame size), yet performs better than existing non-streaming codecs such as SpeechTokenizer (50 Hz, 4 kbps) or SemantiCodec (50 Hz, 1.3 kbps). Moshi models two streams of audio: one corresponds to Moshi, and...

Downloads: 0 This Week

Last Update: 2024-11-05
See Project
23

Kaset

The missing YouTube Music macOS app

Kaset is a social audio platform framework that allows users to host, share, and interact with audio content in community-oriented spaces, combining elements of podcasting, voice rooms, and feedback-driven discovery. It provides an interface where creators can upload episodes, host live or scheduled voice sessions, and cultivate listener communities through comments, reactions, and follow systems. The platform emphasizes audio discovery with playlists, curated channels, and trending audio...

Downloads: 1 This Week

Last Update: 2026-03-28
See Project
24

BizHawk

BizHawk is a multi-system emulator written in C#

A multi-system emulator written in C#. As well as quality-of-life features for casual players, it also has recording/playback and debugging tools, making it the first choice for TASers (Tool-Assisted Speedrunners). Screenshotting and recording audio + video to file. Firmware management, input, framerate, and more in a HUD over the game. Rebindable hotkeys for controlling the frontend (keyboard+mouse+gamepad). A comprehensive input mapper for the emulated gamepads and other peripherals....

Downloads: 53 This Week

Last Update: 2025-09-20
See Project
25

yami

An open-source music player with simple UI

Yami is a lightweight, open-source music player built in Python. It focuses on simplicity and ease of use, providing an intuitive user interface (UI) for users to manage and play their music. Whether you're playing local files or downloading from online sources using spotdl, Yami offers a seamless experience. This project is designed for users who want a minimalistic, cross-platform music player with the ability to integrate external sources like Spotify/YouTube Music.

Downloads: 3 This Week

Last Update: 2025-11-03
See Project