Search Results for "audio source separation" - Page 6


Showing 5991 open source projects for "audio source separation"


1

Decompose

Kotlin Multiplatform lifecycle-aware business logic components

Decompose is a Kotlin Multiplatform library for breaking down your code into tree-structured lifecycle-aware business logic components (aka BLoC), with routing functionality and pluggable UI (Jetpack/Multiplatform Compose, Android Views, SwiftUI, Kotlin/React, etc.).

Downloads: 2 This Week

Last Update: 2026-03-15
See Project
2

Moshi

A speech-text foundation model for real time dialogue

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi compresses 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps in a fully streaming manner (latency of 80 ms, the frame size), yet performs better than existing non-streaming codecs like SpeechTokenizer (50 Hz, 4 kbps) or SemantiCodec (50 Hz, 1.3 kbps). Moshi models two streams of audio: one corresponds to Moshi, and...

Downloads: 0 This Week

Last Update: 2024-11-05
See Project
3

TradingAgents

Chinese Financial Trading Framework Based on Multi-Agent LLM

TradingAgents-CN is a Chinese-enhanced, multi-agent LLM framework aimed at building financial analysis and trading-oriented workflows, with an emphasis on collaboration between specialized agents rather than a single monolithic prompt. It organizes market-related tasks into roles and stages so different agents can contribute research, reasoning, aggregation, and decision support in a structured pipeline. The project is oriented toward practical usage, including a stack that can be run in a...

Downloads: 8 This Week

Last Update: 2026-04-14
See Project
4

Kaset

The missing YouTube Music macOS app

Kaset is a social audio platform framework that allows users to host, share, and interact with audio content in community-oriented spaces, combining elements of podcasting, voice rooms, and feedback-driven discovery. It provides an interface where creators can upload episodes, host live or scheduled voice sessions, and cultivate listener communities through comments, reactions, and follow systems. The platform emphasizes audio discovery with playlists, curated channels, and trending audio...

Downloads: 1 This Week

Last Update: 2026-03-28
See Project
5

DbTpl

Command line tool to generate idiomatic Go code for SQL databases

dbtpl is a Go-based tool that brings templated SQL into your codebase in a clean and structured way. It allows developers to define raw SQL queries in .tpl.sql files and compile them into Go code with strong typing and minimal overhead. By combining SQL’s power with Go’s safety and performance, dbtpl is ideal for teams that want full SQL control without sacrificing maintainability.

Downloads: 0 This Week

Last Update: 2025-06-04
See Project
6

Sherloq

An open source digital image forensic toolset

Sherloq is a research-oriented toolkit designed for digital image forensics, providing an integrated environment to experiment with algorithms for image analysis and tampering detection. Rather than functioning as an automated decision-making system, it serves as a companion tool for researchers, enthusiasts, and students who want to explore forensic techniques from scientific literature and workshops. The project emphasizes transparency and community collaboration, contrasting with...

Downloads: 11 This Week

Last Update: 5 days ago
See Project
7

Handy STT

A free, open source, and extensible speech-to-text application

Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. ...

Downloads: 50 This Week

Last Update: 2026-04-02
See Project
8

Laravel Doctrine ORM

A drop-in Doctrine ORM 2 implementation for Laravel 5+

Doctrine 2 is an object-relational mapper (ORM) for PHP that provides transparent persistence for PHP objects. It uses the Data Mapper pattern at its heart, aiming for a complete separation of your domain/business logic from persistence in a relational database management system. The benefit of Doctrine for the programmer is the ability to focus on object-oriented business logic and worry about persistence only as a secondary problem. This doesn’t mean persistence is downplayed by...

Downloads: 4 This Week

Last Update: 2026-04-12
See Project
9

Bili23 Downloader

Cross platform GUI tool for downloading videos from Bilibili sites

Bili23-Downloader is an open source desktop application designed for downloading video content from the Bilibili platform. It provides a graphical interface that allows users to download various types of media including user-uploaded videos, series episodes, movies, and other hosted content. It focuses on ease of use with a zero-configuration setup, making it accessible to both beginners and experienced users.

Downloads: 13 This Week

Last Update: 2026-04-07
See Project
10

yami

An open-source music player with simple UI

Yami is a lightweight, open-source music player built in Python. It focuses on simplicity and ease of use, providing an intuitive user interface (UI) for users to manage and play their music. Whether you're playing local files or downloading from online sources using spotdl, Yami offers a seamless experience. This project is designed for users who want a minimalistic, cross-platform music player with the ability to integrate external sources like Spotify/YouTube Music.

Downloads: 3 This Week

Last Update: 2025-11-03
See Project
11

Qwen2.5-Omni

Capable of understanding text, audio, vision, video

Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses as both text and natural speech in real-time streaming. It uses a “Thinker-Talker” architecture and introduces innovations for aligning modalities over time (for example, synchronizing video and audio), robust speech generation, and low-VRAM/quantized versions to make usage more accessible. It holds...

Downloads: 1 This Week

Last Update: 2025-09-23
See Project
12

Lidify

Lidify is built for music lovers who want the convenience of streaming

Lidify is a self-hosted, on-demand audio streaming platform that aims to deliver a Spotify-like experience while keeping your music library fully under your control. You point it at your personal collection, and it scans, catalogs, and enriches your library with metadata so browsing feels polished instead of “folder-based.” Beyond basic playback, it leans into discovery with personalized “made for you” mixes and one-click radio modes that generate stations from your own listening history and...

Downloads: 3 This Week

Last Update: 5 days ago
See Project
13

OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model

OpenVoice is a versatile instant voice cloning system that can replicate a speaker’s tone color from just a short audio clip and then generate speech in multiple languages. It is designed not only to match the timbre of the reference voice, but also to give granular control over style parameters such as emotion, accent, rhythm, pauses, and intonation. The model supports cross-lingual and even zero-shot cross-lingual voice cloning, so a speaker recorded in one language can be made to speak...

Downloads: 26 This Week

Last Update: 2025-11-28
See Project
14

Transformers

State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX

Hugging Face Transformers provides APIs and tools to easily download and train state-of-the-art pre-trained models. Using pre-trained models can reduce your compute costs and carbon footprint, and save you the time and resources required to train a model from scratch. These models support common tasks in different modalities. Text, for tasks like text classification, information extraction, question answering, summarization, translation, text generation, in over 100 languages. Images, for tasks...

Downloads: 7 This Week

Last Update: 2026-04-13
See Project
15

JSNES

A JavaScript NES emulator

JSNES is a JavaScript-based emulator that replicates the functionality of the Nintendo Entertainment System (NES), enabling classic games to run directly in web browsers or Node.js environments. It implements the core components of NES hardware, including the CPU, graphics processing unit, and audio system, to deliver an accurate emulation experience. The project is designed as a library, allowing developers to embed emulation capabilities into web applications or custom interfaces. It...

Downloads: 7 This Week

Last Update: 2026-04-12
See Project
16

media-chrome

Custom elements (web components) for making audio and video player

media-chrome is an open source library that provides fully customizable media player controls using native web components, allowing developers to design consistent and flexible audio and video player interfaces across different platforms and frameworks. Instead of relying on default browser controls or proprietary player APIs, Media Chrome introduces a set of reusable custom elements that can be composed using standard HTML, styled with CSS, and integrated into any JavaScript framework including React, Angular, and Svelte. ...

Downloads: 4 This Week

Last Update: 2026-04-13
See Project
17

Lumos Engine

Cross-Platform C++ 2D/3D game engine

Lumos is a cross-platform 2D and 3D game engine written in C++ that supports both OpenGL and Vulkan and runs on Windows, Linux, and macOS. Features include 3D audio using OpenAL, rendering of 3D models with PBR shading, a debug GUI using ImGui, 3D collision detection (cuboid/sphere/pyramid), 2D collision detection via Box2D, and basic Lua scripting support.

Downloads: 8 This Week

Last Update: 2024-09-14
See Project
18

WavTokenizer

SOTA discrete acoustic codec models with 40/75 tokens per second

WavTokenizer is a state-of-the-art discrete acoustic codec designed specifically for audio language modeling, capable of compressing 24 kHz audio into just 40 or 75 tokens per second while preserving high perceptual quality. It is built to represent speech, music, and general audio with extremely low bitrate, making it ideal as a front-end for large audio language models like GPT-4o and similar architectures. The model uses a single-quantizer design together with temporal compression to...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
19

Audiogen Codec

48 kHz stereo neural audio codec for general audio

AGC (Audiogen Codec) is a convolutional autoencoder based on the DAC architecture, which holds the current SOTA. We found that training with EMA and adding a perceptual loss term with CLAP features improved performance. These codecs, being low compression, outperform Meta's EnCodec and DAC on general audio, as validated by internal blind Elo games. We trained (relatively) very low compression codecs in the pursuit of solving a core issue regarding general music and audio generation, low acoustic...

Downloads: 0 This Week

Last Update: 2024-10-02
See Project
20

JUCE

JUCE is an open-source cross-platform C++ application framework

JUCE is an open-source cross-platform C++ application framework for creating high-quality desktop and mobile applications, including VST, VST3, AU, AUv3, RTAS and AAX audio plug-ins. JUCE can be easily integrated with existing projects via CMake, or can be used as a project generation tool via the Projucer, which supports exporting projects for Xcode (macOS and iOS), Visual Studio, Android Studio, Code::Blocks and Linux Makefiles as well as containing a source code editor. ...

Downloads: 18 This Week

Last Update: 2025-12-16
See Project
21

DistroAV

DistroAV (formerly OBS-NDI): NDI integration for OBS Studio

DistroAV (formerly known as OBS-NDI) is an open-source integration plugin for OBS Studio that provides Network Device Interface (NDI) support, so users can send and receive live audio and video over IP networks directly within OBS. By implementing NDI input sources, dedicated output transports, and special filter modes, it allows creativity-oriented workflows such as capturing remote cameras, sharing scenes between machines, or distributing live feeds without capture cards or physical cabling. ...

Downloads: 50 This Week

Last Update: 1 day ago
See Project
22

OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

OpenAI-Compatible Edge-TTS API is a local, OpenAI-compatible text-to-speech API that uses edge-tts—Microsoft Edge’s online TTS service—as the backend. The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to...

Downloads: 3 This Week

Last Update: 2025-11-28
See Project
23

Lyrion Music Server

Server for Squeezebox and compatible players

Lyrion Music Server, formerly SlimServer and better known as Logitech Media Server, is a lightweight, extensible music server that indexes your library and streams audio to hardware players and software clients around the house. It scans folders for music and cover art, builds fast search indexes, and serves rich metadata to web, mobile, and player UIs. Playback can be synchronized across rooms with per-player volume and latency controls, and it supports a wide range of formats via built-in decoders and on-the-fly...

Downloads: 8 This Week

Last Update: 2026-02-17
See Project
24

camofox-browser

Headless browser automation server for AI agents to visit sites

camofox-browser is a headless browser automation server built specifically for AI agents that need to interact with websites that often block standard automation stacks. It wraps Camoufox, a Firefox fork that performs fingerprint spoofing at the C++ level, which means many browser characteristics are altered before page scripts can inspect them, rather than relying on JavaScript-layer stealth patches. The project is designed around a REST API, making it easier for agents and external tools...

Downloads: 10 This Week

Last Update: 3 days ago
See Project
25

Butterchurn

Butterchurn is a WebGL implementation of the Milkdrop Visualizer

Butterchurn is a WebGL-based music visualization engine that recreates the classic MilkDrop visualizer experience entirely in the browser using modern web technologies. It is designed to render complex, real-time audio-reactive graphics that respond dynamically to music input, producing highly immersive and fluid visual effects. The engine uses GPU acceleration through WebGL to achieve high performance, allowing it to handle intricate shader-based visualizations without overwhelming system...

Downloads: 1 This Week

Last Update: 2026-04-08
See Project