Showing 6656 open source projects for "audio linux"

  • 1
    audioFlux

    A library for audio and music analysis, feature extraction

    audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It can be used for deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. It can be provided to deep learning networks for training and is used to...
    Downloads: 3 This Week
  • 2
    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...
    Downloads: 14 This Week
  • 3
    JUCE

    JUCE is an open-source cross-platform C++ application framework

    JUCE is an open-source cross-platform C++ application framework for creating high-quality desktop and mobile applications, including VST, VST3, AU, AUv3, RTAS and AAX audio plug-ins. JUCE projects can be managed either with CMake, which makes it easy to integrate JUCE into existing projects, or with the Projucer, JUCE's own project-configuration tool, which exports projects for Xcode (macOS and iOS), Visual Studio, Android Studio, Code::Blocks and Linux Makefiles and also contains a source code editor. ...
    Downloads: 26 This Week
  • 4
    Lidify

    Lidify is built for music lovers who want the convenience of streaming

    Lidify is a self-hosted, on-demand audio streaming platform that aims to deliver a Spotify-like experience while keeping your music library fully under your control. You point it at your personal collection, and it scans, catalogs, and enriches your library with metadata so browsing feels polished instead of “folder-based.” Beyond basic playback, it leans into discovery with personalized “made for you” mixes and one-click radio modes that generate stations from your own listening history and...
    Downloads: 6 This Week
  • 5
    Phoniebox

    A Raspberry Pi jukebox, playing local music, podcasts, web radio

    Phoniebox is a contactless jukebox for the Raspberry Pi that plays audio files, playlists, podcasts, web streams, and Spotify, all triggered by RFID cards. Everything is plug and play via USB, so no soldering iron is needed. It also supports GPIO button control.
    Downloads: 5 This Week
  • 6
    S&box

    s&box is a modern game engine, built on Valve's Source 2

    S&box is the open-source codebase for s&box, a next-generation sandbox game development platform from the creators of Garry’s Mod that blends modding freedom with modern tooling and performance. Built on a cutting-edge game engine, s&box allows creators to prototype, build, and share interactive game modes, tools, and environments using C#, JavaScript, and visual scripting, promoting accessible content creation for developers of varying skill levels. The platform emphasizes multiplayer and...
    Downloads: 104 This Week
  • 7
    LMSFM Linux

    Musician-oriented Linux distro

    Let's Make Some F*&^in' Music is a USB-based live Linux distro based on Slackware with the intent of providing a comprehensive music recording and production studio using only FOSS.
    Downloads: 0 This Week
  • 8
    Riffusion App

    Stable diffusion for real-time music generation (web app)

    Riffusion App Hobby is an open-source interactive web application that enables real-time music generation using stable diffusion models adapted for audio synthesis. Unlike traditional music generation tools, it treats audio as spectrogram images and applies diffusion techniques to generate continuous sound transitions, allowing users to create evolving musical loops and compositions. The application is built with modern web technologies including Next.js, React, and three.js, providing a...
    Downloads: 2 This Week
  • 9
    Ultravox

    Fast multimodal LLM for real-time voice interaction and AI apps

    Ultravox is an open source multimodal large language model designed specifically for real-time voice-based interactions. It is built to process both text and spoken audio directly, eliminating the need for a separate speech recognition stage and enabling more seamless conversational experiences. Ultravox works by combining text prompts with encoded audio inputs, allowing it to understand spoken language alongside written instructions in a unified pipeline. Internally, it leverages pretrained...
    Downloads: 4 This Week
  • 10
    HunyuanVideo-Avatar

    Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

    HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces,...
    Downloads: 1 This Week
  • 11
    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    Qwen3-Omni is a natively end-to-end multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses in text and natural speech. It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and...
    Downloads: 2 This Week
  • 12
    Umami

    A simple, fast, website analytics alternative to Google Analytics

    Umami is a simple, easy-to-use, self-hosted web analytics solution. The goal is to provide you with a friendlier, privacy-focused alternative to Google Analytics and a free, open-source alternative to paid solutions. Umami measures just the metrics you care about (pageviews, devices used, and where your visitors are coming from), and everything fits on a single page. Everything is displayed on a...
    Downloads: 5 This Week
  • 13
    LatentSync

    Taming Stable Diffusion for Lip Sync

    LatentSync is an open-source framework from ByteDance that produces high-quality lip-synchronization for video by using an audio-conditioned latent diffusion model, bypassing traditional intermediate motion representations. In effect, given a source video (with masked or reference frames) and an audio track, LatentSync directly generates frames whose lip motions and expressions align with the audio, producing convincing talking-head or animated lip-sync output. The system leverages a U-Net...
    Downloads: 1 This Week
  • 14
    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets...
    Downloads: 30 This Week
  • 15
    Anki

    Anki is a smart spaced repetition flashcard program

    Anki is a free, open-source spaced repetition flashcard application designed for efficient long‑term memorization. It supports a wide variety of media types (text, images, audio, LaTeX), advanced scheduling algorithms (SM‑2, FSRS), and extensibility via add‑ons. It’s widely used for education, language learning, medical training, and more.
    Downloads: 33 This Week
  • 16
    Museeks

    A simple, clean and cross-platform music player

    A simple, clean and cross-platform music player. Museeks is on its way to a big rewrite with some major UI changes; please help shape the future of the music player in the discussions section! You will not find tons of features, as its goal is not to compete with more complete and more famous music players. Museeks is currently in development, which means some things can break after an update (database schema changes, config...).
    Downloads: 14 This Week
  • 17
    Bili23 Downloader

    Cross platform GUI tool for downloading videos from Bilibili sites

    Bili23-Downloader is an open source desktop application designed for downloading video content from the Bilibili platform. It provides a graphical interface that allows users to download various types of media including user-uploaded videos, series episodes, movies, and other hosted content. It focuses on ease of use with a zero-configuration setup, making it accessible to both beginners and experienced users. It supports high performance downloads through multi-threading and includes resume...
    Downloads: 22 This Week
  • 18
    Orange Juice Audio Enhancer for Linux
    Audio Enhancer for Linux by MX Linux's Freja, licensed under GPLv3 and developed over 6 years. 22 Nov 2025: the latest "Super Headphones.json" is an evolution of the "Orange Juice Mild" style, intended for headphone music listening on the extrox (MX-25 based or later) family. The most recommended default is "Orange Juice Mild".
    Downloads: 9 This Week
  • 19
    media-chrome

    Custom elements (web components) for making audio and video player

    media-chrome is an open source library that provides fully customizable media player controls using native web components, allowing developers to design consistent and flexible audio and video player interfaces across different platforms and frameworks. Instead of relying on default browser controls or proprietary player APIs, Media Chrome introduces a set of reusable custom elements that can be composed using standard HTML, styled with CSS, and integrated into any JavaScript framework...
    Downloads: 7 This Week
  • 20
    Butterchurn

    Butterchurn is a WebGL implementation of the Milkdrop Visualizer

    Butterchurn is a WebGL-based music visualization engine that recreates the classic MilkDrop visualizer experience entirely in the browser using modern web technologies. It is designed to render complex, real-time audio-reactive graphics that respond dynamically to music input, producing highly immersive and fluid visual effects. The engine uses GPU acceleration through WebGL to achieve high performance, allowing it to handle intricate shader-based visualizations without overwhelming system...
    Downloads: 2 This Week
  • 21
    AudioMuse-AI

    AudioMuse-AI is an Open Source Dockerized environment

    AudioMuse-AI is an open-source system designed to automatically generate playlists and analyze music libraries using artificial intelligence and audio signal processing techniques. The platform runs locally in a Dockerized environment and performs detailed sonic analysis on audio files to understand characteristics such as tempo, mood, and acoustic similarity. By analyzing the underlying audio content rather than relying on external metadata services, the system can organize large personal...
    Downloads: 2 This Week
  • 22
    Speakr

    Speakr is a personal, self-hosted web application

    Speakr is an open-source, real-time text-to-speech (TTS) web application that allows users to convert written text into natural-sounding speech in just a few clicks. It provides a clean, user-friendly interface where users can input text, choose a voice style or language, and immediately hear the output, making it ideal for accessibility, content creation, and learning applications. Behind the scenes, Speakr leverages modern TTS engines and streaming audio technologies to deliver smooth and...
    Downloads: 2 This Week
  • 23
    ChatTTS_colab

    One-click deployment (including offline integration package)

    ChatTTS_colab is a wrapper project around the ChatTTS model that focuses on “one-click” deployment, especially in Google Colab. It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha”...
    Downloads: 9 This Week
  • 24
    TRIBE v2

    A multimodal model for brain response prediction

    TRIBE v2 is a multimodal foundation model developed by Meta AI for predicting human brain activity from naturalistic stimuli such as video, audio, and text. It is designed for in-silico neuroscience, enabling researchers to model how the brain responds to complex real-world inputs. The system integrates state-of-the-art encoders—including LLaMA for text, V-JEPA for video, and Wav2Vec-BERT for audio—into a unified Transformer architecture. This combined representation is mapped onto the...
    Downloads: 9 This Week
  • 25
    Real-Time Voice Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Real-Time Voice Cloning is an influential deep-learning repository that demonstrates how to clone a voice from just a few seconds of audio and then generate arbitrary speech in that voice in near real time. It implements the SV2TTS pipeline (“Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis”) in three stages: a speaker encoder, a synthesizer, and a vocoder. In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that...
    Downloads: 11 This Week