Showing 6597 open source projects for "audio linux"

  • 1
    Descent 3

    Descent 3 by Outrage Entertainment

    ...It provides the full C and C++ engine source code, including the historically significant “1.5” patch that was previously created by developers and later stabilized by fans. The codebase covers the game’s rendering, physics, audio, networking, tools, and editor components, allowing enthusiasts to build, run, and modify the classic 6-degrees-of-freedom space shooter on modern systems. To actually play the game, users must supply their own original game assets, following instructions in the repository’s usage documentation. The project uses CMake and related modern tooling for cross-platform builds, with support for Linux and Windows among other environments. ...
    Downloads: 8 This Week
  • 2
    projectM

    Cross-platform Music Visualization Library

    Cross-platform Music Visualization Library. Open-source and Milkdrop-compatible. Experience psychedelic and mesmerizing visuals by transforming music into equations that render a limitless array of user-contributed visualizations. projectM is an open-source project that reimplements the esteemed Winamp Milkdrop by Geiss in a more modern, cross-platform reusable library. Its purpose in life is to read an audio input and to produce mesmerizing visuals, detecting tempo, and rendering advanced...
    Downloads: 36 This Week
  • 3
    Portkey AI Gateway

    A blazing fast AI Gateway with integrated guardrails

    Portkey AI Gateway aims to offer a blazing fast, secure, and flexible gateway for interacting with a wide variety of models and enforcing guardrails. It presents a single, friendly API through which you can route to 200+ LLMs, while applying configurable input/output guardrails to enforce policies or restrict certain content. It supports automatic retries, fallbacks, load balancing across providers or keys, and request timeouts to avoid latency spikes. The gateway is multimodal: it can...
    Downloads: 6 This Week
  • 4
    clone-voice

    A sound cloning tool with a web interface, using your voice

    Clone-voice is a local voice-cloning tool that lets you synthesize speech in any target voice or convert one recording into another voice using the same timbre. It is built around Coqui’s XTTS-v2 model, so it inherits multilingual support and modern neural TTS quality while wrapping it in a user-friendly desktop workflow. The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control...
    Downloads: 11 This Week
  • 5
    PowerPoint-ist

    Web presentation editor replicating many PowerPoint features online

    PPTist is a web-based presentation editing application designed to replicate many of the commonly used features found in traditional slide presentation software. It allows users to create, edit, and present slide decks directly within a web browser while maintaining a desktop-like editing experience. PPTist is built with Vue 3 and TypeScript and focuses on providing a highly interactive slide editing environment with extensive customization and extension potential. PPTist supports a wide...
    Downloads: 8 This Week
  • 6
    AI YouTube Shorts Generator

    A Python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies...
    Downloads: 14 This Week
  • 7
    Orpheus TTS

    Towards Human-Sounding Speech

    Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3B backbone, treating speech synthesis as a large language model problem instead of a traditional TTS pipeline. It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research...
    Downloads: 2 This Week
  • 8
    gTTS

    Python library and CLI tool to interface with Google Translate

    gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports...
    Downloads: 8 This Week
  • 9
    AUDio MEasurement System

    PC based Oscilloscope and Spectrum analyzer using sound card

    AUDio MEasurement System: a multi-platform system for audio measurement through the PC's sound card. It contains a generator, an oscilloscope, an audio spectrum analyzer (FFT), and a frequency sweep plot. It compiles and works under Linux, Windows, and macOS. Source code is available via Git and as a ZIP snapshot. For more information, see README.md.
    Downloads: 63 This Week
  • 10
    Nextcloud Server

    A safe home for all your data

    Nextcloud Server is free and open source server software that lets you store all of your data on a server of your choosing. With Nextcloud you can access and store data in the data center you trust, sync data among devices, and share data for collaboration. It offers the best security in the self-hosted file sync and share world, and is expandable with hundreds of apps.
    Downloads: 45 This Week
  • 11
    Open Vision Agents by Stream

    Build Vision Agents quickly with any model or video provider

    Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio...
    Downloads: 10 This Week
  • 12
    OpenAI Python

    The official Python library for the OpenAI API

    The OpenAI Python library provides convenient access to the OpenAI REST API from any Python 3.7+ application. The library includes type definitions for all request params and response fields, and offers both synchronous and asynchronous clients powered by httpx.
    Downloads: 15 This Week
  • 13
    yt-dlp

    A youtube-dl fork with additional features and fixes

    yt-dlp is a youtube-dl fork based on the now-inactive youtube-dlc. The main focus of this project is adding new features and patches while keeping up to date with the original project.
    Downloads: 618 This Week
  • 14
    extrox - MX Linux Based Distro

    Art book meets Audio Filter

    ♬ Music competition extrox Cup 2026: deadline extended to 1 May! https://note.com/nice_ferret975/n/nc3b6ae11199a extrox Distrowatch page: https://distrowatch.com/extrox Latest info: https://extrox.com/features extrox arch important information for users: https://extrox.com/lab Information: the "Sound Heaven" ISO is provided as an "operating environment" for the audio filters "Sound Heaven Alpha" and "Super Headphones Alpha", which allow you to enjoy music to the...
    Downloads: 294 This Week
  • 15
    hfapigo

    Unofficial (Golang) Go bindings for the Hugging Face Inference API

    (Golang) Go bindings for the Hugging Face Inference API. Directly call any model available in the Model Hub. An API key is required for authorized access. To get one, create a Hugging Face profile.
    Downloads: 7 This Week
  • 16
    Spring AI Alibaba Examples

    Spring AI Alibaba examples for building and testing AI apps

    Spring AI Alibaba Examples provides a collection of example projects that demonstrate how to use Spring AI and Spring AI Alibaba across different scenarios, from basic setups to more advanced AI applications. It is designed to help developers understand core concepts, explore practical implementations, and follow best practices when building AI-powered systems using the Spring ecosystem. Each module focuses on a specific use case such as chat, image processing, audio handling, graph...
    Downloads: 7 This Week
  • 17
    Live API Web Console

    A React-based starter app for using the Live API over WebSockets

    Live API Web Console is a React starter that demonstrates how to use Gemini’s Live API over WebSockets to build real-time, multimodal experiences. The app includes modules for streaming audio playback, recording user media from the microphone, webcam, or even screen capture, and it surfaces a unified event log so you can debug the session as it flows. Configuration lives in a simple .env file and the project boots with standard web tooling, letting you experiment quickly with models, system...
    Downloads: 0 This Week
  • 18
    Membrane Core

    The core of the Membrane Framework, a multimedia processing framework

    membrane_core is the foundation of the Membrane multimedia framework for Elixir, providing the abstractions and runtime needed to build real-time audio and video pipelines. It models media processing as a graph of lightweight, supervised OTP processes—elements connected by links—so work is isolated, fault-tolerant, and easy to scale or reconfigure at runtime. The core defines a clear lifecycle and callback API for elements, plus concepts like buffers, events, and capabilities/format...
    Downloads: 7 This Week
  • 19
    RtspSimpleServer

    Ready-to-use RTSP / RTMP / LL-HLS / WebRTC server and proxy

    ...Reload the configuration without disconnecting existing clients (hot reloading). Read Prometheus-compatible metrics. Run external commands when clients connect, disconnect, read, or publish streams. Natively compatible with the Raspberry Pi Camera. Compatible with Linux, Windows, and macOS; does not require any dependency or interpreter.
    Downloads: 55 This Week
  • 20
    OpenAI

    Community-driven Swift package for the OpenAI public API

    MacPaw OpenAI is a community-driven Swift SDK that provides developers with a structured and type-safe way to interact with the OpenAI API and compatible providers within Apple ecosystem applications. It simplifies the integration of AI capabilities into iOS, macOS, and other Swift-based applications by offering a clean abstraction over the underlying REST API, enabling developers to focus on functionality rather than low-level implementation details. The SDK supports a wide range of...
    Downloads: 2 This Week
  • 21
    VoxCPM

    TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

    VoxCPM is a tokenizer-free text-to-speech system that models speech in a continuous space, aiming for extremely realistic, context-aware synthesis and true-to-life zero-shot voice cloning. Instead of converting speech into discrete tokens, it uses an end-to-end diffusion-autoregressive architecture built on the MiniCPM-4 backbone, combining hierarchical language modeling, finite scalar quantization (FSQ), and local Diffusion Transformers. This design helps decouple semantic and acoustic...
    Downloads: 11 This Week
  • 22
    II ElevenLabs UI

    Component library and custom registry built on top of shadcn/ui

    ElevenLabs UI is an open-source component library designed to accelerate the development of multimodal AI applications, particularly those involving voice agents and audio-based interactions. Built on top of modern frontend tooling such as React, Tailwind CSS, and shadcn/ui, it provides a collection of pre-built, customizable components that developers can easily integrate into their applications. The library includes specialized UI elements such as audio players, waveform visualizers,...
    Downloads: 0 This Week
  • 23
    TADA

    Open Source Speech Language Model

    TADA is an open-source speech-language modeling framework designed to unify spoken audio and text representations within a single generative architecture. The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This...
    Downloads: 0 This Week
  • 24
    Qwen3-ASR

    Qwen3-ASR is an open-source series of ASR models

    Qwen3-ASR is an automatic speech recognition system in the QwenLM family, developed to convert spoken language into text with strong accuracy and real-time performance. As a specialized ASR variant of the broader Qwen language model ecosystem, it focuses on capturing reliable transcriptions from audio sources such as recordings, live streams, or conversational inputs while supporting low latency use cases. The architecture combines advanced neural acoustic modeling with context-aware...
    Downloads: 3 This Week
  • 25
    OpenFL

    Open source library for creative expression on the web, desktop, etc.

    OpenFL is a free and open-source, cross-platform software framework that empowers developers to create rich interactive applications and games using a single codebase that can run on web browsers, mobile devices, desktops, and even some consoles. It builds on the Haxe programming language and offers a familiar display list and event-driven API inspired by classic Adobe Flash and AIR, allowing developers to leverage well-known paradigms while targeting modern platforms. OpenFL supports 2D and...
    Downloads: 6 This Week