Showing 6656 open source projects for "audio linux"

  • 1
    BizHawk

    BizHawk is a multi-system emulator written in C#

    A multi-system emulator written in C#. As well as quality-of-life features for casual players, it has recording/playback and debugging tools, making it the first choice for TASers (Tool-Assisted Speedrunners). Features include screenshotting and recording audio and video to file; firmware management; input, framerate, and more in a HUD over the game; rebindable hotkeys for controlling the frontend (keyboard, mouse, and gamepad); a comprehensive input mapper for the emulated gamepads and other peripherals...
    Downloads: 51 This Week
  • 2
    Scriberr

    Self-hosted AI audio transcription

    Scriberr is a self-hosted AI-powered transcription platform designed to convert audio and video into highly accurate text while prioritizing privacy and local processing. Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts...
    Downloads: 12 This Week
  • 3
    Buster

    Captcha solver extension for humans

    ...The success rate of the extension can be improved by simulating user interactions with the help of a client app. Follow the instructions from the extension's options to download and install the client app on Windows, Linux and macOS, or get the app from this repository.
    Downloads: 53 This Week
  • 4
    KrillinAI

    Video translation and dubbing tool powered by LLMs

    KrillinAI is an end-to-end content localization, translation, and dubbing tool aimed at helping creators transform videos into multiple languages with minimal manual effort. It integrates several stages of the pipeline: video acquisition (either from local files or remote via download tools), speech recognition (ASR), subtitle segmentation and alignment, machine translation (with context-aware translation to preserve semantics), and voice cloning + text-to-speech (TTS) to produce dubbed...
    Downloads: 17 This Week
  • 5
    FeelUOwn

    Trying to be a robust, user-friendly and hackable music player

    FeelUOwn is a user-friendly and hackable music player.
    Downloads: 5 This Week
  • 6
    Fish Speech

    SOTA Open Source TTS

    Fish Speech is a state-of-the-art open-source text-to-speech project that has evolved into the OpenAudio series of advanced TTS models. The repository hosts the code and tooling for training, fine-tuning, and serving high-quality TTS, while the current flagship models (OpenAudio-S1 and S1-mini) are distributed via Fish Audio’s playground and Hugging Face. The models are evaluated with Seed TTS metrics and achieve exceptionally low word and character error rates, indicating strong...
    Downloads: 26 This Week
  • 7
    Hugging Face - Speech To Speech

    Open speech-to-speech models and pipelines by Hugging Face

    This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. It is designed to help...
    Downloads: 2 This Week
  • 8
    bfxr

    Flash + AIR sound effects generator. Based on Sfxr.

    The bfxr project by increpare is a sound-effects generator tool originally built using Flash + AIR, based on the earlier Sfxr project. Its purpose is to enable users, especially game developers and sound designers, to quickly generate retro, 8-bit/“chiptune” style sound effects (“bleeps”, “booms”, “zaps”, etc.) without deep knowledge of audio signal processing. It offers an interactive GUI through which you can tweak many parameters (oscillators, envelopes, filters, etc.) to sculpt custom...
    Downloads: 14 This Week
  • 9
    WavTokenizer

    SOTA discrete acoustic codec models with 40/75 tokens per second

    WavTokenizer is a state-of-the-art discrete acoustic codec designed specifically for audio language modeling, capable of compressing 24 kHz audio into just 40 or 75 tokens per second while preserving high perceptual quality. It is built to represent speech, music, and general audio with extremely low bitrate, making it ideal as a front-end for large audio language models like GPT-4o and similar architectures. The model uses a single-quantizer design together with temporal compression to...
    Downloads: 0 This Week
  • 10
    xrdp

    An open source RDP server

    ...Most Linux distributions should distribute the latest release of xrdp in their repository.
    Downloads: 61 This Week
  • 11
    Furnace

    A multi-system chiptune tracker compatible with DefleMask modules

    Furnace is a powerful multi-system chiptune tracker that enables users to compose music using the sound chips of classic computers, consoles, and arcade hardware. It supports an extensive range of audio chips, including FM synthesis, wavetable synthesis, and sample-based systems, making it one of the most versatile trackers available. The software is compatible with multiple operating systems and can be used both as a standalone application and as a development tool for retro-style audio...
    Downloads: 5 This Week
  • 12
    Markdownify MCP Server

    Convert files and web content into clean, usable Markdown easily

    Markdownify MCP is a Model Context Protocol server that converts many types of files and web content into clean Markdown. It supports formats such as PDFs, images, audio with transcription, DOCX, XLSX, and PPTX, along with web sources like YouTube transcripts, Bing results, and general webpages. Markdownify MCP is designed to simplify content extraction and make data easier to read, share, and reuse in structured workflows. Developers can install dependencies, build, and run the server...
    Downloads: 9 This Week
  • 13
    SDL

    Simple DirectMedia Layer

    SDL (Simple DirectMedia Layer) is a cross-platform multimedia development library designed to provide low-level access to hardware components such as graphics, audio, input devices, and system resources, making it a foundational tool for building games, emulators, and interactive applications. It abstracts platform-specific functionality into a consistent API, allowing developers to write code once and deploy it across multiple operating systems including Windows, macOS, Linux, iOS, and Android. ...
    Downloads: 31 This Week
  • 14
    OpenVoice

    Instant voice cloning by MIT and MyShell. Audio foundation model

    OpenVoice is a versatile instant voice cloning system that can replicate a speaker’s tone color from just a short audio clip and then generate speech in multiple languages. It is designed not only to match the timbre of the reference voice, but also to give granular control over style parameters such as emotion, accent, rhythm, pauses, and intonation. The model supports cross-lingual and even zero-shot cross-lingual voice cloning, so a speaker recorded in one language can be made to speak...
    Downloads: 22 This Week
  • 15
    LiveAvatar

    Streaming Real-time Audio-Driven Avatar Generation

    LiveAvatar is an open-source research and implementation project that provides a unified framework for real-time, streaming, interactive avatar video generation driven by audio and other control signals. It implements techniques from state-of-the-art diffusion-based avatar modeling to support infinite-length continuous video generation with low latency, enabling interactive AI avatars that maintain continuity and realism over extended sessions. The project co-designs algorithms and system...
    Downloads: 1 This Week
  • 16
    DistroAV

    DistroAV (formerly OBS-NDI): NDI integration for OBS Studio

    ...The plugin works with the NDI runtime and supports modern versions of OBS Studio across Windows, macOS, and Linux, enabling streamers and production setups to leverage low-latency, high-quality media over local networks for multi-device collaboration or extended setups. DistroAV’s development community focuses on maintaining compatibility with the latest OBS releases and NDI v6 tooling while addressing bugs and feature requests.
    Downloads: 46 This Week
  • 17
    WhatSie

    Feature-rich WhatsApp client for desktop Linux

    Feature-rich WhatsApp web client based on Qt WebEngine for Linux Desktop.
    Downloads: 27 This Week
  • 18
    Linux Studio Plugins Project

    Linux Studio Plugins Project

    LSP (Linux Studio Plugins) is a collection of open-source plugins currently compatible with LADSPA, LV2 and LinuxVST formats. Standalone plugins for JACK are provided since version 1.0.8. Experimental support of ARMv7 added since version 1.1.4. Experimental support of AArch64 added since version 1.1.9. Decomposition of modules and new UI introduced in 1.2.0. Added CLAP support in 1.2.5. Added VST3 support in 1.2.15. The basic idea is to fill the lack of good and useful plugins under the...
    Downloads: 125 This Week
  • 19
    WanGP

    AI video generator optimized for low-VRAM and older GPUs

    Wan2GP is an open source AI video generation toolkit designed to make modern generative models accessible on consumer-grade hardware with limited GPU memory. It acts as a unified interface for running multiple video, image, and audio generation models, including Wan-based models as well as other systems like Hunyuan Video, Flux, and Qwen. A key focus of the project is reducing VRAM requirements, enabling some workflows to run on as little as 6 GB while still supporting older Nvidia and...
    Downloads: 33 This Week
  • 20
    OpenAI .NET

    The official .NET library for the OpenAI API

    OpenAI .NET is the official client library for calling the OpenAI REST API from C# and other .NET languages, with first-class support for modern .NET patterns. It provides strongly typed clients across API areas (chat, audio, images, embeddings, moderations, batches, files, models, vector stores, responses, realtime, assistants) and works with .NET Standard 2.0 while the examples use .NET 8. You install it via NuGet and authenticate with an API key, ideally through environment variables or...
    Downloads: 5 This Week
  • 21
    Amazon Chime SDK for JavaScript

    A JavaScript client library for integrating multi-party communications

    The Amazon Chime SDK is a set of real-time communications components that developers can use to quickly add messaging, audio, video, and screen sharing capabilities to their web or mobile applications. Developers can build on AWS's global communications infrastructure to deliver engaging experiences in their applications. For example, they can add video to a health application so patients can consult remotely with doctors on health issues, or create customized audio prompts for integration...
    Downloads: 5 This Week
  • 22
    Note67

    A private, local meeting notes assistant

    note67 is a private, local meeting notes assistant application that combines audio capture, transcription, and AI-powered summarization to help users document conversations and meetings on their own devices without relying on cloud services. Built with a cross-platform architecture using Rust (via Tauri) for backend logic and a TypeScript/React frontend, it prioritizes privacy by performing audio transcription locally with Whisper models and generating summaries with locally-hosted AI,...
    Downloads: 7 This Week
  • 23
    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. It produces high-quality 48 kHz audio output suitable for professional...
    Downloads: 0 This Week
  • 24
    OpenCorePkg

    OpenCore bootloader

    OpenCorePkg is an open-source, modular UEFI (Unified Extensible Firmware Interface) bootloader and development framework, primarily designed to enable macOS booting on non-Apple hardware (Hackintosh). It includes Apple-specific UEFI drivers, utilities for macOS installation support, and shared libraries used across Acidanthera projects. Apple disk image loading support. Apple keyboard input aggregation. Apple PE image signature verification. Apple UEFI secure boot supplemental code. Audio...
    Downloads: 179 This Week
  • 25
    SerenityOS

    The Serenity Operating System

    SerenityOS is an open source Unix-like operating system project with its own custom kernel, graphical user interface, system libraries, and userland tools. It combines a nostalgic “90s UI aesthetic” with modern system capabilities: a preemptive, multi-threaded kernel, its own browser, network stack, file systems, IPC, security features, and a suite of graphical and developer applications. The project is both a hobbyist OS and a polished engineering sandbox.
    Downloads: 29 This Week