Showing 6597 open source projects for "audio linux"

  • 1

    Focusrite Scarlett Controller Linux

    A lightweight, open-source mixer GUI for Focusrite Scarlett USB audio interfaces on Linux. Built with Free Pascal and Lazarus, it gives you full control over your Scarlett's internal DSP mixer without needing Focusrite Control or Wine. What it does: It talks directly to the ALSA driver — every mix bus volume, input setting, routing enum, and hardware switch is exposed in an organized tree view with collapsible categories.
    Downloads: 11 This Week
  • 2
    LWJGL

    Java library that enables cross-platform access to popular native APIs

    LWJGL is a Java library that enables cross-platform access to popular native APIs useful in the development of graphics (OpenGL, Vulkan), audio (OpenAL) and parallel computing (OpenCL) applications. This access is direct and high-performance, yet also wrapped in a type-safe and user-friendly layer, appropriate for the Java ecosystem. LWJGL is an enabling technology and provides low-level access. It is not a framework and does not provide higher-level utilities than what the native libraries...
    Downloads: 22 This Week
  • 3
    kew

    Music for the Shell

    kew is an open-source music player for the terminal that lets users play their local music library from the command line.
    Downloads: 0 This Week
  • 4
    Sopro TTS

    A lightweight text-to-speech model with zero-shot voice cloning

    Sopro TTS is an open-source text-to-speech (TTS) project that implements a lightweight model capable of producing speech from text with zero-shot voice cloning, meaning it can mimic a speaker’s voice from only a few seconds of reference audio. Built with a 169 million-parameter architecture that uses dilated convolutions and cross-attention layers instead of large Transformer stacks, it achieves relatively fast real-time performance even on CPUs (about a 0.25 real-time factor measured on an...
    Downloads: 2 This Week
  • 5
    FastKoko

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

    FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple...
    Downloads: 3 This Week
  • 6
    Diffusers

    State-of-the-art diffusion models for image and audio generation

    Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or training your own diffusion models, Diffusers is a modular toolbox that supports both. Our library is designed with a focus on usability over performance, simple over easy, and customizability over abstractions. State-of-the-art diffusion pipelines that can be run in inference with just a...
    Downloads: 3 This Week
  • 7
    Mattermost

    Mattermost is an open source platform for secure collaboration

    One integrated platform for all of your team messaging, collaborative workflows and project management needs. Work together effectively with real-time communication, file and code snippet sharing, in-line code syntax highlighting, and workflow automation purpose-built for technical teams. Keep everyone on the same page while prototyping your latest innovation, or simply planning sprints or managing production incidents. Execute and automate workflows with flexible, custom integrations with...
    Downloads: 14 This Week
  • 8
    Speech Note

    Speech Note Linux app. Note taking, reading and translating

    Speech Note is a Linux desktop and Sailfish OS application for taking, reading, and translating notes with integrated offline speech technology. It combines speech-to-text, text-to-speech, and machine translation in a single interface, allowing users to dictate notes, listen back to them, and translate them without ever sending data to the cloud. All processing is done locally, which means audio, text, and translations never leave the device, emphasizing strong privacy guarantees. ...
    Downloads: 19 This Week
  • 9
    Adversarial Robustness Toolbox

    Adversarial Robustness Toolbox (ART) - Python Library for ML security

    Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to evaluate, defend, certify and verify Machine Learning models and applications against the adversarial threats of Evasion, Poisoning, Extraction, and Inference. ART supports all popular machine learning frameworks (TensorFlow, Keras, PyTorch, MXNet, scikit-learn, XGBoost, LightGBM, CatBoost, GPy, etc.), all data types (images, tables, audio,...
    Downloads: 9 This Week
  • 10
    Amazon Connect connect-rtc-js

    Provide softphone support to AmazonConnect customers

    connect-rtc.js provides softphone support to Amazon Connect customers who integrate directly with the Amazon Connect API instead of using the Amazon Connect web application. It implements the Amazon Connect WebRTC signaling protocol and integrates with browser WebRTC APIs to provide a simple contact session interface that can seamlessly integrate with Amazon Connect StreamJS. In a typical amazon-connect-streams integration, connect-rtc-js is not required on the parent page. Softphone call...
    Downloads: 1 This Week
  • 11
    linux-file-converter-addon

    Convert various image, audio and video formats from your context menu.

    Convert between various image, audio and video formats using the context menu. The addon is written in Python and available for the Nautilus, Nemo, Thunar and Dolphin file managers. It adds a context menu option that makes it easy to convert between a large number of file types. The program offers many options to customize the appearance of its context menu. There are also a few extra formats which can be added by installing optional dependencies. The tool has a built-in auto-update...
    Downloads: 2 This Week
  • 12
    RF24

    OSI Layer 2 driver for nRF24L01 on Arduino & Raspberry Pi/Linux

    Optimized high-speed driver for nRF24L01(+) 2.4GHz wireless transceiver. More compliant with the manufacturer specified operation of the chip, while allowing advanced users to work outside the recommended operation. Utilize the capabilities of the radio to their full potential via Arduino. More reliable, responsive, bug-free and feature-rich. Easy for beginners to use, with well-documented examples and features. Consumed with a public interface that's similar to other Arduino standard...
    Downloads: 10 This Week
  • 13
    Scanopy

    Clean network diagrams, One-time setup, zero upkeep

    Scanopy is a powerful multi-modal data capture and analysis toolkit that enables users to collect, process, and visualize structured and unstructured information from a variety of sources in a flexible pipeline. It is built to handle complex scanning tasks — such as OCR, document analysis, audio transcription, network data capture, and image extraction — while providing unified APIs and workflows that make managing heterogeneous data sources seamless. Developers can compose custom pipelines...
    Downloads: 17 This Week
  • 14
    Canvas LMS

    The open LMS by Instructure, Inc.

    Canvas LMS is a full-featured learning management system designed for K–12, higher-ed, and professional training, with a strong emphasis on usability and openness. Instructors build courses from modular content—pages, assignments, discussions, quizzes—and organize them into learning paths with prerequisites and due dates. Rich grading tools like SpeedGrader streamline assessment with rubrics, inline annotations, and audio/video feedback, while the gradebook supports weighting, outcomes, and...
    Downloads: 29 This Week
  • 15
    AutoSubs

    Instantly generate AI-powered subtitles on your device

    ...AutoSubs is designed with performance in mind, offering efficient processing through a Rust-based backend and supporting multiple operating systems including Windows, macOS, and Linux.
    Downloads: 13 This Week
  • 16
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models...
    Downloads: 21 This Week
  • 17
    Kitten TTS

    State-of-the-art TTS model under 25MB

    KittenTTS is an open-source, ultra-lightweight, and high-quality text-to-speech model featuring just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Ultra-lightweight, model size less than 25MB. CPU-optimized, runs without GPU on any device. High-quality voices, several premium voice options available. Fast inference, optimized for real-time speech synthesis.
    Downloads: 21 This Week
  • 18
    DeepSqueak

    DeepSqueak: Using Machine Vision to Accelerate Bioacoustics Research

    Using Machine Vision to Accelerate Bioacoustics Research.
    Downloads: 2 This Week
  • 19
    ffmpeg.wasm

    FFmpeg for browser, powered by WebAssembly

    ffmpeg.wasm is a pure WebAssembly (and JavaScript/TypeScript) port of FFmpeg that enables in-browser media recording, conversion, and streaming—letting developers perform video/audio processing entirely client-side without server uploads. Transpiled via Emscripten from FFmpeg and its codecs into WebAssembly. Supports both single-threaded and multi-threaded cores using web workers. Written in TypeScript for improved developer experience.
    Downloads: 9 This Week
  • 20
    Peer Calls

    Group peer to peer video calls for everyone written in Go

    Peer Calls is a self-hosted, open-source WebRTC-based video and audio calling platform for group communication. Designed for simplicity and privacy, it allows anyone to run their own video conferencing service without relying on third-party providers. Peer Calls supports multi-user rooms, screen sharing, and chat, all delivered via a clean web interface. It’s great for small teams, communities, and educational groups seeking secure and customizable alternatives to mainstream conferencing tools.
    Downloads: 0 This Week
  • 21
    SALMONN family

    A suite of advanced multi-modal LLMs

    SALMONN is a family of advanced multi-modal large language models (LLMs) developed by ByteDance — designed to handle and integrate multiple data modalities (e.g. text, audio, video) rather than just plain text. The repository bundles different branches targeting specialized tasks (e.g. video-SALMONN, speech-quality assessment, general multimodal tasks), suggesting that the project is modular and extensible across domains. SALMONN aims to push the frontier of multi-modal AI by allowing models...
    Downloads: 0 This Week
  • 22
    Audiblez

    Generate audiobooks from e-books

    Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...
    Downloads: 3 This Week
  • 23
    xgplayer

    An HTML5 video player with a parser that saves traffic

    xgplayer is a web-friendly, open-source media player library maintained by ByteDance, designed for playing audio/video streams in browsers or web applications with robust control, flexibility, and extensibility. It abstracts many of the lower-level complexities of HTML5 media, providing a consistent API for playback control, custom UI overlays, adaptive streaming, plugin hooks, and cross-browser compatibility. Because of its emphasis on modularity and extensibility, xgplayer can be embedded...
    Downloads: 2 This Week
  • 24
    Translate-Subtitle-File

    Subtitle Creation Assistant

    Machine translation assistant for subtitle groups. Function 1: translate subtitle files (.srt, .ass, .vtt). Function 2: voice to text (drag in a video or audio file to recognize subtitles). (The latest version v4.1.0 Update time 2021 2 May 23) Twelve translation service providers can be configured, such as Google, Baidu, Tencent, Caiyun, IBM, Azure, and Amazon. Six voice service providers can be configured: Alibaba Cloud, Xunfei, Tencent Cloud, IBM, Azure, and Amazon. Advantages: 1. You can use multiple service...
    Downloads: 4 This Week
  • 25
    PlayCanvas Engine

    Fast and lightweight JavaScript game engine built on WebGL and glTF

    PlayCanvas is an open-source game engine. It uses HTML5 and WebGL to run games and other interactive 3D content in any mobile or desktop browser. PlayCanvas is used by leading companies in video games, advertising and visualization such as Animech, Arm, BMW, Disney, Facebook, Famobi, Funday Factory, IGT, King, Miniclip, Leapfrog, Mojiworks, Mozilla, Nickelodeon, Nordeus, NOWWA, PikPok, PlaySide Studios, Polaris, Product Madness, Samsung, Snap, Spry Fox, Zeptolab, Zynga. The PlayCanvas Engine...
    Downloads: 19 This Week