Showing 6596 open source projects for "audio linux"

  • 1
    xrdp

    An open source RDP server

    ...Most Linux distributions should distribute the latest release of xrdp in their repository.
    Downloads: 57 This Week
  • 2
    miniaudio

    Audio playback and capture library written in C

    miniaudio is written in C with no dependencies except the standard library and should compile cleanly on all major compilers without the need to install any additional development packages. All major desktop and mobile platforms are supported. miniaudio gives you complete flexibility. With the low-level API, just initialize a connection to the device and send or receive raw audio data. The modular design of miniaudio allows you to use the low-level API without compromising your ability to...
    Downloads: 3 This Week
  • 3
    WhatSie

    Feature-rich WhatsApp Client for Desktop Linux

    Feature-rich WhatsApp web client based on Qt WebEngine for Linux Desktop.
    Downloads: 27 This Week
  • 4
    It's MyTabs

    Open source, web-based, self-hostable guitar/bass tab viewer

    It’s MyTabs is an open-source, web-based and self-hostable guitar/bass tablature viewer and player, built to give musicians their own alternative to subscription services like Songsterr or Soundslice. Users can upload tab files in formats such as GP, GPX, MusicXML, or CAPX, sync them with audio or YouTube videos, and play them back in a browser or mobile device. It supports features like MIDI synth track muting/soloing, mobile-friendly UI, dark/light themes, and a variety of cursor modes...
    Downloads: 5 This Week
  • 5
    OpenAI .NET

    The official .NET library for the OpenAI API

    OpenAI .NET is the official client library for calling the OpenAI REST API from C# and other .NET languages, with first-class support for modern .NET patterns. It provides strongly typed clients across API areas (chat, audio, images, embeddings, moderations, batches, files, models, vector stores, responses, realtime, assistants) and works with .NET Standard 2.0 while the examples use .NET 8. You install it via NuGet and authenticate with an API key, ideally through environment variables or...
    Downloads: 5 This Week
  • 6
    MusicGPT

    Generate music based on natural language prompts using LLMs

    ...Instead, it provides a lightweight environment capable of executing music generation models locally on CPUs or GPUs while maintaining strong performance across operating systems including Windows, macOS, and Linux. Users can describe a musical style, mood, or instrumentation using text prompts, and the system produces original audio samples based on those instructions. The application currently integrates with models such as MusicGen and is designed to support additional models transparently in the future. In addition to a command-line interface, the project includes a web-based interface that enables conversational interaction with the AI model.
    Downloads: 10 This Week
  • 7
    Invidious

    Invidious is an alternative front-end to YouTube

    An open source alternative front-end to YouTube. Lightweight, no ads, no tracking, no JavaScript required, Light/Dark themes. Customizable homepage, subscriptions, independent from Google, notifications for all subscribed channels. Audio-only mode (with background play on mobile), support for Reddit comments, available in many languages, thanks to our translators. Invidious protects you from the prying eyes of Google. It won't track you either! Invidious helps you regain focus through a...
    Downloads: 33 This Week
  • 8
    SerenityOS

    The Serenity Operating System

    SerenityOS is an open source Unix-like operating system project with its own custom kernel, graphical user interface, system libraries, and userland tools. It combines a nostalgic “90s UI aesthetic” with modern system capabilities: a preemptive, multi-threaded kernel, its own browser, network stack, file systems, IPC, security features, and a suite of graphical and developer applications. The project is both a hobbyist OS and a polished engineering sandbox.
    Downloads: 32 This Week
  • 9
    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional...
    Downloads: 0 This Week
  • 10
    baru

    A system monitor written in Rust and C

    Baru gathers information from the /sys and /proc filesystems (populated by the kernel), except for the audio and network modules, which use C libraries. There is no memory leak over time. All modules are threaded. Thanks to this design (as well as Rust and C), baru is lightweight and efficient. It can run at a high refresh rate with a minimal processor footprint. The audio module communicates with the PipeWire/PulseAudio server through the client API to retrieve its data. Wireless and wired modules use...
    Downloads: 4 This Week
  • 11
    Tauon

    The music player of today

    Tauon is a modern, streamlined music player app that's packed with features! An emphasis on playlists and drag-and-drop importing puts you in control of your music library. Faded volume control, 24-bit FLAC support, and gapless playback provide the ultimate listening experience. Excellent CUE sheet support, an original smart playlist system, and network playback from koel or Airsonic servers. Last.fm, ListenBrainz, and Maloja scrobbling. MPRIS2 support for desktop integration. Tauon is a...
    Downloads: 13 This Week
  • 12
    SenseVoice

    Multilingual speech recognition and audio understanding model

    SenseVoice is a speech foundation model designed to perform multiple voice understanding tasks from audio input. It provides capabilities such as automatic speech recognition, spoken language identification, speech emotion recognition, and audio event detection within a single system. SenseVoice is trained on more than 400,000 hours of speech data and supports over 50 languages for multilingual recognition tasks. It is built to achieve high transcription accuracy while maintaining efficient...
    Downloads: 1 This Week
  • 13
    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    AI-Media2Doc is a web-based application that uses large language models to convert video and audio content into structured, readable documents in a single workflow. It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not...
    Downloads: 8 This Week
  • 14
    ElevenLabs Python

    The official Python SDK for the ElevenLabs API

    elevenlabs-python is the official Python SDK for the ElevenLabs API, giving developers a convenient way to access ElevenLabs’ high-quality, lifelike voices. The library wraps the HTTP API into a typed Python client, so you can perform text-to-speech, streaming, voice cloning, voice management, and agents-related operations with simple method calls. It exposes ElevenLabs’ main models such as Eleven Multilingual v2, Eleven Flash v2.5, and Eleven Turbo v2.5, each targeting different trade-offs...
    Downloads: 11 This Week
  • 15
    OpenCorePkg

    OpenCore bootloader

    OpenCorePkg is an open-source, modular UEFI (Unified Extensible Firmware Interface) bootloader and development framework, primarily designed to enable macOS booting on non-Apple hardware (Hackintosh). It includes Apple-specific UEFI drivers, utilities for macOS installation support, and shared libraries used across Acidanthera projects. Apple disk image loading support. Apple keyboard input aggregation. Apple PE image signature verification. Apple UEFI secure boot supplemental code. Audio...
    Downloads: 158 This Week
  • 16
    Whisper-WebUI

    A web UI for easy subtitle generation using the Whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription...
    Downloads: 23 This Week
  • 17
    Qwen2.5-Omni

    Capable of understanding text, audio, vision, video

    Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses as both text and natural speech with real-time streaming. It uses a “Thinker-Talker” architecture and introduces innovations for aligning modalities over time (for example, synchronizing video and audio), robust speech generation, and low-VRAM/quantized versions to make usage more accessible. It holds...
    Downloads: 0 This Week
  • 18
    Qwen3-Omni

    Qwen3-Omni is a natively end-to-end, omni-modal LLM

    Qwen3-Omni is a natively end-to-end multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses in text and natural speech. It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and...
    Downloads: 0 This Week
  • 19
    Nextcloud Talk

    Video- & audio-conferencing app for Nextcloud

    Nextcloud Talk is the official chat, video and audio conferencing app for Nextcloud that allows users to chat, call and screenshare with multiple other users. Nextcloud offers better protection for your communication as it provides end-to-end encryption and keeps even metadata from leaking. You can have private, group, public or password protected calls by simply inviting one person, a whole group, or sending a public link as an invitation to a call. It is also conveniently integrated with...
    Downloads: 22 This Week
  • 20

    ravr

    Advanced audio player with DSP features

    Downloads: 1 This Week
  • 21
    od4knb Linux

    Linux® ISOs, Apps, Scripts and Howtos by Odie Pastunik

    The od4knb antiXradio distribution is based on antiX Linux® and runs very smoothly on an old i386 (32-bit) computer. It contains many Dutch radio stations. You'll also find many Howtos for novice Linux users on this site! [!] To continue, click on the Files tab, and scroll down. [Translate this site - Vertaal deze site - Traduire ce site - このサイトを翻訳する]...
    Downloads: 3 This Week
  • 22
    Harmonoid

    Plays & manages your music library. Looks beautiful & juicy

    Plays & manages your music library. Looks beautiful & juicy. Playlists, visuals, synced lyrics, pitch shift, volume boost & more.
    Downloads: 4 This Week
  • 23
    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    easyVoice is an open-source text-to-speech platform aimed at turning long-form text and novels into high-quality audio, with a strong focus on usability and scalability. It provides a web interface where users can paste or upload large texts and generate speech and subtitles in a single workflow, even for works exceeding 100,000 characters. The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure...
    Downloads: 4 This Week
  • 24
    Plyr

    Simple HTML5, YouTube and Vimeo player

    A simple, accessible and customizable media player for HTML5 Video, HTML5 Audio, YouTube and Vimeo. Premium video monetization from Video Intelligence. Plyr is a simple, lightweight, accessible and customizable HTML5, YouTube and Vimeo media player that supports modern browsers. Accessible - full support for VTT captions and screen readers. Customizable - make the player look how you want with the markup you want. Responsive - works with any screen size. Monetization - make money from your...
    Downloads: 20 This Week
  • 25
    OpenAI-Compatible Edge-TTS API

    Free, high-quality text-to-speech API endpoint to replace OpenAI

    OpenAI-Compatible Edge-TTS API is a local, OpenAI-compatible text-to-speech API that uses edge-tts—Microsoft Edge’s online TTS service—as the backend. The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to...
    Downloads: 1 This Week