Showing 6597 open source projects for "audio linux"

  • 1
    Catbird Linux

    Linux for content creation, web scraping, coding, and data analysis.

    ...Using Catbird Linux, it is possible to perform in-depth stock market analysis, track weather trends, follow social media sentiment, or do other data science tasks. The system is programmer-friendly, ready for creating and running the tools you use to measure and understand your world. In addition to search and GPT tools, you have what you need to take notes, write reports or presentations, and record and edit audio or video.
    Downloads: 16 This Week
  • 2
    OmAgent

    Build multimodal language agents for fast prototype and production

    OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers. Instead of forcing developers to implement complex orchestration logic manually, the system manages task scheduling, worker...
    Downloads: 9 This Week
  • 3
    ScreenPipe

    AI app store powered by 24/7 desktop history. open source

    Screenpipe is an AI app store powered by continuous desktop history recording. It operates entirely locally, offering developers a platform to build, distribute, and monetize AI applications that leverage comprehensive contextual data from users' desktop activities.
    Downloads: 24 This Week
  • 4
    Kooha

    Elegantly record your screen

    Capture your screen in an intuitive and straightforward way without distractions. Kooha is a simple screen recorder with a minimal interface. You can simply click the record button without having to configure a bunch of settings.
    Downloads: 19 This Week
  • 5
    red5-server

    Red5 Server core

    Red5 is an open source Flash server written in Java. It supports streaming video (FLV, F4V, MP4, 3GP), streaming audio (MP3, F4A, M4A, AAC), recording client streams (FLV and AVC+AAC in an FLV container), shared objects, live stream publishing, remoting, and the protocols RTMP, RTMPT, RTMPS, and RTMPE.
    Downloads: 14 This Week
  • 6
    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    Pixeltable is an open-source Python data infrastructure framework designed to support the development of multimodal AI applications. The system provides a declarative interface for managing the entire lifecycle of AI data pipelines, including storage, transformation, indexing, retrieval, and orchestration of datasets. Unlike traditional architectures that require multiple tools such as databases, vector stores, and workflow orchestrators, Pixeltable unifies these functions within a...
    Downloads: 8 This Week
  • 7
    LuxTTS

    A high-quality rapid TTS voice cloning model

    LuxTTS is an open-source text-to-speech (TTS) system focused on delivering high-quality, rapid voice synthesis and voice cloning that runs extremely fast and efficiently on consumer hardware. It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality. The project supports...
    Downloads: 8 This Week
  • 8
    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...
    Downloads: 11 This Week
  • 9
    Generative AI

    Sample code and notebooks for Generative AI on Google Cloud

    Generative AI is a comprehensive collection of code samples, notebooks, and demo applications designed to help developers build generative-AI workflows on the Vertex AI platform. It spans multiple modalities—text, image, audio, search (RAG/grounding) and more—showing how to integrate foundation models like the Gemini family into cloud projects. The README emphasises getting started with prompts, datasets, environments and sample apps, making it ideal for both experimentation and...
    Downloads: 2 This Week
  • 10
    ComfyUI

    The most powerful and modular diffusion model GUI, api and backend

    The most powerful and modular diffusion model GUI and backend. This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, cli, automated testing, and public documentation. Open source AI models will win in the long run against closed models and we are only at the beginning. Our core mission...
    Downloads: 125 This Week
  • 11
    ioquake3

    The ioquake3 community effort to continue supporting/developing id's

    ...It is designed to let players run Quake 3, its expansion Team Arena, and community mods on contemporary systems while also serving as a solid base for new projects. The engine modernizes the original codebase with a CMake build system, an SDL2 backend for cross-platform windowing and input, and OpenAL sound for better audio quality and multi-speaker setups. It adds numerous quality-of-life improvements such as VoIP support, AVI demo capture, improved console completion and history, and optional Ogg Vorbis support. ioquake3 also improves portability and maintainability by supporting x86_64 on Linux, MinGW builds on Windows, and various other operating systems, and even provides web support via Emscripten.
    Downloads: 7 This Week
  • 12
    mongo-express

    Web-based MongoDB admin interface, written with Node.js

    A web-based MongoDB admin interface written with Node.js, Express, and Bootstrap 5.
    Downloads: 10 This Week
  • 13
    HY-World 1.5

    A Systematic Framework for Interactive World Modeling

    HY-WorldPlay is a Hunyuan AI project focusing on immersive multimodal content generation and interaction within virtual worlds or simulated environments. It aims to empower AI agents with the capability to both understand and generate multimedia content — including text, audio, image, and potentially 3D or game-world elements — enabling lifelike dialogue, environmental interpretations, and responsive world behavior. The platform targets use cases in digital entertainment, game worlds,...
    Downloads: 10 This Week
  • 14
    D++

    C++ Discord API Bot Library - D++ is Lightweight and scalable

    D++ is a lightweight and simple library for Discord written in modern C++. It is designed to cover as much of the API specification as possible and to have an incredibly small memory footprint, even when caching large amounts of data. It is created by the developer of TriviaBot and contributed to by a dedicated team of developers.
    Downloads: 24 This Week
  • 15
    BasedHardware

    Open source AI wearable platform for recording and summarizing speech

    Omi is an open source AI wearable platform designed to capture spoken conversations and convert them into useful digital information such as transcripts, summaries, and action items. It combines hardware, firmware, mobile applications, and backend services to create a complete ecosystem for voice-driven interaction. Users can connect the wearable device to a mobile phone and automatically record and transcribe meetings, conversations, and voice memos. Omi includes firmware for wearable...
    Downloads: 7 This Week
  • 16
    Agora Flat

    Project flat is the Web, Windows and macOS client of Agora Flat

    Agora Flat is an open source virtual classroom with batteries-included online tutoring tools for teachers & freelance trainers. Build a real-time interactive virtual classroom with ease from our open-sourced projects. Agora Flat was born out of an exploratory project in which we investigated customer needs and user experience. After receiving good internal feedback, we believe that Flat has the potential to become an online classroom product that can really help people and have a good user experience. We...
    Downloads: 7 This Week
  • 17
    Auto Synced & Translated Dubs

    Automatically translates the text of a video based on a subtitle file

    Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken...
    Downloads: 0 This Week
  • 18
    GenAI Processors

    GenAI Processors is a lightweight Python library

    GenAI Processors is a lightweight Python library for building modular, asynchronous, and composable AI pipelines around Gemini. Its central abstraction is the Processor, a unit of work that consumes an asynchronous stream of parts (text, images, audio, JSON) and produces another stream, making it natural to chain operations and keep everything streaming end-to-end. Processors can be composed sequentially (to build multi-step flows) or in parallel (to fan-out work and merge results), which...
    Downloads: 1 This Week
  • 19
    annyang!

    Speech recognition for your site

    annyang is a tiny JavaScript library that lets your visitors control your site with voice commands. annyang supports multiple languages, has no dependencies, weighs just 2kb and is free to use. annyang understands commands with named variables, splats, and optional words. Use named variables for one-word arguments in your command. Use splats to capture multi-word text at the end of your command (greedy). Use optional words or phrases to define a part of the command as optional. annyang plays...
    Downloads: 1 This Week
  • 20
    Dia

    A TTS model capable of generating ultra-realistic dialogue

    Dia is a neural text-to-speech model designed specifically for generating ultra-realistic dialogue in a single pass. Instead of focusing on isolated sentences or flat narration, it is optimized for conversational audio, complete with natural turn-taking, prosody, and pacing. The model can be conditioned on a reference audio sample, allowing you to control emotion, tone, and other stylistic aspects of the speech. It can also produce nonverbal vocalizations like laughter, coughs, clearing the...
    Downloads: 0 This Week
  • 21
    Aspia

    Remote desktop and file transfer tool

    Free open-source application for real-time desktop remote control and file transfer. With Aspia, you can create your own NAT traversal infrastructure (using Router and Relay servers) with connection by ID or use direct connections. Aspia supports many features, among them detailed system information, a task manager, audio, and text chat. It is safe: all transmitted data is encrypted. Add computers for quick connection, and create computer groups. Encryption of address books with a...
    Downloads: 39 This Week
  • 22
    rtmp-rtsp-stream-client-java

    Library to stream in rtmp and rtsp for Android. All code in Java

    Library for streaming in RTMP and RTSP. All code in Java.
    Downloads: 9 This Week
  • 23
    mpv

    Command line video player

    mpv is a free (as in freedom) media player for the command line. It supports a wide variety of media file formats, audio and video codecs, and subtitle types. Powerful scripting capabilities can make the player do almost anything. There is a large selection of user scripts on the wiki. While mpv strives for minimalism and provides no real GUI, it has a small controller on top of the video for basic control. mpv has an OpenGL, Vulkan, and D3D11 based video output that is capable of many...
    Downloads: 69 This Week
  • 24
    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the...
    Downloads: 37 This Week
  • 25
    Vibecraft

    Manage Claude Code in style

    Vibecraft is a creative AI platform that generates stylized music, beats, and sound textures guided by high-level prompts, allowing musicians and content creators to explore new sonic possibilities without deep expertise in audio synthesis. It uses generative modeling techniques to interpret input descriptors such as genre, mood, tempo, instrument palette, and creative themes, then outputs sequences that can serve as sketches, loops, or full musical ideas. The workflow prioritizes...
    Downloads: 0 This Week