Showing 6597 open source projects for "audio linux"

  • 1
    Axmol Engine

    Multi-platform Engine for Desktop, XBOX (UWP) and Mobile games

    Axmol is a modern C++ game engine forked from Cocos2d-x, designed to support high-performance 2D and lightweight 3D game development across multiple platforms. It improves upon the original Cocos2d-x with a cleaner architecture, better tooling, and support for modern C++ standards. Axmol supports scripting with Lua and JavaScript, and is suitable for both indie developers and studios targeting mobile, desktop, and web platforms. With an active community and frequent updates, Axmol is a solid...
    Downloads: 13 This Week
  • 2
    StreamSpeech

    StreamSpeech is a seamless model for offline speech recognition

    StreamSpeech is an “all-in-one” speech model designed to perform offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Developed as part of an ACL 2024 paper, it targets streaming and low-latency scenarios where intermediate results and final translations or synthetic speech must be produced continuously as audio is being received. The model supports eight tasks: offline ASR, speech-to-text translation, speech-to-speech...
    Downloads: 1 This Week
  • 3
    Toxic

    A Tox-based instant messaging and video chat client

    Toxic is a Tox protocol-based command-line messenger for Linux, offering fully encrypted, decentralized, and serverless instant messaging. It includes features such as text messaging, file transfer, and audio calls—all without requiring user accounts or central servers. Toxic is designed for power users who prefer a minimalist interface, operating entirely within a terminal while delivering strong privacy through the Tox peer-to-peer protocol.
    Downloads: 6 This Week
  • 4
    RSS to Telegram Bot

    A Telegram RSS bot that cares about your reading experience.
    Downloads: 5 This Week
  • 5
    SoX is the Swiss Army Knife of sound processing utilities. It can convert audio files to other popular audio file types and also apply sound effects and filters during the conversion.
    Downloads: 24,730 This Week
  • 6
    AMP

    Web component framework for building ads, emails, websites and more

    AMP is an open source web component framework that allows you to easily create user-first websites, ads, emails, stories and more. AMP creates fast, smooth-loading web pages that prioritize the user-experience, consistently providing a fast experience across all devices and platforms.
    Downloads: 9 This Week
  • 7
    E2B

    Secure open source cloud runtime for AI apps & AI agents

    E2B's Code Interpreter SDK allows you to add code-interpreting capabilities to your AI apps. E2B Sandbox is a secure sandboxed cloud environment made for AI agents and AI apps. Sandboxes allow AI agents and apps to have long-running cloud secure environments. In these environments, large language models can use the same tools as humans do.
    Downloads: 6 This Week
  • 8
    Decky Loader

    A plugin loader for the Steam Deck

    Decky Loader is a homebrew plugin launcher built for the Steam Deck that enables users to extend and customize the console’s functionality through a dynamic plugin ecosystem integrated directly into the SteamOS interface. It acts as a middleware layer that injects and manages plugins, allowing deep customization such as modifying UI elements, changing system sounds, adjusting display properties, and extending system behavior beyond stock capabilities. The platform is designed to persist...
    Downloads: 12 This Week
  • 9
    Polyglot

    Cross-platform AI language practice app

    Polyglot is a cross-platform AI language practice application that runs as a desktop app and also offers a web version. It is built around conversational large language models and Azure-based text-to-speech services, turning them into an interactive environment for speaking practice in multiple languages. Users can define custom AI personas, choose languages, and configure their own OpenAI and Azure keys so they retain control over which backends they use. The app supports speech recognition...
    Downloads: 20 This Week
  • 10
    YoutubeExplode

    Abstraction layer over YouTube's internal API

    YoutubeExplode is a .NET library that provides a high-level abstraction for interacting with YouTube data, enabling developers to retrieve metadata and download media streams programmatically. The project exposes a clean API that allows applications to query videos, playlists, channels, and search results without relying on the official YouTube Data API. Under the hood, the library parses raw page data and leverages reverse-engineered internal endpoints to obtain structured information and...
    Downloads: 8 This Week
  • 11
    MLT Multimedia Framework

    Author, manage, and run multitrack audio/video compositions. The engine of a non-linear video editor that can be used in all sorts of apps, not just desktop video editors. MLT is an open source multimedia framework, designed and developed for television broadcasting. It provides a toolkit for broadcasters, video editors, media players, transcoders, web streamers and many more types of applications. The functionality of the system is provided via an assortment of ready-to-use tools, XML...
    Downloads: 8 This Week
  • 12
    Etherpad

    A real-time collaborative document editor for the web

    Etherpad is a highly customizable online document editor that lets up to thousands of users edit collaboratively in real time. With Etherpad, you don’t have to send documents back and forth: simply set it up, share the link, and collaborate with co-workers, fellow students, or friends on just about any written document! Etherpad provides access to all data through a well-documented API and supports data export and import. It’s got an awesome set of...
    Downloads: 8 This Week
  • 13
    mGBA

    mGBA Game Boy Advance Emulator

    mGBA is a high-performance, open-source emulator designed to accurately replicate the behavior of the Game Boy Advance hardware while maintaining fast execution across a wide range of devices. The project was created with the goal of improving both accuracy and speed compared to earlier emulators, achieving a balance that allows games to run reliably even on lower-end systems. It supports not only Game Boy Advance titles but also Game Boy and Game Boy Color games, making it a versatile...
    Downloads: 11 This Week
  • 14
    WhisperSpeech

    An Open Source text-to-speech system built by inverting Whisper

    WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS:...
    Downloads: 2 This Week
  • 15
    TensorBoardLogger.jl

    Easy peasy logging to TensorBoard with Julia

    TensorBoardLogger.jl is a native library for logging arbitrary data to TensorBoard, extending Julia's standard Logging framework. It can also be used to deserialize TensorBoard's .proto files. The fundamental type defined in this package is a TBLogger, which behaves like other standard loggers in Julia such as ConsoleLogger or TextLogger. You can create one by passing it the path to the folder where you want to store the data. You can also pass an optional second argument to specify the...
    Downloads: 7 This Week
  • 16
    MegaTTS 3

    Official PyTorch Implementation

    MegaTTS3 is an open-source text-to-speech (TTS) and voice-cloning system from ByteDance that aims to deliver high-quality, expressive speech synthesis, including zero-shot voice cloning of previously unseen speakers. Its backbone is a lightweight diffusion transformer (roughly 0.45B parameters), which enables efficient inference while still producing high-fidelity audio. Given a reference audio sample (and corresponding latent representation), MegaTTS3 can generate speech in the...
    Downloads: 1 This Week
  • 17
    Git Large File Storage

    Git extension for versioning large files

    An open source Git extension for versioning large files. Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server like GitHub.com or GitHub Enterprise. Download and install the Git command line extension. Once downloaded and installed, set up Git LFS for your user account. In each Git repository where you want to use Git LFS, select the file types you'd like Git...
    Downloads: 12 This Week
  • 18
    Docling

    Get your documents ready for gen AI

    Docling is an open-source document processing toolkit built to prepare diverse content types for modern generative AI and data workflows. The project focuses on converting and parsing many document formats into a unified structured representation that downstream systems can easily consume. It supports advanced PDF understanding, including layout detection, table extraction, and reading order analysis, enabling high-fidelity document intelligence pipelines. Docling is designed to run...
    Downloads: 6 This Week
  • 19
    Fish Folk: Jumpy

    Tactical 2D shooter in fishy pixels style. Made with Rust-lang

    Fish Folk: Jumpy is a tactical 2D shooter, played by up to 4 players online or on a shared screen. Aim either left or right; the rest is up to clever movement and positioning in this fish-on-fish brawler. Jumpy runs in the browser. You can play a web demo to try out the game without installing anything on your computer. For best performance, or if you have issues with other browsers, we recommend Chrome or one of its derivatives.
    Downloads: 4 This Week
  • 20
    GuitarPedal

    Linus learns analog circuits

    GuitarPedal is an experimental repository exploring a digital guitar-effects signal chain implemented with lean, low-level code. The project demonstrates how to read audio input, process it through simple transformations, and write the result out in real time with minimal latency. It emphasizes straightforward, inspectable DSP so developers can follow the math and tweak parameters without a giant framework in the way. The codebase favors portability and simplicity, focusing on a handful of...
    Downloads: 0 This Week
  • 21
    VMZ (Video Model Zoo)

    VMZ: Model Zoo for Video Modeling

    The codebase was designed to help researchers and practitioners quickly reproduce FAIR’s results and leverage robust pre-trained backbones for downstream tasks. It also integrates Gradient Blending, an audio-visual modeling method that fuses modalities effectively (available in the Caffe2 implementation). Although VMZ is now archived and no longer actively maintained, it remains a valuable reference for understanding early large-scale video model training, transfer learning, and multimodal...
    Downloads: 0 This Week
  • 22
    Sapiens

    High-resolution models for human tasks

    Sapiens is a research framework from Meta AI focused on embodied intelligence and human-like multimodal learning, aiming to train agents that can perceive, reason, and act in complex environments. It integrates sensory inputs such as vision, audio, and proprioception into a unified learning architecture that allows agents to understand and adapt to their surroundings dynamically. The project emphasizes long-horizon reasoning and cross-modal grounding—connecting language, perception, and...
    Downloads: 0 This Week
  • 23
    LaiNES

    Compact cycle-accurate NES emulator

    LaiNES is a compact, cycle-accurate Nintendo Entertainment System emulator written in C++ that prioritizes precision and minimalism in its implementation. Its design focuses on accurately simulating the NES hardware at the clock-cycle level, ensuring that timing-sensitive behaviors and edge cases are faithfully reproduced. Despite its relatively small codebase, it supports a wide range of cartridge mappers, enabling compatibility with a large portion of NES games. The emulator includes a...
    Downloads: 8 This Week
  • 24
    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time...
    Downloads: 8 This Week
  • 25
    Transcoder

    Hardware-accelerated video transcoding using Android MediaCodec APIs

    Transcoder by DeepMedia is an AI-powered video-to-video speech translation engine that enables fully automated multilingual dubbing. Unlike traditional speech translation systems that rely on multi-stage pipelines, Transcoder directly translates one speaker’s video into another language while preserving facial expressions, lip-sync, and vocal identity. Designed for real-time use and production-grade pipelines, Transcoder combines advanced deep learning models with GPU acceleration to deliver...
    Downloads: 8 This Week