Showing 6662 open source projects for "audio linux"

  • 1
    Curl

    Command line tool and library for transferring data with URLs

    ...It's used in command lines or scripts for transferring data. It's also used in just about every device you can think of: mobile phones and tablets, television sets, printers, routers, media players and other audio equipment. Curl is also the internet transfer backbone for thousands of software applications being used extensively throughout the world today. Curl is feature-rich, thread-safe, well supported and fast. It is also highly portable and works on numerous platforms, including Solaris, NetBSD, FreeBSD, OpenBSD, Linux, Mac OS X, Windows, Darwin, UnixWare, HURD, BeOS, Ultrix, QNX, DOS, Symbian, and many more.
    Downloads: 38 This Week
  • 2
    AI YouTube Shorts Generator

    A Python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies...
    Downloads: 6 This Week
  • 3
    Descent 3

    Descent 3 by Outrage Entertainment

    ...It provides the full C and C++ engine source code, including the historically significant “1.5” patch that was previously created by developers and later stabilized by fans. The codebase covers the game’s rendering, physics, audio, networking, tools, and editor components, allowing enthusiasts to build, run, and modify the classic 6-degrees-of-freedom space shooter on modern systems. To actually play the game, users must supply their own original game assets, following instructions in the repository’s usage documentation. The project uses CMake and related modern tooling for cross-platform builds, with support for Linux and Windows among other environments. ...
    Downloads: 3 This Week
  • 4
    Simple DirectMedia Layer

    Simple DirectMedia Layer

    Simple DirectMedia Layer is a cross-platform development library designed to provide low-level access to audio, keyboard, mouse, joystick, and graphics hardware via OpenGL and Direct3D. It is used by video playback software, emulators, and popular games including Valve's award-winning catalog and many Humble Bundle games. SDL officially supports Windows, macOS, Linux, iOS, and Android. Support for other platforms may be found in the source code.
    Downloads: 9 This Week
  • 5
    OpenAI Python

    The official Python library for the OpenAI API

    The OpenAI Python library provides convenient access to the OpenAI REST API from any Python 3.7+ application. The library includes type definitions for all request params and response fields, and offers both synchronous and asynchronous clients powered by httpx.
    Downloads: 7 This Week
  • 6
    Meetily

    Privacy-first AI meeting assistant with 4x faster Parakeet/Whisper

    This project is a privacy-first AI meeting assistant that captures meeting audio, produces real-time transcripts, and generates summaries while keeping processing entirely on your own machine or infrastructure. It’s built for organizations that want meeting intelligence without sending recordings or transcripts to third-party cloud services, which helps address compliance and data sovereignty requirements. The app supports live transcription with local model options (including Whisper- and...
    Downloads: 20 This Week
  • 7
    MediaPipe Solutions

    Cross-platform, customizable ML solutions

    MediaPipe is an open-source framework developed by Google for building cross-platform machine learning pipelines that process audio, video, and other streaming data in real time. The system provides developers with tools and reusable components that allow them to combine multiple machine learning models with preprocessing and postprocessing logic into efficient perception pipelines. These pipelines can run on a wide variety of platforms including mobile devices, desktop systems, web...
    Downloads: 2 This Week
  • 8
    gTTS

    Python library and CLI tool to interface with Google Translate

    gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports...
    Downloads: 3 This Week
  • 9
    Portkey AI Gateway

    A blazing fast AI Gateway with integrated guardrails

    Portkey AI Gateway aims to offer a blazing fast, secure, and flexible gateway for interacting with a wide variety of models and enforcing guardrails. It presents a single, friendly API through which you can route to 200+ LLMs, while applying configurable input/output guardrails to enforce policies or restrict certain content. It supports automatic retries, fallbacks, load balancing across providers or keys, and request timeouts to avoid latency spikes. The gateway is multimodal: it can...
    Downloads: 2 This Week
  • 10
    Speech Note

    Speech Note Linux app. Note taking, reading and translating

    Speech Note is a Linux desktop and Sailfish OS application for taking, reading, and translating notes with integrated offline speech technology. It combines speech-to-text, text-to-speech, and machine translation in a single interface, allowing users to dictate notes, listen back to them, and translate them without ever sending data to the cloud. All processing is done locally, which means audio, text, and translations never leave the device, emphasizing strong privacy guarantees. ...
    Downloads: 21 This Week
  • 11
    Phaser HTML5 Game Framework

    Phaser is a free and fast 2D game framework for making HTML5 games

    Phaser is a popular open-source 2D game framework for making HTML5 games for desktop and mobile platforms. Built with JavaScript and powered by WebGL and Canvas, it offers a robust API for developing everything from arcade to platformer and puzzle games.
    Downloads: 5 This Week
  • 12
    Open Vision Agents by Stream

    Build Vision Agents quickly with any model or video provider

    Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio...
    Downloads: 4 This Week
  • 13
    Canvas LMS

    The open LMS by Instructure, Inc.

    Canvas LMS is a full-featured learning management system designed for K–12, higher-ed, and professional training, with a strong emphasis on usability and openness. Instructors build courses from modular content—pages, assignments, discussions, quizzes—and organize them into learning paths with prerequisites and due dates. Rich grading tools like SpeedGrader streamline assessment with rubrics, inline annotations, and audio/video feedback, while the gradebook supports weighting, outcomes, and...
    Downloads: 32 This Week
  • 14
    AutoSubs

    Instantly generate AI-powered subtitles on your device

    ...AutoSubs is designed with performance in mind, offering efficient processing through a Rust-based backend and supporting multiple operating systems including Windows, macOS, and Linux.
    Downloads: 15 This Week
  • 15
    Xenia

    Xbox 360 Emulator Research Project

    Xenia is an open-source experimental emulator for the Xbox 360 that aims to let users run Xbox 360 games on Windows and other platforms by reverse-engineering the console’s hardware and firmware behavior in software. It implements the 360’s CPU (Xenon), GPU (including Direct3D shader logic), and system libraries to translate Xbox instructions into equivalent host machine operations, enabling many titles to launch and in some cases play at improved frame rates compared with the original...
    Downloads: 24 This Week
  • 16
    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the...
    Downloads: 13 This Week
  • 17
    VMZ (Video Model Zoo)

    VMZ: Model Zoo for Video Modeling

    The codebase was designed to help researchers and practitioners quickly reproduce FAIR’s results and leverage robust pre-trained backbones for downstream tasks. It also integrates Gradient Blending, an audio-visual modeling method that fuses modalities effectively (available in the Caffe2 implementation). Although VMZ is now archived and no longer actively maintained, it remains a valuable reference for understanding early large-scale video model training, transfer learning, and multimodal...
    Downloads: 1 This Week
  • 18
    Web Scrobbler

    Scrobble music all around the web!

    Web Scrobbler helps music listeners to scrobble their online playback history. Web Scrobbler is a browser extension created for people who listen to music online through their browser, and would like to keep an updated playback history using scrobbling services, such as Last.fm, Libre.fm and ListenBrainz. Download and install the extension for your browser. You can use the download buttons above. Open the extension options, and expand the "Accounts" section, then sign in to a scrobbling...
    Downloads: 1 This Week
  • 19
    RF24

    OSI Layer 2 driver for nRF24L01 on Arduino & Raspberry Pi/Linux

    Optimized high-speed driver for nRF24L01(+) 2.4GHz wireless transceiver. More compliant with the manufacturer specified operation of the chip, while allowing advanced users to work outside the recommended operation. Utilize the capabilities of the radio to their full potential via Arduino. More reliable, responsive, bug-free and feature-rich. Easy for beginners to use, with well-documented examples and features. Consumed with a public interface that's similar to other Arduino standard...
    Downloads: 10 This Week
  • 20
    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time...
    Downloads: 16 This Week
  • 21
    Groq Python

    The official Python library for the Groq API

    Groq Python is the official Python SDK for the Groq REST API, giving Python developers straightforward access to Groq’s LLM, chat, audio, and other AI services. Through this library, you can call Groq’s models from Python code — for example to request chat completions, code generation, transcription, or any supported endpoint — using idiomatic Python syntax. The SDK handles authentication (via environment variable or parameter), defines proper type-safe request/response data types, and...
    Downloads: 3 This Week
  • 22
    WhisperSpeech

    An open-source text-to-speech system built by inverting Whisper

    WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS:...
    Downloads: 3 This Week
  • 23
    Groq TypeScript / Node.s

    The official Node.js / TypeScript library for the Groq API

    Groq TypeScript / Node.s (also often referred to as “groq-sdk” on npm) is the official Node.js / TypeScript client library for Groq’s REST API, enabling JavaScript/TypeScript developers to integrate LLM and AI-powered services into web backends, serverless functions, or frontend apps. It exports strongly-typed interfaces for models, chat completions, file uploads (e.g. for audio transcription), and other endpoints, allowing for better type safety and developer experience when using Groq from...
    Downloads: 0 This Week
  • 24
    SALMONN family

    A suite of advanced multi-modal LLMs

    SALMONN is a family of advanced multi-modal large language models (LLMs) developed by ByteDance — designed to handle and integrate multiple data modalities (e.g. text, audio, video) rather than just plain text. The repository bundles different branches targeting specialized tasks (e.g. video-SALMONN, speech-quality assessment, general multimodal tasks), suggesting that the project is modular and extensible across domains. SALMONN aims to push the frontier of multi-modal AI by allowing models...
    Downloads: 0 This Week
  • 25
    comfyui-mixlab-nodes

    Workflow and speech recognition app

    comfyui-mixlab-nodes is a large collection of custom nodes for ComfyUI that turns workflows into interactive apps and adds real-time multimedia, LLM, and TTS capabilities. It introduces a “Workflow-to-APP” concept, where a ComfyUI graph can be transformed into a Web App through an AppInfo node, complete with categories, batch prompts, and editable configurations. The project also brings Real-time Design features like screen capture and floating video nodes, enabling creative pipelines that...
    Downloads: 0 This Week