Showing 50 open source projects for "audio processing"

  • 1
    iPlug 2

    C++ Audio Plug-in Framework for desktop, mobile, xr and web

    iPlug 2 is a cross-platform C++ framework for developing audio plug-ins and applications that can target multiple formats and environments from a single codebase. It abstracts both the audio processing layer and the graphical user interface, allowing developers to focus on signal processing and design while the framework handles platform-specific details. The framework supports a wide range of plug-in standards, including VST, Audio Units, AAX, and newer formats like CLAP, enabling compatibility with major digital audio workstations. ...
    Downloads: 3 This Week
  • 2
    SFBAudioEngine

    A powerhouse of audio functionality for macOS, iOS, and tvOS

    SFBAudioEngine is an advanced audio engine designed for macOS and iOS, focusing on high-quality playback, precise audio control, and support for a wide range of audio formats. Built for modern Apple platforms, it provides developers with a robust tool for integrating sophisticated audio functionalities into their applications. It emphasizes extensibility, performance, and clean API design.
    Downloads: 0 This Week
  • 3
    AudioCraft

    AudioCraft is a library for audio processing and generation

    ...It also contains training code and recipes, so researchers can fine-tune on custom data or explore new objectives without building infrastructure from scratch. Example notebooks, CLI tools, and audio utilities help with prompt design, conditioning on reference audio, and post-processing to produce ready-to-share outputs.
    Downloads: 8 This Week
  • 4
    OpenAI Go

    The official Go library for the OpenAI API

    ...It enables developers to integrate OpenAI’s models and features into Go applications with a clean and idiomatic interface. The library provides support for a wide range of API endpoints including chat completions, assistants, embeddings, image generation, audio processing, and batch jobs. It includes built-in tools for handling authentication, managing API requests, and parsing structured responses. The repository also offers examples to help developers quickly set up projects and test different API calls. Designed for reliability and ease of use, it is maintained to stay aligned with the evolving OpenAI API specifications.
    Downloads: 3 This Week
  • 5
    OmniTools

    Self-hosted collection of powerful web-based tools for everyday tasks

    ...It’s designed to replace the random assortment of “free online tools” people use for quick tasks, while avoiding ads, tracking, and the need to upload sensitive files to unknown servers. A key design choice is that file processing happens entirely on the client side, meaning your data stays in your browser instead of being sent to the backend. The tool catalog spans both technical and non-technical needs, including image, video, audio, PDF, text, date/time, math, and data format utilities like JSON/CSV/XML helpers. It’s also packaged for straightforward self-hosting, with a lightweight Docker image and simple run commands, so it can be deployed quickly on a homelab or internal network.
    Downloads: 8 This Week
  • 6
    Membrane Core

    The core of the Membrane Framework, a multimedia processing framework

    membrane_core is the foundation of the Membrane multimedia framework for Elixir, providing the abstractions and runtime needed to build real-time audio and video pipelines. It models media processing as a graph of lightweight, supervised OTP processes—elements connected by links—so work is isolated, fault-tolerant, and easy to scale or reconfigure at runtime. The core defines a clear lifecycle and callback API for elements, plus concepts like buffers, events, and capabilities/format negotiation to keep components interoperable and type-safe. ...
    Downloads: 0 This Week
  • 7
    SuperCollider

    Audio server, programming language, and IDE for sound synthesis

    SuperCollider is a platform for audio synthesis and algorithmic composition, used by musicians, artists, and researchers working with sound. It is free and open source software available for Windows, macOS, and Linux. scsynth, a real-time audio server, forms the core of the platform. It features 400+ unit generators (“UGens”) for analysis, synthesis, and processing.
    Downloads: 3 This Week
  • 8
    ffmpeg.wasm

    FFmpeg for browser, powered by WebAssembly

    ffmpeg.wasm is a pure WebAssembly (and JavaScript/TypeScript) port of FFmpeg that enables in-browser media recording, conversion, and streaming, letting developers perform video/audio processing entirely client-side without server uploads. It is transpiled from FFmpeg and its codecs into WebAssembly via Emscripten, supports both single-threaded and multi-threaded cores using web workers, and is written in TypeScript for improved developer experience.
    Downloads: 10 This Week
  • 9
    DALI

    A GPU-accelerated library containing highly optimized building blocks

    The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding, cropping, resizing, and many other augmentations. ...
    Downloads: 2 This Week
  • 10
    Oboe

    Oboe is a C++ library that makes it easy to build high-performance

    Oboe is a C++ library for building high-performance audio apps on Android, providing a unified, low-latency API over AAudio and OpenSL ES. It abstracts device and API-version differences so developers can focus on audio processing instead of platform quirks. The library emphasizes minimal latency and glitch-free playback/recording via tuned buffer strategies and callback-driven I/O. It supports features like floating-point audio, channel configuration, sample-rate negotiation, and stream sharing to match device capabilities. ...
    Downloads: 0 This Week
  • 11
    VERT.sh

    The next-generation file converter

    VERT is a modern, privacy-focused file conversion platform that leverages WebAssembly to perform conversions entirely on the user’s device rather than relying on cloud-based processing. Built with Svelte and TypeScript, it provides a clean and responsive interface for converting a wide variety of file types, including images, audio, video, and documents. One of its defining characteristics is its local-first approach, which eliminates the need to upload files to external servers, thereby improving both privacy and performance. ...
    Downloads: 2 This Week
  • 12
    miniaudio

    Audio playback and capture library written in C

    miniaudio is written in C with no dependencies except the standard library and should compile cleanly on all major compilers without the need to install any additional development packages. All major desktop and mobile platforms are supported. miniaudio gives you complete flexibility. With the low-level API, just initialize a connection to the device and send or receive raw audio data. The modular design of miniaudio allows you to use the low-level API without compromising your ability to...
    Downloads: 2 This Week
  • 13
    OpenAI .NET

    The official .NET library for the OpenAI API

    OpenAI .NET is the official client library for calling the OpenAI REST API from C# and other .NET languages, with first-class support for modern .NET patterns. It provides strongly typed clients across API areas (chat, audio, images, embeddings, moderations, batches, files, models, vector stores, responses, realtime, assistants) and works with .NET Standard 2.0 while the examples use .NET 8. You install it via NuGet and authenticate with an API key, ideally through environment variables or...
    Downloads: 1 This Week
  • 14
    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid...
    Downloads: 0 This Week
  • 15
    YoutubeExplode

    Abstraction layer over YouTube's internal API

    ...Under the hood, the library parses raw page data and leverages reverse-engineered internal endpoints to obtain structured information and stream manifests. Developers can use it to access details such as titles, authors, durations, captions, and available media formats, as well as to download audio or video streams for further processing. The library is designed to be intuitive and cross-platform through .NET Standard compatibility, making it suitable for desktop tools, automation pipelines, and media utilities.
    Downloads: 2 This Week
  • 16
    Flutter Rust Bridge

    Rust binding generator, feature-rich, but seamless and simple

    ...The project supports passing complex types, handling async operations and streams, and integrating with Flutter across mobile and desktop targets. By leaning on Rust’s memory safety and zero-cost abstractions, it enables compute-heavy tasks—parsing, crypto, image/audio processing, and more—without sacrificing Flutter’s developer experience. Build scripts and templates streamline packaging and distribution so the Rust side fits cleanly into CI and multi-platform releases. In practice, teams gain a maintainable way to share one performant Rust core across multiple Flutter apps while keeping the UI reactive and fast.
    Downloads: 1 This Week
  • 17
    Jina

    Build cross-modal and multimodal applications on the cloud

    Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP,...
    Downloads: 0 This Week
  • 18
    Nyquist

    Nyquist is a language for sound synthesis and music composition.

    Nyquist is a language for sound synthesis and music composition. It is implemented in C and C++ and runs on Win32, OSX, and Linux. Nyquist combines a powerful functional programming style with efficient signal-processing primitives. Nyquist is also embedded as a scripting language in Audacity.
    Downloads: 26 This Week
  • 19
    Drumstick Libraries

    MIDI libraries for Qt/C++

    Drumstick is a set of C++ MIDI libraries for playing music, written with Qt5 objects, idioms, and style. It contains a C++ wrapper around the ALSA library sequencer interface; the ALSA sequencer provides software support for MIDI technology on Linux. A complementary library provides classes for processing the SMF (Standard MIDI Files: .MID/.KAR) and Cakewalk (.WRK) file formats. A multiplatform realtime MIDI I/O library is also provided.
    Downloads: 5 This Week
  • 20
    Kisekae UltraKiss

    Kisekae UltraKiss is a full featured integrated development environment

    UltraKiss is a computer program that implements the Kisekae Set system, KiSS, a Japanese graphics system originally developed to facilitate costume changes on virtual dolls. UltraKiss was developed to help artists build their KiSS sets. It is a full featured viewer for all KiSS dolls, games, and visual applications. It is also a complete graphical development environment for creating KiSS applications. It fully implements the FKiSS event driven programming language up to and including...
    Downloads: 13 This Week
  • 21
    Glicol

    Graph-oriented live coding language and music/audio DSP library

    Glicol is a graph-oriented live coding language and audio engine designed for real-time music creation and digital signal processing, written entirely in Rust. It introduces a unique paradigm where audio synthesis and sequencing are represented as interconnected nodes, allowing developers and musicians to construct complex sound pipelines through declarative code. The language is designed to be accessible to beginners while still offering powerful capabilities for advanced users, enabling both quick experimentation and precise control over audio generation. ...
    Downloads: 0 This Week
  • 22
    VideoSrt

    Windows-GUI

    ...Open source software tool that recognizes speech in video and automatically generates SRT subtitle files. It suits business scenarios that need to generate Chinese/English subtitles and text files for media (video/audio) quickly and in batches. It can recognize video/audio speech to generate subtitle files (with support for Chinese-English translation and bilingual subtitles), extract speech text from video/audio, and batch-translate, filter, and encode SRT subtitle files. It uses the Alibaba Cloud speech recognition interface, so accuracy is high: the recognition rate for standard Mandarin/English is over 95%. ...
    Downloads: 37 This Week
  • 23
    SVoice (Speech Voice Separation)

    We provide a PyTorch implementation of the paper Voice Separation

    ...The repository includes all necessary scripts for training, dataset preparation, distributed training, evaluation, and audio separation.
    Downloads: 2 This Week
  • 24
    Beep

    A little package that brings sound to any Go application

    A little package that brings sound to any Go application. Suitable for playback and audio processing. Beep is built on top of its Streamer interface, which is like io.Reader, but for audio. It was one of the best design decisions I've ever made and it enabled all the rest of the features to naturally come together with not much code. Decode and play WAV, MP3, OGG, and FLAC. Encode and save WAV. Very simple API. Limiting the support to stereo (two channel) audio made it possible to simplify the architecture and the API. ...
    Downloads: 0 This Week
  • 25
    CHOW Phaser

    Phaser effect based loosely on the Schulte Compact Phasing 'A'

    ChowPhaser is an open-source audio plugin that emulates the classic Schulte Compact Phasing 'A' effect. It offers a unique phasing effect with nonlinear feedback and modulation capabilities, suitable for various audio processing applications.
    Downloads: 0 This Week