audio processing free download

Showing 510 open source projects for "audio processing"

1

Kimi-Audio

Audio foundation model excelling in audio understanding

Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. ...

Downloads: 0 This Week

Last Update: 2026-01-27
2

Fun Audio Chat

Large Audio Language Model built for natural interactions

Fun Audio Chat is an interactive voice-first conversational AI platform designed to let users engage in natural spoken dialogue with large language models in real time, turning speech into context-aware responses while maintaining a smooth back-and-forth experience. It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. ...

Downloads: 1 This Week

Last Update: 2026-02-27
3

OpenVINO AI Plugins for Audacity

A set of AI-enabled effects, generators, and analyzers for Audacity

A set of AI-enabled effects, generators, and analyzers for Audacity. These AI features run 100% locally on your PC, no internet connection is necessary. OpenVINO™ is used to run AI models on supported accelerators found on the user's system such as CPU, GPU, and NPU.

Downloads: 135 This Week

Last Update: 2024-12-20
4

Step-Audio 2

Multi-modal large language model designed for audio understanding

Step-Audio 2 is an advanced, end-to-end multimodal large language model designed for high-fidelity audio understanding and natural speech conversation. Unlike many pipelines that separate speech recognition, processing, and synthesis, Step-Audio 2 processes raw audio, reasons about semantic and paralinguistic content (such as emotion, speaker characteristics, and non-verbal cues), and can generate contextually appropriate responses, potentially including generated or transformed audio output. ...

Downloads: 0 This Week

Last Update: 2026-03-16
5

React Native Audio API

High-performance audio engine for React Native

React Native Audio API is a cross-platform audio library that brings a Web Audio API-style workflow to React Native. It is designed for apps that need real-time control over audio playback, recording, routing, effects, and signal processing. Developers can build consistent audio behavior across iOS, Android, and the web without rewriting separate logic for each platform.

Downloads: 6 This Week

Last Update: 2026-07-01
6

Ultimate Vocal Remover (UVR5)

GUI for a Vocal Remover that uses Deep Neural Networks

This application uses state-of-the-art source separation models to remove vocals from audio files. UVR's core developers trained all of the models provided in this package (except for the Demucs v3 and v4 4-stem models).

1 Review

Downloads: 1,019 This Week

Last Update: 2025-01-20
7

Faust

Functional programming language for signal processing

Faust (Functional Audio Stream) is a functional programming language for sound synthesis and audio processing with a strong focus on the design of synthesizers, musical instruments, audio effects, etc. Faust targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards. The core component of Faust is its compiler.

Downloads: 5 This Week

Last Update: 2026-07-01
8

Librosa

Python library for audio and music analysis

Librosa is a powerful Python library for analyzing and processing audio and music signals. Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.

Downloads: 6 This Week

Last Update: 2025-07-03
9

StaxRip

Video encoding GUI for Windows

StaxRip is a powerful, open-source video and audio encoding GUI for Windows that orchestrates industry-standard console tools (such as x265, FFmpeg, mkvmerge) and frame-server systems (like AviSynth+ or VapourSynth) to allow users to transcode, mux, remux, or process media files with fine-grained control. It is not a “one-click” encoder; instead, it grants the user deep control over encoding settings, filtering, resizing, cropping, subtitles, audio processing, container formats, and more — making it a tool of choice for videophiles, enthusiasts, and anyone needing high-quality and customized media output. ...

62 Reviews

Downloads: 30 This Week

Last Update: 2026-06-06
10

FFmpegFreeUI

3FUI is a lightweight, professional interactive shell for FFmpeg on Windows

...It supports dozens of video, audio, and image encoders, including hardware-accelerated options, and allows custom parameter input for advanced use cases. The software also includes batch processing capabilities, real-time progress tracking, plugin extensibility, and integration with FFmpeg utilities like ffprobe and ffplay.

Downloads: 24 This Week

Last Update: 2 days ago
11

iPlug 2

C++ Audio Plug-in Framework for desktop, mobile, XR, and web

iPlug 2 is a cross-platform C++ framework for developing audio plug-ins and applications that can target multiple formats and environments from a single codebase. It abstracts both the audio processing layer and the graphical user interface, allowing developers to focus on signal processing and design while the framework handles platform-specific details. The framework supports a wide range of plug-in standards, including VST, Audio Units, AAX, and newer formats like CLAP, enabling compatibility with major digital audio workstations. ...

Downloads: 4 This Week

Last Update: 2026-04-11
12

Recorder

HTML5 JavaScript audio recording in MP3, WAV, OGG, WebM, and AMR formats

...Provides support for multiple plug-in functions: rich audio visualization, variable speed and pitch processing, speech recognition, audio stream playback, and more. With powerful real-time processing support, it can be used in a variety of web applications, from simple recording to complex real-time speech recognition (ASR) and even audio-related games.

Downloads: 3 This Week

Last Update: 3 days ago
13

ffmpeg-normalize

Audio Normalization for Python/ffmpeg

ffmpeg-normalize is a command-line utility designed to normalize audio levels in media files using FFmpeg, ensuring consistent volume across multiple tracks. It supports both EBU R128 loudness normalization and peak normalization methods, allowing users to choose the appropriate standard for their needs. The tool analyzes audio streams and applies adjustments to achieve target loudness levels without introducing distortion. It can process multiple files in batch mode, making it suitable for...
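To make the peak normalization method mentioned above concrete, here is a minimal pure-Python sketch of the underlying arithmetic. It is not ffmpeg-normalize's implementation and calls neither the tool nor FFmpeg; the function names `peak_normalize` and `peak_dbfs` are hypothetical, and samples are assumed to be floats in the nominal range -1.0 to 1.0. EBU R128 loudness normalization is a different method based on measured perceptual loudness and is not covered by this sketch.

```python
import math


def peak_dbfs(samples):
    """Return the peak level of the samples in dBFS (0 dBFS = full scale)."""
    peak = max((abs(s) for s in samples), default=0.0)
    return float("-inf") if peak == 0.0 else 20 * math.log10(peak)


def peak_normalize(samples, target_dbfs=-1.0):
    """Scale samples so the largest absolute value sits at target_dbfs.

    A single gain is applied to every sample, so the waveform shape
    is preserved and only the overall level changes.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Pure silence: there is nothing to scale.
        return list(samples)
    target_linear = 10 ** (target_dbfs / 20)  # dBFS -> linear amplitude
    gain = target_linear / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    quiet = [0.1, -0.25, 0.2]
    louder = peak_normalize(quiet, target_dbfs=-6.0)
    print(round(peak_dbfs(louder), 3))
```

Because one gain factor is applied to the whole signal, the relative levels within a file are unchanged; only the loudest sample is moved to the target.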

Downloads: 25 This Week

Last Update: 1 day ago
14

Shutter Encoder

A professional video compression tool accessible to all

Shutter Encoder is a cross-platform video and audio processing application designed to provide professional-grade encoding and conversion tools through an accessible graphical interface. Built primarily on FFmpeg, it offers a wide range of media operations including transcoding, compression, format conversion, and editing. The software supports numerous codecs and formats, enabling users to prepare media for broadcasting, streaming, or archiving.

Downloads: 22 This Week

Last Update: 2026-06-28
15

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification

pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio segmentation. ...

Downloads: 1 This Week

Last Update: 2026-03-10
16

Moonshine Voice

Fast and accurate automatic speech recognition (ASR) for edge devices

...The project is designed to enable real-time voice applications such as live transcription, voice commands, and embedded speech interfaces without requiring heavy cloud infrastructure. Its architecture emphasizes low latency and flexible input handling, allowing audio streams of varying durations rather than relying on fixed processing windows. Moonshine supports multiple platforms including mobile, desktop, and embedded systems, and provides example projects to accelerate integration into real-world products. The toolkit also includes specialized model variants, including monolingual options that improve accuracy for specific languages. ...

Downloads: 15 This Week

Last Update: 3 hours ago
17

FFBox

A multimedia transcoding treasure chest / an FFmpeg wrapper

FFBox is a graphical multimedia processing application that provides an accessible interface for working with FFmpeg operations such as encoding, conversion, and editing. It allows users to perform tasks like trimming, merging, and compressing media files without using command-line tools. The software supports a wide range of audio and video formats, making it suitable for diverse media workflows.

Downloads: 9 This Week

Last Update: 2026-06-07
18

wlmedia

Next-generation audio and video player SDK

wlmedia is a multimedia processing and playback project designed to provide tools for handling video and audio content in lightweight environments. It focuses on integrating FFmpeg capabilities into streamlined workflows for decoding, encoding, and playback. The project includes support for multiple media formats and emphasizes performance and efficiency in resource-constrained systems.

Downloads: 1 This Week

Last Update: 2026-04-28
19

mediasoup

Cutting Edge WebRTC Video Conferencing

mediasoup is a Node.js library that provides a cutting-edge WebRTC server capable of handling real-time communications with efficient media routing and processing.

Downloads: 14 This Week

Last Update: 7 days ago
20

Google AI Edge Gallery

A gallery that showcases on-device ML/GenAI use cases

...The project bundles runnable samples that show how to run TensorFlow Lite/Edge TPU models (and similar lightweight runtimes) on mobile and embedded platforms, demonstrating common tasks like image classification, object detection, audio recognition, and pose estimation. Each sample is intended to be both a learning aid and a practical starting point: code is organized to show model loading, pre/post-processing, performance measurement, and common optimization knobs (quantization, NNAPI/Delegate usage, and hardware accelerators). The repo also collects small, well-documented models and conversion scripts so developers can reproduce a pipeline from a full-size model down to a device-friendly artifact.

Downloads: 270 This Week

Last Update: 2026-06-24
21

AutoSubSync

Automatic subtitle synchronization tool

AutoSubSync is a cross-platform desktop application designed to automatically synchronize subtitle files with video content using advanced alignment algorithms. It integrates tools like ffsubsync, autosubsync, and alass to analyze audio and match subtitle timing with high accuracy. The application supports both automatic synchronization and manual adjustment, allowing users to fine-tune results when needed. It provides a drag-and-drop interface that simplifies the process of loading video and subtitle files, making it accessible for non-technical users. AutoSubSync also includes batch processing capabilities, enabling users to handle entire media libraries efficiently. ...

Downloads: 47 This Week

Last Update: 2026-05-19
22

PyAV

Pythonic bindings for FFmpeg's libraries

...While powerful, it requires a solid understanding of FFmpeg concepts, as it prioritizes flexibility and control over abstraction. Overall, PyAV is a robust tool for developers building advanced video and audio processing systems in Python.

Downloads: 33 This Week

Last Update: 2026-07-02
23

AI-Media2Doc

AI tool converting video/audio into structured documents instantly

AI-Media2Doc is a web-based application that uses large language models to convert video and audio content into structured, readable documents in a single workflow. It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not uploaded externally. ...

Downloads: 3 This Week

Last Update: 2026-03-18
24

ThinkDSP

Digital Signal Processing in Python, by Allen B. Downey

Think DSP is an educational Python project that teaches digital signal processing through executable examples rather than starting with heavy mathematical formalism. It accompanies Allen B. Downey’s book and organizes most lessons as Jupyter notebooks. Readers work directly with waves, spectra, harmonics, filtering, convolution, and other signal-processing concepts. Early exercises show how to decompose sounds, modify frequency components, and synthesize new audio. ...
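The kind of exercise described above (synthesize a sound, then inspect its spectrum) can be sketched with the standard library alone. This is an illustrative example, not code from the book or its accompanying module; the helper names `sine_wave`, `dft_magnitudes`, and `dominant_frequency` are hypothetical, and the naive O(n²) discrete Fourier transform is used only for clarity (real projects would typically use an FFT).

```python
import cmath
import math


def sine_wave(freq_hz, duration_s, sample_rate, amplitude=1.0):
    """Synthesize a sine wave as a list of float samples."""
    n = round(duration_s * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def dft_magnitudes(samples):
    """Magnitudes of the DFT bins from 0 Hz up to the Nyquist frequency."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        mags.append(abs(acc))
    return mags


def dominant_frequency(samples, sample_rate):
    """Frequency in Hz of the DFT bin with the largest magnitude."""
    mags = dft_magnitudes(samples)
    k = max(range(len(mags)), key=mags.__getitem__)
    return k * sample_rate / len(samples)


if __name__ == "__main__":
    # Mix a strong 50 Hz tone with a weaker 120 Hz tone.
    rate = 1000
    low = sine_wave(50, 0.2, rate)
    high = sine_wave(120, 0.2, rate, amplitude=0.5)
    mixed = [a + b for a, b in zip(low, high)]
    print(dominant_frequency(mixed, rate))
```

With 200 samples at 1000 Hz, each bin is 5 Hz wide, so both tones fall exactly on a bin and the stronger one is reported as dominant.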

Downloads: 0 This Week

Last Update: 16 hours ago
25

SFBAudioEngine

A powerhouse of audio functionality for macOS, iOS, and tvOS

SFBAudioEngine is an advanced audio engine designed for macOS and iOS, focusing on high-quality playback, precise audio control, and support for a wide range of audio formats. Built for modern Apple platforms, it gives developers a robust tool for integrating sophisticated audio functionality into their applications. It emphasizes extensibility, performance, and clean API design.

Downloads: 0 This Week

Last Update: 2026-06-08