Showing 44 open source projects for "audio processing"

  • 1
    Librosa

    Python library for audio and music analysis

    Librosa is a powerful Python library for analyzing and processing audio and music signals. Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.
    Downloads: 5 This Week
  • 2
    ffmpeg-normalize

    Audio Normalization for Python/ffmpeg

    ffmpeg-normalize is a command-line utility designed to normalize audio levels in media files using FFmpeg, ensuring consistent volume across multiple tracks. It supports both EBU R128 loudness normalization and peak normalization methods, allowing users to choose the appropriate standard for their needs. The tool analyzes audio streams and applies adjustments to achieve target loudness levels without introducing distortion. It can process multiple files in batch mode, making it suitable for...
    Downloads: 4 This Week
  • 3
    NovaSR

    A lightning fast audio upsampler

    ...NovaSR is especially valuable for post-processing tasks in speech enhancement, TTS pipelines, and dataset restoration where low sampling rates degrade perceived audio clarity; the minimal model size also makes it suitable for edge and embedded use cases where memory is at a premium. Its performance can reach thousands of times realtime on modern GPUs, allowing massive audio batches to be processed with negligible compute overhead.
    Downloads: 3 This Week
  • 4
    AutoSubSync

    Automatic subtitle synchronization tool

    AutoSubSync is a cross-platform desktop application designed to automatically synchronize subtitle files with video content using advanced alignment algorithms. It integrates tools like ffsubsync, autosubsync, and alass to analyze audio and match subtitle timing with high accuracy. The application supports both automatic synchronization and manual adjustment, allowing users to fine-tune results when needed. It provides a drag-and-drop interface that simplifies the process of loading video and subtitle files, making it accessible for non-technical users. AutoSubSync also includes batch processing capabilities, enabling users to handle entire media libraries efficiently. ...
    Downloads: 29 This Week
  • 5
    AudioCraft

    AudioCraft is a library for audio processing and generation

    ...It also contains training code and recipes, so researchers can fine-tune on custom data or explore new objectives without building infrastructure from scratch. Example notebooks, CLI tools, and audio utilities help with prompt design, conditioning on reference audio, and post-processing to produce ready-to-share outputs.
    Downloads: 11 This Week
  • 6
    Twspace-dl

    A Python module to download Twitter Spaces

    Twspace-dl is a Python-based tool designed to download audio content from Twitter Spaces, enabling users to archive live or recorded sessions locally. It works by extracting streaming URLs and processing them with FFmpeg to generate downloadable audio files. The tool supports both command-line and graphical interfaces, making it accessible to different types of users. It requires authentication via exported cookies due to API restrictions, ensuring access to protected content. twspace-dl also includes options for saving metadata, playlists, and cover art associated with the stream. ...
    Downloads: 0 This Week
  • 7
    ReClip

    Download videos from almost any website

    ...Users can paste multiple URLs at once, select output formats such as MP4 or MP3, and choose quality settings before downloading. The system also includes features like automatic URL deduplication and batch processing to improve usability.
    Downloads: 149 This Week
  • 8
    cloud-morph

    Decentralize, Self-host Cloud Gaming/Application

    cloud-morph is a cloud-based media processing service that enables real-time video and audio transformation using FFmpeg in scalable environments. It is designed to run as a backend service that processes media streams or files and applies transformations such as transcoding, filtering, and format conversion. The system supports API-driven workflows, allowing integration into web applications or automation pipelines.
    Downloads: 0 This Week
  • 9
    Videomass

    Videomass is a free, open source and cross-platform GUI for FFmpeg

    Videomass is a free, open-source graphical interface for FFmpeg designed to make advanced video and audio processing accessible to both beginners and experienced users. Built in Python using wxPython, it provides a cross-platform environment for managing encoding, conversion, and editing tasks through a visual interface. The software supports multitasking operations, allowing users to process multiple media files simultaneously. It offers extensive configuration options while also providing presets to simplify common workflows. ...
    Downloads: 1 This Week
  • 10
    A2M — Audio to MIDI

    A2M is a desktop app that converts audio to MIDI in one click.

    A2M (Audio To MIDI) is a simple desktop tool for transcribing local audio files into MIDI files with one click. It is designed primarily for piano transcription and works best on solo piano recordings. Using A2M is straightforward: select an audio file, click Convert, and the application generates a MIDI file automatically in your Downloads/A2M folder.
    Downloads: 73 This Week
  • 11
    LiveAvatar

    Streaming Real-time Audio-Driven Avatar Generation

    LiveAvatar is an open-source research and implementation project that provides a unified framework for real-time, streaming, interactive avatar video generation driven by audio and other control signals. It implements techniques from state-of-the-art diffusion-based avatar modeling to support infinite-length continuous video generation with low latency, enabling interactive AI avatars that maintain continuity and realism over extended sessions. The project co-designs algorithms and system optimizations, such as block-wise autoregressive processing and fast sampling strategies, to deliver real-time frame rates (e.g., ~45 FPS on appropriate GPU clusters) while handling non-stop generation without quality degradation. ...
    Downloads: 0 This Week
  • 12
    PyAV

    Pythonic bindings for FFmpeg's libraries

    ...While powerful, it requires a solid understanding of FFmpeg concepts, as it prioritizes flexibility and control over abstraction. Overall, PyAV is a robust tool for developers building advanced video and audio processing systems in Python.
    Downloads: 0 This Week
  • 13
    JamTools

    JamTools is a cross-platform set of small utility tools

    JamTools is a multifunctional desktop utility suite designed to provide a collection of tools for productivity, media processing, and system enhancements within a single application. It integrates various features such as file management, multimedia handling, and system utilities into a unified interface. The project emphasizes ease of use while offering advanced functionality for handling common tasks efficiently. It includes support for media-related operations, often leveraging FFmpeg for processing video and audio content. ...
    Downloads: 0 This Week
  • 14
    Verticals v3

    Automated YouTube Shorts pipeline

    ...The pipeline emphasizes automation, allowing users to produce short-form content at scale with minimal manual intervention. It integrates FFmpeg and other media processing tools to handle video transformations, resizing, and encoding. The system also supports adding overlays, captions, and audio enhancements to improve engagement. Designed for creators and developers, it enables repeatable workflows for generating social media content efficiently. Its modular structure allows customization of each stage in the pipeline, making it adaptable to different content strategies.
    Downloads: 0 This Week
  • 15
    video-use

    Edit videos with Claude Code

    ...Designed to work with Claude Code, it automates the entire editing process—from cutting clips to rendering the final output—without requiring manual timelines or complex software interfaces. The system intelligently analyzes audio transcripts and visual cues to make precise, context-aware editing decisions. It supports a wide range of content types, including interviews, tutorials, montages, and talking-head videos. By combining structured text representations with on-demand visual previews, it minimizes processing overhead while maintaining high-quality results. ...
    Downloads: 24 This Week
  • 16
    MoviePy

    Video editing with Python

    MoviePy is a Python module for video editing, which can be used for basic operations (like cuts, concatenations, title insertions), video compositing (a.k.a. non-linear editing), video processing, or to create advanced effects. It can read and write the most common video formats, including GIF. MoviePy is open source software originally written by Zulko and released under the MIT licence. It works on Windows, Mac, and Linux, with Python 2 or Python 3. The code is hosted on GitHub, where...
    Downloads: 17 This Week
  • 17
    Internet DJ Console

    A feature packed DJ console and internet radio client for Linux users

    Conceived as an internet radio Shoutcast/Icecast client and DJ console, IDJC has two main media players, a background track player, effects buttons, a crossfader, WebM, AAC, Ogg, and MP3 streaming, stream automation timers, aux input, and voice and VoIP integration. Supported media file formats include mp3, ogg, flac, wma, wav, m4a, m3u, xspf, and pls, with cue sheet support. It also offers IRC track and station announcements and uses the JACK Audio Connection Kit to provide a flexible audio chain. This list of features is by no...
    Downloads: 6 This Week
  • 18
    VCClient

    Software that uses AI to perform real-time voice conversion

    VCClient is a real-time voice conversion system that uses machine learning models to transform a speaker’s voice into another voice with minimal latency. It is designed for live applications such as streaming, gaming, and virtual communication, where immediate feedback is essential. The system supports multiple voice conversion models, including RVC and other neural network-based approaches, allowing users to switch between different voices or customize their output. It provides both a...
    Downloads: 20 This Week
  • 19
    MLT Multimedia Framework
    A multimedia authoring and processing framework and a video playout server for television broadcasting.
    Downloads: 18 This Week
  • 20
    MahaKurawa.My.ID MP4 VA Extract

    MahaKurawa.My.ID MP4 VA Extract is a tool to extract mp4 file content

    MahaKurawa.My.ID MP4 VA Extract is a tool for extracting the video and audio content of MP4 files. It can also extract MKV files and single SSA subtitle files. The software does not convert the video or audio in any way; it extracts the streams as they are, and it is made for that specific purpose. "MahaKurawa.My.ID MP4 VA Extract v.1.0.3.1" can be obtained for free at https://www.mahakurawa.my.id.
    Downloads: 0 This Week
  • 21
    auto-subtitle

    Automatically generate and overlay subtitles for any video

    auto-subtitle is a Python-based command-line tool that automatically generates and overlays subtitles on video files using AI-driven speech recognition. It combines FFmpeg with OpenAI’s Whisper model to transcribe spoken audio into text and synchronize it with video playback. The tool processes video input, extracts audio, and produces subtitle files that can be either exported separately or burned directly into the final video output. It supports multiple transcription models with varying...
    Downloads: 1 This Week
  • 22
    Automatic YouTube subtitle generation

    Using OpenAI's Whisper to automatically generate YouTube subtitles

    Automatic YouTube subtitle generation is a command-line tool that combines YouTube downloading capabilities with AI-powered transcription using Whisper models. It allows users to download videos or audio from YouTube and automatically generate subtitles or transcripts. The tool processes media locally, extracting audio and applying speech recognition to produce accurate text outputs. It supports multiple languages and can handle different Whisper model sizes, balancing performance and accuracy. yt-whisper is designed for automation, enabling batch processing of multiple videos for transcription workflows. ...
    Downloads: 0 This Week
  • 23
    AutoSub

    A CLI script to generate subtitle files (SRT/VTT/TXT) for any video

    AutoSub is a Python-based tool designed to automatically generate subtitles for video or audio content using speech recognition technology. It processes media files by extracting audio, transcribing spoken content, and generating subtitle files in standard formats. The tool supports multiple languages and can integrate with translation systems to produce subtitles in different languages. It is designed for automation, allowing batch processing of multiple media files. ...
    Downloads: 4 This Week
  • 24
    Telegram WebRTC (VoIP)

    Voice chats, private incoming and outgoing calls in Telegram

    Telegram WebRTC (VoIP) is a Python and C++ library that enables real-time voice and video communication features for Telegram bots and clients. It provides an interface for joining, managing, and streaming audio or video in Telegram group calls and voice chats. The library is built on top of low-level communication protocols, ensuring efficient handling of real-time media streams. It supports integration with FFmpeg and other tools for processing audio and video before transmission. tgcalls allows developers to create bots that can play music, stream content, or interact with live voice channels programmatically. ...
    Downloads: 1 This Week
  • 25
    Savify

    Download Spotify songs to mp3 with full metadata and cover art

    Savify is a command-line tool designed to download and archive music from Spotify by leveraging YouTube as the audio source while preserving Spotify metadata. It allows users to input playlists, albums, or individual tracks and automatically retrieves matching audio files with proper tagging. The tool integrates FFmpeg and yt-dlp to handle downloading, conversion, and formatting into common audio formats such as MP3. It enriches files with metadata including artist, album, cover art, and...
    Downloads: 7 This Week