Showing 28 open source projects for "batch audio equalizer"

  • 1
    Ultimate Vocal Remover (UVR5)

    GUI for a Vocal Remover that uses Deep Neural Networks

    This application uses state-of-the-art source separation models to remove vocals from audio files. UVR's core developers trained all of the models provided in this package (except for the Demucs v3 and v4 4-stem models).
    Downloads: 599 This Week
  • 2
    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    ...In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on the go, or for users who prefer audio over reading. The repository supports handling common ebook formats and generating outputs that combine audio plus caption metadata. By automating text-to-speech for arbitrary documents, abogen reduces the friction of producing audiobooks and could be integrated into larger workflows (e.g., batch converting a library of texts).
    Downloads: 8 This Week
  • 3
    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    ...The library is asynchronous under the hood, which makes it efficient for batch jobs or web services that need to synthesize many utterances concurrently.
    Downloads: 13 This Week
  • 4
    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It installs with one click and creates a Miniconda virtual environment that runs completely separately from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.
    Downloads: 16 This Week
  • 5
    Hugging Face - Speech To Speech

    Open speech-to-speech models and pipelines from Hugging Face

    ...It integrates with the broader Hugging Face ecosystem, making it easier to load pretrained models and run inference. It also serves as a foundation for building real-time or batch audio transformation systems. Overall, it highlights an emerging approach to voice technology that reduces latency and preserves more of the original speech characteristics.
    Downloads: 1 This Week
  • 6
    notebooklm-py

    Unofficial Python API and agentic skill for Google NotebookLM

    notebooklm-py is an unofficial Python API and agent-ready integration layer for Google NotebookLM that exposes NotebookLM functionality through code, the command line, and AI agent workflows. Its goal is to provide programmatic access not just to standard notebook operations, but also to many capabilities that are either limited or unavailable in the web interface, making it especially useful for automation and custom pipelines. The project covers notebook management, source ingestion,...
    Downloads: 6 This Week
  • 7
    GenAI Processors

    GenAI Processors is a lightweight Python library

    GenAI Processors is a lightweight Python library for building modular, asynchronous, and composable AI pipelines around Gemini. Its central abstraction is the Processor, a unit of work that consumes an asynchronous stream of parts (text, images, audio, JSON) and produces another stream, making it natural to chain operations and keep everything streaming end-to-end. Processors can be composed sequentially (to build multi-step flows) or in parallel (to fan-out work and merge results), which makes sophisticated agent behaviors easy to express with simple operators. The library offers built-in processors for classic turn-based Gemini calls as well as Live API streaming, so you can mix “batch” and real-time interactions in the same graph. ...
    Downloads: 1 This Week
  • 8
    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the translated subtitles, and then merges that speech back into the video, creating a fully localized media file. The tool supports both command-line and GUI modes, making it accessible to developers and creatives needing batch or automated processing.
    Downloads: 19 This Week
  • 9
    WanGP

    AI video generator optimized for low-VRAM and older GPUs

    Wan2GP is an open source AI video generation toolkit designed to make modern generative models accessible on consumer-grade hardware with limited GPU memory. It acts as a unified interface for running multiple video, image, and audio generation models, including Wan-based models as well as other systems like Hunyuan Video, Flux, and Qwen. A key focus of the project is reducing VRAM requirements, enabling some workflows to run on as little as 6 GB while still supporting older Nvidia and...
    Downloads: 3 This Week
  • 10
    ChatTTS_colab

    One-click deployment (including offline integration package)

    ...It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha” system, which batch-generates many distinct voice timbres and allows users to save the ones they like into a curated voice library. It has first-class support for long-form audio generation, making it suitable for audiobooks, podcasts, or long narration tasks. The project also implements multi-speaker or role-based reading, letting users assign different voices to different characters in a script and even use a large language model to generate that script in one step.
    Downloads: 0 This Week
  • 11
    Qwen3-ASR

    Qwen3-ASR is an open-source series of ASR models

    Qwen3-ASR is an automatic speech recognition system in the QwenLM family, developed to convert spoken language into text with strong accuracy and real-time performance. As a specialized ASR variant of the broader Qwen language model ecosystem, it focuses on capturing reliable transcriptions from audio sources such as recordings, live streams, or conversational inputs while supporting low latency use cases. The architecture combines advanced neural acoustic modeling with context-aware...
    Downloads: 0 This Week
  • 12
    gTTS

    Python library and CLI tool to interface with Google Translate's text-to-speech API

    ...A small CLI utility, gtts-cli, makes it easy to test or batch-generate MP3 files right from the shell.
    Downloads: 2 This Week
  • 13
    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time and offline workflows in mind, enabling applications from consumer editing to professional content creation and batch processing. Internally optimized for multi-core processors and hardware acceleration where available, LTX-Video makes it feasible to work with high-resolution content and complex timelines without sacrificing responsiveness.
    Downloads: 7 This Week
  • 14
    Whisper Batch Transcriber

    Unlimited, private and free Speech-To-Text program

    About: Automatically transcribe all of your voice recordings into clean, organized text files. It's free, fully automated, and unlimited, and uses state-of-the-art speech-to-text technology. It works 100% offline on your computer, privately and locally.
    Use cases: Convert speeches, podcasts, webinars, monologues, storytelling, and other spoken audio into a formatted .txt file, one sentence per line.
    Notes: It's 2GB in size and requires 2-6GB of GPU VRAM too. (basically...
    Downloads: 7 This Week
  • 15
    AI YouTube Shorts Generator

    A Python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies...
    Downloads: 3 This Week
  • 16
    Metagify

    Audio metadata editor with MusicBrainz integration.

    Metagify is an open-source desktop application designed to provide a streamlined solution for editing audio file metadata. Built with Python and PyQt5, it offers a powerful and intuitive interface for single-file and batch-editing of tags, as well as seamless integration with the MusicBrainz database.
    Downloads: 0 This Week
  • 17
    Pot-O MusiQT

    Official Repository for Pot-O MusiQT

    Pot-O MusiQT is a lightweight yet feature-rich desktop music player built with Python and PyQt5, designed for users who want a clean interface, strong playlist control, and practical everyday playback features without unnecessary complexity. It focuses on local media playback, fast interaction, and keyboard-friendly operation, while still offering modern conveniences such as metadata handling, lyrics viewing, and smooth playback transitions.
    Downloads: 0 This Week
  • 18
    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    ...Demucs supports GPU-accelerated inference and can process multi-channel audio with chunked streaming for real-time or batch operation. It also provides training scripts and utilities to fine-tune on custom datasets, along with remixing and enhancement tools.
    Downloads: 84 This Week
  • 19
    Tidal-Media-Downloader

    Download 'TIDAL' Music On Windows/Linux/macOS (Python/C#)

    Tidal-Media-Downloader is an application that lets you download videos and tracks from Tidal. It comes in two versions, tidal-dl and tidal-gui. (This repository contains only tidal-dl, and the release is not the newest GUI version.)
    Downloads: 71 This Week
  • 20
    Denoiser

    Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

    ...The implementation includes data augmentation techniques applied to the raw waveforms (e.g. noise mixing, reverberation) to improve model robustness and generalization to diverse noise types. The project supports both offline denoising (batch inference) and live audio processing (e.g. via loopback audio interfaces), making it practical for real-time use in calls or recording. The codebase includes training and evaluation scripts, configuration management via Hydra, and pretrained models on standard noise datasets.
    Downloads: 0 This Week
  • 21
    oscp

    multiplatform, small and handy audio/video player with network remote

    oscp by ariel/KotCzarny @ irc://irc.freenode.com/h3droid
    tiny media player:
    - libav (mp3, wav, ogg, aac, mpc, wma, flac, ape, avi, mkv, flv, etc.)
    - wildmidi (mid)
    - xmp (mod, med, xm, s3m, it, dbm, psm, omx, okt, digi, 669, mtm, acid, umx)
    - gme (ay, gbs, gym, hes, kss, nsf, nsfe, sap, spc, vgm)
    - sidplay2 (sid, psid, info)
    - mdxplay (mdx)
    - fc14dec (fc, fc13, fc14)
    - sc68 (sc68, sndh)
    - asapconv (sap, cmc, cm3, cmr, cms, dmc, dlt, mpt, mpd, rmt, tmc, tm8, tm2)
    - ...
    Downloads: 1 This Week
  • 22
    Peyote
    Peyote is an audio player with an MC-like interface. It is designed specifically to make working with cue sheets easy. It supports wv (WavPack), wav, flac, ape, ogg, mp4, wma, and mp3 formats.
    Downloads: 0 This Week
  • 23
    Youtube-DLG

    A cross platform front-end GUI of the popular youtube-dl downloader

    YouTube-DLG is a user-friendly graphical interface for the popular command-line tool YouTube-DL, designed to simplify downloading videos and audio from YouTube and other platforms. It allows users to save content in various formats and resolutions, including MP3 and MP4, with just a few clicks. YouTube-DLG supports batch downloading, custom output settings, and format selection, making it a versatile solution for managing downloads efficiently. With its lightweight design and cross-platform compatibility, YouTube-DLG is an excellent choice for users seeking a simple yet powerful way to access and organize media offline.
    Downloads: 49 This Week
  • 24
    Cute Giraffe

    Qt-based Graphical Interface Wrapper for FFMPEG

    Qt-based graphical interface wrapper for FFmpeg. Cute Giraffe is now part of the LibreEngineering suite: http://sourceforge.net/projects/libreeng/
    Downloads: 0 This Week
  • 25
    EnKoDeur-Mixeur
    EnKoDeur-Mixeur (EKD) is open source software for video, picture, and audio post-production. It can also be used to convert videos to many formats. It is written in Python and uses the PyQt4 bindings.
    Downloads: 4 This Week