Search Results for "audio processing" - Page 7

Showing 308 open source projects for "audio processing"

  • 1
    T81 558

    Applications of Deep Neural Networks

    ...Application of these architectures to computer vision, time series, security, natural language processing (NLP), and data generation will be covered. High-Performance Computing (HPC) aspects will demonstrate how deep learning can be leveraged both on graphics processing units (GPUs) and on grids.
    Downloads: 0 This Week
  • 2
    playerdemo

    Android multimedia demonstration project

    playerdemo is an Android multimedia demonstration project that showcases how to build a custom video player using FFmpeg and native rendering techniques. It focuses on implementing the full playback pipeline, including decoding, rendering, and synchronization of audio and video streams. The project demonstrates how to integrate native C/C++ code with Java through JNI to achieve high-performance playback on mobile devices. It includes examples of handling different media formats, managing buffers, and controlling playback states. The architecture is designed for educational purposes, helping developers understand low-level media processing concepts. ...
    Downloads: 0 This Week
  • 3
    Riffusion

    Real-time music generation using stable diffusion techniques

    ...Riffusion (hobby) serves as the core implementation for audio and image processing, providing essential building blocks for generating music from text prompts. It includes both developer-oriented tools and user-facing components such as a command-line interface and an interactive Streamlit application for experimentation. Additionally, it can run as a Flask server to expose model inference through an API, enabling integration with other applications or services.
    Downloads: 5 This Week
  • 4
    Glicol

    Graph-oriented live coding language and music/audio DSP library

    Glicol is a graph-oriented live coding language and audio engine designed for real-time music creation and digital signal processing, written entirely in Rust. It introduces a unique paradigm where audio synthesis and sequencing are represented as interconnected nodes, allowing developers and musicians to construct complex sound pipelines through declarative code. The language is designed to be accessible to beginners while still offering powerful capabilities for advanced users, enabling both quick experimentation and precise control over audio generation. ...
    Downloads: 0 This Week
  • 5
    Sneedacity

    Audio Editor

    ...Macros for chaining commands and batch processing. Scripting in Python, Perl, or any language that supports named pipes. Nyquist: a very powerful built-in scripting language that may also be used to create plug-ins. Editing: multi-track editing with sample accuracy and arbitrary sample rates. Accessibility for VI users. Analysis and visualization tools to analyze audio or other signal data.
    Downloads: 2 This Week
  • 6
    VideoSrt

    Windows-GUI

    ...An open source tool that recognizes speech in video and automatically generates SRT subtitle files. It suits business scenarios that need to generate Chinese/English subtitles and text files for media (video/audio) quickly and in batches. Features: recognize video/audio speech to generate subtitle files (with Chinese-English translation and bilingual subtitles); extract speech text from video/audio; batch translation, filtering, and encoding of SRT subtitle files. It uses the Alibaba Cloud speech recognition interface, so accuracy is high: the recognition rate for standard Mandarin/English is over 95%. ...
    Downloads: 37 This Week
  • 7
    wasmboy

    Game Boy / Game Boy Color Emulator Library

    wasmboy is a Game Boy and Game Boy Color emulator built using WebAssembly and JavaScript, designed to run efficiently in both browsers and Node environments. It leverages modern web technologies such as HTML5 canvas and the Web Audio API to deliver graphics and sound directly within a web interface. The project emphasizes portability and integration, allowing it to be embedded into other applications as a reusable dependency. It supports a wide range of emulator features including save...
    Downloads: 0 This Week
  • 8
    Piano transcription

    Task of transcribing piano recordings into MIDI files

    Piano transcription is an open-source high-resolution piano transcription system by ByteDance that converts raw audio recordings of piano performance into symbolic MIDI files — detecting note onsets, offsets, pitch, velocity, and even pedal usage. The system is implemented in Python (PyTorch) and is capable of accurate transcription of polyphonic piano recordings, even with complex passages and pedal techniques, making it suitable for classical piano music. By using this transcription tool,...
    Downloads: 2 This Week
  • 9

    AhoTTS - TTS for Basque and Spanish

    Text-to-Speech for Basque and Spanish

    Text-to-Speech converter for Basque and Spanish. It includes linguistic processing and built voices for both languages. Its acoustic engine is based on hts_engine, and it uses a high-quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/
    Downloads: 1 This Week
  • 10
    AutoSub

    A CLI script to generate subtitle files (SRT/VTT/TXT) for any video

    AutoSub is a Python-based tool designed to automatically generate subtitles for video or audio content using speech recognition technology. It processes media files by extracting audio, transcribing spoken content, and generating subtitle files in standard formats. The tool supports multiple languages and can integrate with translation systems to produce subtitles in different languages. It is designed for automation, allowing batch processing of multiple media files. ...
    Downloads: 4 This Week
  • 11
    Music Source Separation

    Separate audio recordings into individual sources

    Music Source Separation is a PyTorch-based open-source implementation for the task of separating a music (or audio) recording into its constituent sources — for example isolating vocals, instruments, bass, accompaniment, or background from a mixed track. It aims to give users the ability to take any existing song and decompose it into separate stems (vocals, accompaniment, etc.), or to train custom separation models on their own datasets (e.g. for speech enhancement, instrument isolation, or...
    Downloads: 6 This Week
  • 12
    SVoice (Speech Voice Separation)

    We provide a PyTorch implementation of the paper Voice Separation

    ...The repository includes all necessary scripts for training, dataset preparation, distributed training, evaluation, and audio separation.
    Downloads: 2 This Week
  • 13
    hora

    Efficient approximate nearest neighbor search algorithm collections

    hora is an open-source high-performance vector similarity search library designed for large-scale machine learning and information retrieval systems. The project focuses on approximate nearest neighbor search, a fundamental technique used in modern AI applications such as recommendation systems, image search, and semantic search engines. Hora implements multiple efficient indexing algorithms that allow systems to rapidly search through high-dimensional vectors produced by machine learning...
    Downloads: 0 This Week
  • 14
    Vivid 3D

    Vivid is a modern C++ 3D engine using OpenGL4+

    Vivid is a modern C++ 3D engine using OpenGL4+. It is written using Visual C++ 2022 and relies on several open source projects to achieve its goal of making it easy and fun to make modern games.
    Downloads: 0 This Week
  • 15
    AAXtoMP3

    Convert Audible's .aax filetype to MP3, FLAC, M4A, or OPUS

    ...AAXtoMP3 supports batch processing, enabling users to convert multiple files in a single workflow. Its minimal setup and script-based usage make it suitable for automation and integration into personal media pipelines. Overall, it provides a practical solution for managing audiobook libraries in open formats.
    Downloads: 2 This Week
  • 16
    Telegram WebRTC (VoIP)

    Voice chats, private incoming and outgoing calls in Telegram

    Telegram WebRTC (VoIP) is a Python and C++ library that enables real-time voice and video communication features for Telegram bots and clients. It provides an interface for joining, managing, and streaming audio or video in Telegram group calls and voice chats. The library is built on top of low-level communication protocols, ensuring efficient handling of real-time media streams. It supports integration with FFmpeg and other tools for processing audio and video before transmission. tgcalls allows developers to create bots that can play music, stream content, or interact with live voice channels programmatically. ...
    Downloads: 1 This Week
  • 17
    Beep

    A little package that brings sound to any Go application

    A little package that brings sound to any Go application. Suitable for playback and audio processing. Beep is built on top of its Streamer interface, which is like io.Reader, but for audio. It was one of the best design decisions I've ever made and it enabled all the rest of the features to naturally come together with not much code. Decode and play WAV, MP3, OGG, and FLAC. Encode and save WAV. Very simple API. Limiting the support to stereo (two channel) audio made it possible to simplify the architecture and the API. ...
    Downloads: 0 This Week
  • 18
    Delphi ASIO & VST Packages
    With these packages for Delphi, the user can create VST plugins or ASIO applications within minutes. The included algorithms for filters and dynamics help build effects without much knowledge of digital signal processing.
    Downloads: 44 This Week
  • 19
    VAD

    Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM

    This repository is a voice activity detection (VAD) toolkit that implements multiple models (DNN, bDNN, LSTM, ACAM) for detecting speech versus non-speech in audio. It also provides a dataset recorded in varied real-world settings (e.g. bus stop, construction site, park, room) with ground-truth labels. It includes acoustic feature extraction (multi-resolution cochleagram, MRCG) and post-processing modules (e.g. smoothing, thresholds). The toolkit supports both MATLAB and Python/TensorFlow components (for feature extraction, classification, post-processing). ...
    Downloads: 1 This Week
  • 20
    ACToolkit's temporary homepage

    Max-objects-&-patches for Algorithmic Composition and statistical DSP

    It was formerly jey-Toolkit. Now it's renamed and also being distributed by Cycling '74. Through the Files menu that appears above, the same package as released at the Package Manager of Max 7/8 and the beta-version update(s) are accessible. PLEASE NOTE that, on Max 6.1.10, some Jitter features and GEN~ objects that the patches included in the package use WON'T be working. The name of our package ACToolkit is a derivative of AC Toolbox, the legendary "Algorithmic music composition program...
    Downloads: 9 This Week
  • 21
    Data augmentation

    List of useful data augmentation resources

    List of useful data augmentation resources. You will find here some links to more or less popular GitHub repos, libraries, papers, and other information. Data augmentation can be simply described as any method that makes our dataset larger. To create more images, for example, we could zoom in and save the result, change the brightness of the image, or rotate it. To get a bigger sound dataset, we could try to raise or lower the pitch of the audio sample or slow down/speed up....
    Downloads: 0 This Week
  • 22
    CHOW Phaser

    Phaser effect based loosely on the Schulte Compact Phasing 'A'

    ChowPhaser is an open-source audio plugin that emulates the classic Schulte Compact Phasing 'A' effect. It offers a unique phasing effect with nonlinear feedback and modulation capabilities, suitable for various audio processing applications.
    Downloads: 0 This Week
  • 23
    Denoiser

    Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

    ...The implementation includes data augmentation techniques applied to the raw waveforms (e.g. noise mixing, reverberation) to improve model robustness and generalization to diverse noise types. The project supports both offline denoising (batch inference) and live audio processing (e.g. via loopback audio interfaces), making it practical for real-time use in calls or recording. The codebase includes training and evaluation scripts, configuration management via Hydra, and pretrained models on standard noise datasets.
    Downloads: 3 This Week
  • 24
    Live Transcribe Speech Engine

    Live Transcribe is an Android application

    ...Its design prioritizes latency and robustness in noisy, far-field environments, enabling continuous transcription with low delay on mobile hardware. The engine manages audio front-end processing—such as noise suppression and voice activity detection—before feeding audio into compact, accurate acoustic and language models. Partial hypotheses stream as words are recognized, then stabilize with minimal jitter as confidence increases, which is crucial for usability. The code emphasizes efficient use of CPU and neural accelerators to balance battery life with responsiveness. ...
    Downloads: 0 This Week
  • 25
    X32 Scene Parser

    An X32 scene management tool

    This parsing tool can be used to extract sections of a Behringer X32 or Midas M32 scene file in order to create specialized snippets.
    Downloads: 11 This Week