Search Results for "audio processing" - Page 7

Showing 374 open source projects for "audio processing"

  • 1
    Meqaris

    Booking/reservation of meeting rooms/equipment with e-mail invitations

    Meqaris (Meeting Equipment and Room Invitation System) is a system that allows booking meeting/conference rooms and other equipment or resources (like mobile whiteboards, projectors or conference audio/video sets) by using the same type of e-mail invitations that are used to invite participants to meetings. Especially useful in corporate environments, but can be used for anything in general, even by individual users. Simply add "resource participants" to the recipient list (just like...
    Downloads: 0 This Week
  • 2
    downkyi

    Bilibili video downloader supporting 8K, batch, and toolbox tools

    downkyi is an open-source downloader for Bilibili videos. It features a clean UI, QR-code login, batch downloads, support for 8K, HDR, Dolby Vision, audio/video extraction, watermark removal, and subtitle/danmaku retrieval. It leverages aria2c for multi-threaded downloading and FFmpeg for muxing and processing.
    Downloads: 88 This Week
  • 3
    LoopAuditioneer

    Software for loop and cue handling in .wav files.

    As of 2024-02-28, development of LoopAuditioneer has moved to GitHub. See https://github.com/GrandOrgue/LoopAuditioneer for recent updates and releases.
    Downloads: 53 This Week
  • 4
    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    Demucs (Deep Extractor for Music Sources) is a deep-learning framework for music source separation—extracting individual instrument or vocal tracks from a mixed audio file. The system is based on a U-Net-like convolutional architecture combined with recurrent and transformer elements to capture both short-term and long-term temporal structure. It processes raw waveforms directly rather than spectrograms, allowing for higher-quality reconstruction and fewer artifacts in separated tracks. The...
    Downloads: 70 This Week
  • 5
    ekho

    Chinese text-to-speech engine

    ekho is a project with relatively sparse documentation, but from the repository it appears to be a small-scale tool for audio processing and playback, possibly with features for speech synthesis or manipulation. The repo includes scripts and configuration files suggesting interactions with media/audio handling libraries. Because of limited README detail, it seems targeted at users comfortable reading and modifying code, rather than end users expecting polished UIs. ...
    Downloads: 7 This Week
  • 6
    find-similar

    User-friendly library to find similar objects

    The mission of the FindSimilar project is to provide a powerful and versatile open source library that empowers developers to efficiently find similar objects and perform comparisons across a variety of data types. Whether dealing with texts, images, audio, or more, our project aims to simplify the process of identifying similarities and enhancing decision-making. https://github.com/findsimilar/find-similar - GitHub repo http://demo.findsimilar.org/ - Demo project and...
    Downloads: 0 This Week
  • 7
    Piper

    A distributed workflow engine

    Piper is a multimedia-focused tool designed to simplify audio and video processing workflows through streamlined command execution. It acts as a wrapper around FFmpeg-like utilities, enabling users to build pipelines for media transformation with reduced complexity. The project emphasizes automation and reproducibility, allowing consistent handling of media tasks across environments. It supports chaining operations such as encoding, filtering, and conversion in a structured manner. ...
    Downloads: 0 This Week
  • 8
    AudioEqualizer
    AudioEqualize goes beyond simple peak-amplitude normalization: it analyzes your MP3 files and tunes them to a target volume of your choice.
    Downloads: 0 This Week
  • 9
    SPTK is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.
    Downloads: 19 This Week
  • 10
    QMDemo

    Functional modules and demos developed with Qt in day-to-day work

    QMDemo is an Android demonstration project that showcases multimedia playback and processing capabilities using native and Java-based components. It is designed as a learning tool for developers exploring video playback, decoding, and rendering pipelines on mobile devices. The project includes examples of handling media streams, managing buffers, and synchronizing audio and video output. It demonstrates integration with multimedia libraries and frameworks to achieve efficient playback performance. ...
    Downloads: 0 This Week
  • 11

    Objective-Oriented Directivity

    MATLAB toolbox for processing directivity models

    The project is a framework, developed as a MATLAB toolbox, that aims to provide a common interface for various directivity representations in acoustics. The legacy version was described in paper 10521 at the 151st Audio Engineering Society Convention (https://arxiv.org/abs/2109.14370). A preprint on the current, improved version can be found here: https://arxiv.org/abs/2206.12283. The preprint is currently not submitted anywhere; please refer to the toolbox by citing this website.
    Downloads: 0 This Week
  • 12
    T81 558

    Applications of Deep Neural Networks

    ...Application of these architectures to computer vision, time series, security, natural language processing (NLP), and data generation will be covered. High-Performance Computing (HPC) aspects will demonstrate how deep learning can be leveraged both on graphics processing units (GPUs) and on grids.
    Downloads: 0 This Week
  • 13
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you...
    Downloads: 3 This Week
  • 14
    playerdemo

    Android multimedia demonstration project

    playerdemo is an Android multimedia demonstration project that showcases how to build a custom video player using FFmpeg and native rendering techniques. It focuses on implementing the full playback pipeline, including decoding, rendering, and synchronization of audio and video streams. The project demonstrates how to integrate native C/C++ code with Java through JNI to achieve high-performance playback on mobile devices. It includes examples of handling different media formats, managing buffers, and controlling playback states. The architecture is designed for educational purposes, helping developers understand low-level media processing concepts. ...
    Downloads: 0 This Week
  • 15
    Riffusion

    Real-time music generation using stable diffusion techniques AI

    ...Riffusion (hobby) serves as the core implementation for audio and image processing, providing essential building blocks for generating music from text prompts. It includes both developer-oriented tools and user-facing components such as a command-line interface and an interactive Streamlit application for experimentation. Additionally, it can run as a Flask server to expose model inference through an API, enabling integration with other applications or services.
    Downloads: 5 This Week
  • 16
    Glicol

    Graph-oriented live coding language and music/audio DSP library

    Glicol is a graph-oriented live coding language and audio engine designed for real-time music creation and digital signal processing, written entirely in Rust. It introduces a unique paradigm where audio synthesis and sequencing are represented as interconnected nodes, allowing developers and musicians to construct complex sound pipelines through declarative code. The language is designed to be accessible to beginners while still offering powerful capabilities for advanced users, enabling both quick experimentation and precise control over audio generation. ...
    Downloads: 0 This Week
  • 17
    Sneedacity

    Audio Editor

    ...Macros for chaining commands and batch processing. Scripting in Python, Perl, or any language that supports named pipes. Nyquist: a very powerful built-in scripting language that may also be used to create plug-ins. Editing: multi-track editing with sample accuracy and arbitrary sample rates. Accessibility for VI users. Analysis and visualization tools to analyze audio or other signal data.
    Downloads: 1 This Week
  • 18
    VideoSrt

    Windows-GUI

    ...Open source software tool that recognizes speech in video and automatically generates SRT subtitle files. It suits business scenarios that need fast, batch generation of Chinese/English subtitles and text files for media (video/audio). Features: recognize video/audio speech to generate subtitle files (with Chinese-English translation and bilingual subtitles); extract speech text from video/audio; batch translation, filtering, and encoding of SRT subtitle files. It uses the Alibaba Cloud speech recognition interface, which gives high accuracy: the recognition rate for standard Mandarin/English is over 95%. ...
    Downloads: 29 This Week
  • 19
    xt7-player aims to be a complete GUI for MPlayer with library and playlist support, built with usability in mind.
    Downloads: 0 This Week
  • 20
    wasmboy

    Game Boy / Game Boy Color Emulator Library

    wasmboy is a Game Boy and Game Boy Color emulator built using WebAssembly and JavaScript, designed to run efficiently in both browsers and Node environments. It leverages modern web technologies such as HTML5 canvas and the Web Audio API to deliver graphics and sound directly within a web interface. The project emphasizes portability and integration, allowing it to be embedded into other applications as a reusable dependency. It supports a wide range of emulator features including save...
    Downloads: 0 This Week
  • 21
    Piano transcription

    Task of transcribing piano recordings into MIDI files

    Piano transcription is an open-source high-resolution piano transcription system by ByteDance that converts raw audio recordings of piano performance into symbolic MIDI files — detecting note onsets, offsets, pitch, velocity, and even pedal usage. The system is implemented in Python (PyTorch) and is capable of accurate transcription of polyphonic piano recordings, even with complex passages and pedal techniques, making it suitable for classical piano music. By using this transcription tool,...
    Downloads: 2 This Week
  • 22

    AhoTTS - TTS for Basque and Spanish

    Text-to-Speech for Basque and Spanish

    Text-to-Speech converter for Basque and Spanish. It includes linguistic processing and built voices for both languages. Its acoustic engine is based on hts_engine and it uses a high-quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/
    Downloads: 0 This Week
  • 23
    AutoSub

    A CLI script to generate subtitle files (SRT/VTT/TXT) for any video

    AutoSub is a Python-based tool designed to automatically generate subtitles for video or audio content using speech recognition technology. It processes media files by extracting audio, transcribing spoken content, and generating subtitle files in standard formats. The tool supports multiple languages and can integrate with translation systems to produce subtitles in different languages. It is designed for automation, allowing batch processing of multiple media files. ...
    Downloads: 4 This Week
  • 24
    Music Source Separation

    Separate audio recordings into individual sources

    Music Source Separation is a PyTorch-based open-source implementation for the task of separating a music (or audio) recording into its constituent sources — for example isolating vocals, instruments, bass, accompaniment, or background from a mixed track. It aims to give users the ability to take any existing song and decompose it into separate stems (vocals, accompaniment, etc.), or to train custom separation models on their own datasets (e.g. for speech enhancement, instrument isolation, or...
    Downloads: 6 This Week
  • 25
    SVoice (Speech Voice Separation)

    We provide a PyTorch implementation of the paper Voice Separation

    ...The repository includes all necessary scripts for training, dataset preparation, distributed training, evaluation, and audio separation.
    Downloads: 0 This Week