Search Results for "audio processing" - Page 6

Showing 308 open source projects for "audio processing"

  • 1
    CSM (Conversational Speech Model)

    A Conversational Speech Generation Model

    The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.
    Downloads: 0 This Week
  • 2
    Drumstick Libraries

    MIDI libraries for Qt/C++

    Drumstick is a tool to play music. This is a set of C++ MIDI libraries using Qt5 objects, idioms and style. It contains a C++ wrapper around the ALSA library sequencer interface; ALSA sequencer provides software support for MIDI technology on Linux. A complementary library provides classes for SMF (Standard MIDI files: .MID/.KAR), and Cakewalk (.WRK) file formats processing. A multiplatform realtime MIDI I/O library is also provided.
    Downloads: 5 This Week
  • 3

    mediaPlaylists

    Creating and maintaining media player playlists for audio and video

    Tools for creating and maintaining media player playlists for audio and video libraries, including support for environments that use Twonky or similar DLNA‑based servers. These command‑line utilities focus on predictable behavior, transparent processing and compatibility with common playlist formats such as M3U and WPL. The design allows additional formats to be added easily without modifying the core workflow.
    Downloads: 1 This Week
  • 4

    Freeverb3_vst

    Freeverb3 DSP VST effect plugins

    The Freeverb3VST is a package of VST DSP effect plugins utilizing the Freeverb3 signal processing library. Many types of audio processing effects including high quality reverbs and impulse response convolution processors are available.
    Downloads: 12 This Week
  • 5
    Yang YouTube Downloader

    Downloads best-quality audio and video from YouTube

    This YouTube downloader fetches the best available streams without re-encoding, which preserves the original quality. VP9 is 35% more efficient than MP4 for video, and some videos are 40-60% smaller in VP9 format. It automatically selects the best-quality video based on file size. It can even combine MP4 video with Opus audio in an MKV file, although not all players support that combination. I haven't seen any other downloader that can produce an MKV file with the best video and audio...
    Downloads: 4 This Week
  • 6
    Qwen Chat

    An AI assistant for everyone, powered by the Qwen series models

    Qwen Chat is a versatile AI assistant powered by the advanced Qwen series models, designed for creativity, collaboration, and problem-solving. It excels at deep reasoning and cognitive tasks, helping users solve complex problems in math, science, coding, and more. The AI supports creative writing by generating narratives, characters, and plot ideas, blending imagination with logical coherence. Qwen Chat’s web search feature delivers fast, accurate, and real-time answers sourced from...
    Downloads: 44 This Week
  • 7
    MLT Multimedia Framework
    A multimedia authoring and processing framework and a video playout server for television broadcasting.
    Downloads: 18 This Week
  • 8
    Free Fps. Video FPS Converter

    Desktop app to change a video FPS

    Free FPS is an open‑source desktop app and scripts to change a video file frame rate (FPS) using FFmpeg. Unlike video editors, it does not add effects or alter content - it only adjusts playback speed and, if needed, re-encodes audio as well. Useful if you work with multiple videos shot at different frame rates that cannot be combined or edited without interpolation or frame loss. Also doubles as a fast video compressor: keep the original FPS and raise compression (e.g., higher CRF or...
    Downloads: 29 This Week
  • 9
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI (local LLMs run on your computer) to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze images locally and propose descriptive rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze document text to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common...
    Downloads: 338 This Week
  • 10
    DiffRhythm

    Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation

    DiffRhythm is an open-source, diffusion-based model designed to generate full-length songs. Focused on music creation, it combines advanced AI techniques to produce coherent and creative audio compositions. The model utilizes a latent diffusion architecture, making it capable of producing high-quality, long-form music. It can be accessed on Huggingface, where users can interact with a demo or download the model for further use. DiffRhythm offers tools for both training and inference, and its...
    Downloads: 1 This Week
  • 11
    Data Crow

    The ultimate cataloguer

    Data Crow allows you to use the standard movie & video (divx, xvid, DVD, Blu-ray, etc.), book (and eBook), image, board game, comic book, game & software, and music (mp3 and other music files) cataloguing modules. Besides these modules, which you can change to fit your requirements, you can create new modules (want to catalogue your stamps, equipment, or anything else?). The GUI is skinnable. Reporting (using JasperReports and their community edition JasperSoft Developer Studio), loan...
    Downloads: 302 This Week
  • 12

    Virtualdub Batch Video DeShake v26.0204

    Batch to compress [and deshake] all videos [or images] in folder

    Installation: Execute "DeShakInst.BAT"
    VirtualDub2 44282; AviSynth+ 3.7.5 updated to C:\DVD
    DESHAK.BAT updated to C:\UT and added to PATH
    Usage: DESHAK task[s] [parameters]
    Tasks:
      tp1: deshake pass1 LOG generation for 2nd pass
      tp2: deshake pass2 and compress video and audio to MP3
      tcomp: compress (no deshake)
      twav: extract WAV and/or uses external WAV audio
    Parameters (more in help):
      vEXT: video extension (ie: vmov), default: vAVI
      qN: h264...
    Downloads: 7 This Week
  • 13
    Kisekae UltraKiss

    Kisekae UltraKiss is a full featured integrated development environment

    UltraKiss is a computer program that implements the Kisekae Set system, KiSS, a Japanese graphics system originally developed to facilitate costume changes on virtual dolls. UltraKiss was developed to help artists build their KiSS sets. It is a full featured viewer for all KiSS dolls, games, and visual applications. It is also a complete graphical development environment for creating KiSS applications. It fully implements the FKiSS event driven programming language up to and including...
    Downloads: 13 This Week
  • 14
    vocal-separate

    An extremely simple tool for separating vocals and background music

    ...Users can drag and drop an audio or video file onto the interface to begin separation, choosing between two, four, or five stems, which allows isolating specific components like vocals, bass, drums, or piano depending on the chosen model. After processing, the tool outputs separate WAV files for each extracted stem, making it easy to export and use in audio editing or remix software.
    Downloads: 7 This Week
  • 15
    Meqaris

    Booking/reservation of meeting rooms/equipment with e-mail invitations

    Meqaris (Meeting Equipment and Room Invitation System) is a system that allows booking meeting/conference rooms and other equipment or resources (like mobile whiteboards, projectors or conference audio/video sets) by using the same type of e-mail invitations that are used to invite participants to meetings. Especially useful in corporate environments, but can be used for anything in general, even by individual users. Simply add "resource participants" to the recipient list (just like...
    Downloads: 0 This Week
  • 16
    downkyi

    Bilibili video downloader supporting 8K, batch, and toolbox tools

    downkyi is an open-source downloader for Bilibili videos. It features a clean UI, QR-code login, batch downloads, support for 8K, HDR, Dolby Vision, audio/video extraction, watermark removal, and subtitle/danmaku retrieval. It leverages aria2c for multi-threaded downloading and FFmpeg for muxing and processing.
    Downloads: 100 This Week
  • 17
    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    Demucs (Deep Extractor for Music Sources) is a deep-learning framework for music source separation—extracting individual instrument or vocal tracks from a mixed audio file. The system is based on a U-Net-like convolutional architecture combined with recurrent and transformer elements to capture both short-term and long-term temporal structure. It processes raw waveforms directly rather than spectrograms, allowing for higher-quality reconstruction and fewer artifacts in separated tracks. The...
    Downloads: 66 This Week
  • 18
    ekho

    Chinese text-to-speech engine

    ekho is a project with relatively sparse documentation, but from the repository it appears to be a small-scale tool for audio processing and playback, possibly with features for speech synthesis or manipulation. The repo includes scripts and configuration files suggesting interactions with media/audio handling libraries. Because of limited README detail, it seems targeted at users comfortable reading and modifying code, rather than end users expecting polished UIs. ...
    Downloads: 7 This Week
  • 19
    find-similar

    User-friendly library to find similar objects

    The mission of the FindSimilar project is to provide a powerful and versatile open source library that empowers developers to efficiently find similar objects and perform comparisons across a variety of data types. Whether dealing with texts, images, audio, or more, our project aims to simplify the process of identifying similarities and enhancing decision-making. https://github.com/findsimilar/find-similar - GitHub repo http://demo.findsimilar.org/ - Demo project and...
    Downloads: 0 This Week
  • 20
    Piper

    A distributed workflow engine

    Piper is a multimedia-focused tool designed to simplify audio and video processing workflows through streamlined command execution. It acts as a wrapper around FFmpeg-like utilities, enabling users to build pipelines for media transformation with reduced complexity. The project emphasizes automation and reproducibility, allowing consistent handling of media tasks across environments. It supports chaining operations such as encoding, filtering, and conversion in a structured manner. ...
    Downloads: 0 This Week
  • 21
    AudioEqualizer
    Introducing AudioEqualize: Elevate Your Audio Experience! AudioEqualize isn't just your average volume adjustment tool; it's a sophisticated audio wizard that goes beyond simple peak amplitude normalization. Designed to enhance your music library, AudioEqualize meticulously analyzes and precisely tunes your MP3 files to a target volume of your choice. Here's why it's the ultimate choice for audio enthusiasts:
    Downloads: 0 This Week
  • 22
    auto-subtitle

    Automatically generate and overlay subtitles for any video

    auto-subtitle is a Python-based command-line tool that automatically generates and overlays subtitles on video files using AI-driven speech recognition. It combines FFmpeg with OpenAI’s Whisper model to transcribe spoken audio into text and synchronize it with video playback. The tool processes video input, extracts audio, and produces subtitle files that can be either exported separately or burned directly into the final video output. It supports multiple transcription models with varying...
    Downloads: 3 This Week
  • 23
    SPTK is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.
    Downloads: 35 This Week
  • 24
    QMDemo

    Some functional modules developed by Qt on a daily basis or demos

    QMDemo is an Android demonstration project that showcases multimedia playback and processing capabilities using native and Java-based components. It is designed as a learning tool for developers exploring video playback, decoding, and rendering pipelines on mobile devices. The project includes examples of handling media streams, managing buffers, and synchronizing audio and video output. It demonstrates integration with multimedia libraries and frameworks to achieve efficient playback performance. ...
    Downloads: 0 This Week
  • 25

    Objective-Oriented Directivity

    MATLAB toolbox for processing directivity models

    The project is a framework developed in the form of a MATLAB toolbox, which aims to provide a common interface for various directivity representations in acoustics. The legacy version was described in paper 10521 at the 151st Audio Engineering Society Convention (https://arxiv.org/abs/2109.14370). A preprint on the current, improved version can be found here: https://arxiv.org/abs/2206.12283. It has not been submitted anywhere yet; please refer to the toolbox by citing this website.
    Downloads: 0 This Week