Showing 39 open source projects for "music tools"

  • 1
    YouTube Music Downloader

    A simple app to get songs from YouTube in mp3 format with artist name

    YouTube Music Downloader is a command-line music downloader written in Python that retrieves audio from YouTube and enriches it with detailed metadata from external sources. It combines tools like yt-dlp and FFmpeg to extract high-quality audio while automatically tagging files with artist name, album, release date, and artwork. The application distinguishes itself by integrating metadata providers such as Spotify and iTunes, ensuring that downloaded tracks resemble properly organized music library entries. ...
    Downloads: 64 This Week
  • 2
    Librosa

    Python library for audio and music analysis

    Librosa is a powerful Python library for analyzing and processing audio and music signals. Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.
    Downloads: 7 This Week
  • 3
    ACE-Step 1.5

    The most powerful local music generation model

    ACE-Step 1.5 is an advanced open-source foundation model for AI-driven music generation that pushes beyond traditional limitations in speed, musical coherence, and controllability by innovating in architecture and training design. It integrates cutting-edge generative techniques—such as diffusion-based synthesis combined with compressed autoencoders and lightweight transformer elements—to produce high-quality full-length music tracks with rapid inference times, capable of generating a...
    Downloads: 56 This Week
  • 4
    YuE

    Open source AI model for generating full songs from lyrics prompts

    ...YuE introduces a family of models built on large language model architectures that process music generation as a sequence prediction task. YuE also incorporates techniques such as track-decoupled prediction and progressive conditioning to help manage complex audio signals and maintain consistency throughout long compositions. It includes inference scripts, prompt examples, evaluation tools, and training components that enable researchers and developers to experiment with AI-based music.
    Downloads: 8 This Week
  • 5
    pyAudioAnalysis

    Python Audio Analysis Library: Feature Extraction, Classification

    ...It also includes utilities for visualizing audio features and analyzing patterns within sound recordings, which can be useful in applications such as speech recognition, music classification, and acoustic event detection. Because the library integrates machine learning algorithms with signal processing tools, it enables researchers to develop complete audio analysis pipelines using a single framework.
    Downloads: 2 This Week
  • 6
    AudioMuse-AI

    AudioMuse-AI is an Open Source Dockerized environment

    ...AudioMuse-AI integrates with several popular self-hosted music servers including Jellyfin, Navidrome, and Emby, allowing users to extend existing media servers with advanced AI-powered recommendation capabilities. The system uses machine learning and audio analysis tools such as Librosa and ONNX models to extract features directly from audio tracks.
    Downloads: 2 This Week
  • 7
    FeelUOwn

    Trying to be a robust, user-friendly and hackable music player

    FeelUOwn is a user-friendly and hackable music player.
    Downloads: 0 This Week
  • 8
    LTX-2.3

    Official Python inference and LoRA trainer package

    LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...
    Downloads: 119 This Week
  • 9
    MuseGAN

    An AI for Music Generation

    MuseGAN is a deep learning research project designed to generate symbolic music using generative adversarial networks. The system focuses specifically on generating multi-track polyphonic music, meaning that it can simultaneously produce multiple instrument parts such as drums, bass, piano, guitar, and strings. Instead of generating raw audio, the model operates on piano-roll representations of music, which encode notes as time-pitch matrices for each instrument track. This representation...
    Downloads: 2 This Week
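The piano-roll idea in the MuseGAN entry can be pictured with a small sketch. This is not MuseGAN's code: the function name `notes_to_piano_roll`, the note tuples, and the track names are hypothetical and chosen only for illustration. It assumes each note is given as (start step, end step, MIDI pitch) and builds one binary time-pitch matrix per instrument track.

```python
from typing import Dict, List, Tuple

# Hypothetical note format: (start_step, end_step, midi_pitch), end exclusive.
Note = Tuple[int, int, int]


def notes_to_piano_roll(notes: List[Note], n_steps: int,
                        n_pitches: int = 128) -> List[List[int]]:
    """Build an n_steps x n_pitches binary matrix.

    A 1 at [t][p] means pitch p is sounding at time step t.
    """
    roll = [[0] * n_pitches for _ in range(n_steps)]
    for start, end, pitch in notes:
        # Clamp the note to the matrix so out-of-range steps are ignored.
        for t in range(max(start, 0), min(end, n_steps)):
            roll[t][pitch] = 1
    return roll


# Made-up two-track example: one matrix per instrument.
tracks: Dict[str, List[Note]] = {
    "bass": [(0, 4, 36), (4, 8, 43)],
    "piano": [(0, 2, 60), (0, 2, 64), (2, 4, 67)],
}
piano_rolls = {name: notes_to_piano_roll(notes, n_steps=8)
               for name, notes in tracks.items()}
```

Stacking the per-track matrices gives a multi-track representation in which simultaneous notes (the two piano pitches at step 0) are simply several 1s in the same row.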
  • 10
    Whisper-WebUI

    A Web UI for easy subtitle generation using the Whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.
    Downloads: 9 This Week
  • 11
    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, and similar fields. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. It produces high-quality 48 kHz audio output suitable for professional...
    Downloads: 0 This Week
  • 12
    MiniMax-MCP

    Official MiniMax Model Context Protocol (MCP) server

    MiniMax-MCP is the official Model Context Protocol (MCP) server for accessing MiniMax’s multimodal generative APIs from MCP-compatible clients. It acts as a bridge between tools like Claude Desktop, Cursor, Windsurf, OpenAI Agents, and the MiniMax platform, exposing capabilities such as text-to-speech, voice cloning, image generation, text-to-image, video generation, image-to-video, text-to-video, and music generation. The server is written in Python and distributed under the MIT license, with a pyproject.toml and uv-based workflow that makes installation and execution reproducible. ...
    Downloads: 0 This Week
  • 13
    Kimi-Audio

    Audio foundation model excelling in audio understanding

    Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one...
    Downloads: 0 This Week
  • 14
    AudioCraft

    AudioCraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. ...
    Downloads: 1 This Week
  • 15
    DiffRhythm

    Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation

    DiffRhythm is an open-source, diffusion-based model designed to generate full-length songs. Focused on music creation, it combines advanced AI techniques to produce coherent and creative audio compositions. The model uses a latent diffusion architecture, which makes it capable of producing high-quality, long-form music. It is available on Hugging Face, where users can try a demo or download the model for further use. DiffRhythm offers tools for both training and inference, and its flexibility makes it well suited to AI-based music production and research in music generation.
    Downloads: 5 This Week
  • 16
    MagicBox Player

    Magic Box 🎶: The Open-Source Multimedia Player

    Magic Box is a versatile, custom-built media player for desktop environments, blending a classic interface with powerful, modern features. Developed in Python with PyQt5, it supports a wide range of audio and video formats.

    Key Features:
    Dynamic Visualizer: Features a real-time, custom FFT audio spectrum visualizer that monitors system loopback audio, providing vibrant, data-driven feedback (requires manual loopback setup like...
    Downloads: 9 This Week
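As a rough illustration of what a spectrum visualizer like the one mentioned in the MagicBox Player entry computes for each audio frame, the sketch below turns a block of samples into per-frequency magnitudes. It is not MagicBox's code: it uses a naive O(n^2) discrete Fourier transform rather than a true FFT so that it needs only the standard library, and the sample rate, frame size, and test tone are assumed values.

```python
import cmath
import math
from typing import List


def magnitude_spectrum(samples: List[float]) -> List[float]:
    """Naive DFT of a real signal; returns magnitudes for bins 0..n/2.

    A real visualizer would use an FFT, which gives the same values faster.
    """
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags


SAMPLE_RATE = 8000  # Hz (assumed)
N = 64              # samples per frame (assumed)
TONE_HZ = 1000      # synthetic test tone standing in for loopback audio

frame = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(N)]
spectrum = magnitude_spectrum(frame)

# Each bin covers SAMPLE_RATE / N Hz; the tallest bar marks the tone.
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / N
```

With these values each bin spans 125 Hz, so the 1000 Hz tone falls exactly on one bin; a visualizer would draw one bar per bin and repeat the computation for every new frame.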
  • 17
    Equestria OS

    An Arch Linux OS with 20+ custom GUI utilities & MLP theme customizer.

    A user-friendly Linux distribution based on Arch Linux and KDE Plasma 6. Designed by a solo creator as a passion project, this lightweight operating system eliminates the need for terminal commands by offering over 20 unique, built-in graphical utilities for effortless computer management. Equestria OS makes your computer feel like home. It comes with a deeply integrated Proton Engine, allowing you to run your favorite Windows apps and games (.exe) with a simple double-click in secure,...
    Downloads: 36 This Week
  • 18

    Ultimate Media Downloader

    An open-source media downloader for video and audio

    Ultimate Media Downloader (UMD) is a professional-grade, open-source command-line tool that consolidates media downloading across 1000+ platforms into a single, intuitive interface. Built with Python and powered by industry-standard extraction engines, it delivers enterprise-level capabilities with consumer-friendly simplicity. Whether you're downloading a single YouTube video, extracting audio from Spotify playlists, archiving TikTok content, or batch-processing entire music libraries, UMD...
    Downloads: 0 This Week
  • 19
    4allDownloader Converter

    Video, audio, and file downloader and converter with a built-in browser and AI.

    The most advanced downloader and converter tool, combining the power of leading open-source technologies under an intuitive GUI. Download video and audio from 10,000+ sites, as well as files from any platform, with advanced format conversion capabilities. Features a built-in browser with JavaScript injection support that remembers logins while maintaining complete privacy. Five powerful tabs streamline your workflow: Home tab for pasting URLs, channels, playlists, and direct search; Browser...
    Downloads: 13 This Week
  • 20
    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    ...The repository includes pretrained models for common tasks such as isolating vocals, drums, bass, and accompaniment from stereo music, achieving state-of-the-art results in benchmarks like MUSDB18. Demucs supports GPU-accelerated inference and can process multi-channel audio with chunked streaming for real-time or batch operation. It also provides training scripts and utilities to fine-tune on custom datasets, along with remixing and enhancement tools.
    Downloads: 101 This Week
  • 21
    Super Easy AI Installer Tool

    Application that simplifies the installation of AI-related projects

    ...The tool is designed to provide an easy-to-use solution for accessing and installing AI repositories with little to no technical hassle. It automatically handles the installation process, making it easier for users to access and use AI tools. "Super Easy AI Installer Tool" is currently in an early development phase and may have a few bugs, but it remains a great solution for users with minimal technical knowledge or expertise. Fixes are underway. A tool that can generate animations and music from text, ideal for producing short videos and GIFs, as well as creating brief cinematic scenes.
    Downloads: 2 This Week
  • 22
    Audio Webui

    A web UI for different audio-related neural networks

    Audio Webui is a Gradio-based web user interface that unifies a wide range of audio-related neural networks under a single, accessible front end. It is designed as an “all-in-one” environment where users can experiment with text-to-speech, voice cloning, generative music, and other neural audio models without writing boilerplate code. The project supports multiple back-end models and toolchains (such as Bark, RVC, AudioLDM, Audiocraft, and other text-to-audio or voice-cloning tools), exposing them through a consistent UI for inference and experimentation. Installation is streamlined through automatic installers and platform-specific scripts that create a virtual environment, install dependencies, and launch the web app with minimal manual setup. ...
    Downloads: 0 This Week
  • 23
    Riffusion

    Real-time AI music generation using Stable Diffusion techniques

    ...Riffusion (hobby) serves as the core implementation for audio and image processing, providing essential building blocks for generating music from text prompts. It includes both developer-oriented tools and user-facing components such as a command-line interface and an interactive Streamlit application for experimentation. Additionally, it can run as a Flask server to expose model inference through an API, enabling integration with other applications or services.
    Downloads: 3 This Week
  • 24
    dupeGuru

    Find duplicate files

    dupeGuru is a cross-platform GUI application written in Python (with Qt/Cocoa UI) that quickly detects duplicate files on your computer using flexible scanning modes, including filename fuzzy matching, content comparison, and specialized Music/Picture modes. On some Linux systems, pyrcc5 is not put on the path when python3-pyqt5 is installed, which causes issues with the resource files (and icons). These systems should have a corresponding pyqt5-dev-tools package, which should also be installed. The presence of pyrcc5 can be checked by running "which pyrcc5". Debian-based systems need the extra package; Arch does not. ...
    Downloads: 189 This Week
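The pyrcc5 check described in the dupeGuru entry can also be done from Python. The sketch below is one possible way to do it, not part of dupeGuru: `find_tool` is a hypothetical helper name, and `shutil.which` from the standard library plays the role of the shell's `which` command.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


path = find_tool("pyrcc5")
if path is None:
    # Package name taken from the entry above; it applies to Debian-based systems.
    print("pyrcc5 not found on PATH; consider installing pyqt5-dev-tools")
else:
    print(f"pyrcc5 found at {path}")
```

Running this before a build gives a clearer message than a later failure while compiling the resource files.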
  • 25
    Telegram WebRTC (VoIP)

    Voice chats, private incoming and outgoing calls in Telegram

    ...The library is built on top of low-level communication protocols, ensuring efficient handling of real-time media streams. It supports integration with FFmpeg and other tools for processing audio and video before transmission. tgcalls allows developers to create bots that can play music, stream content, or interact with live voice channels programmatically. It also supports cross-platform usage and integration into larger automation systems. Overall, it serves as a powerful toolkit for building real-time communication features within Telegram ecosystems.
    Downloads: 0 This Week