audio analysis free download

29 projects for "audio analysis" with 2 filters applied:

Filters: Artificial Intelligence, BSD

1

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification

pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio segmentation. ...

Downloads: 1 This Week

Last Update: 2026-03-10
2

AudioMuse-AI

AudioMuse-AI is an open-source, Dockerized environment

...AudioMuse-AI integrates with several popular self-hosted music servers including Jellyfin, Navidrome, and Emby, allowing users to extend existing media servers with advanced AI-powered recommendation capabilities. The system uses machine learning and audio analysis tools such as Librosa and ONNX models to extract features directly from audio tracks.

Downloads: 10 This Week

Last Update: 2 days ago
3

NeuralNote

Audio Plugin for Audio to MIDI transcription using deep learning

NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. ...

Downloads: 91 This Week

Last Update: 2026-03-12
4

Ultravox

Fast multimodal LLM for real-time voice interaction and AI apps

...Ultravox is optimized for low latency, achieving fast response times suitable for interactive voice agents and real-time applications. It supports use cases such as conversational AI agents, speech-to-speech translation, and analysis of spoken audio content. Ultravox also includes tooling and configuration systems for training, evaluation, and dataset integration.

Downloads: 0 This Week

Last Update: 2026-03-18
5

MediaPipe Solutions

Cross-platform, customizable ML solutions

...MediaPipe is widely used in computer vision and multimedia applications such as hand tracking, face detection, pose estimation, object recognition, and gesture analysis. The framework includes prebuilt solutions that developers can quickly integrate into applications as well as lower-level APIs that allow custom pipeline construction.

Downloads: 1 This Week

Last Update: 2026-04-23
6

SALMONN family

A suite of advanced multi-modal LLMs

SALMONN is a family of advanced multi-modal large language models (LLMs) developed by ByteDance — designed to handle and integrate multiple data modalities (e.g. text, audio, video) rather than just plain text. The repository bundles different branches targeting specialized tasks (e.g. video-SALMONN, speech-quality assessment, general multimodal tasks), suggesting that the project is modular and extensible across domains. SALMONN aims to push the frontier of multi-modal AI by allowing models...

Downloads: 0 This Week

Last Update: 2026-04-20
7

Vidi2

Large Multimodal Models for Video Understanding and Editing

...Vidi targets applications like intelligent video editing, automated video search, content analysis, and editing assistance, enabling users to efficiently locate relevant segments and objects in hours-long footage. The system is built with open-source release in mind, giving developers access to model code, inference scripts, and evaluation pipelines so they can reproduce research results or integrate Vidi into their own video-processing workflows.

Downloads: 1 This Week

Last Update: 2026-03-04
8

Amphion

Toolkit for audio, music, and speech generation

Amphion is a toolkit from OpenMMLab dedicated to audio, music, and speech generation, aimed at both reproducible research and helping newcomers get started in generative audio. It provides standardized implementations and recipes for classic and state-of-the-art generative models in audio, including TTS, music generation, and voice conversion. A distinctive feature of Amphion is its emphasis on visualization: it offers interactive visualizations of model architectures and generation...

Downloads: 1 This Week

Last Update: 2025-11-28
9

AudioMuse-AI

AudioMuse-AI is an open-source, Dockerized environment that brings automatic playlist generation to your self-hosted music library. Using tools such as Librosa and ONNX, it performs sonic analysis on your audio files locally, allowing you to curate playlists for any mood or occasion without relying on external APIs. Deploy it easily on your local machine with Docker Compose or Podman, or scale it in a Kubernetes cluster (supports AMD64 and ARM64). It integrates with the main music servers' APIs such as Jellyfin, Navidrome, LMS, Lyrion, and Emby. ...

Downloads: 0 This Week

Last Update: 2026-02-01
10

sourcesinc

Source code from the Research Institute for Signals, Systems and Computational Intelligence http://fich.unl.edu.ar/sinc

Downloads: 8 This Week

Last Update: 2023-12-05
11

Common Resource Grep - crgrep

Common Resource Grep

CRGREP searches for matching text in databases, various document formats, archives, and other difficult-to-access resources. It is a command-line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies, and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, for example text within a document within a zip archive. Here you...

3 Reviews

Downloads: 0 This Week

Last Update: 2023-04-23
12

Piano transcription

Task of transcribing piano recordings into MIDI files

...By using this transcription tool, users can transform live performance audio (or recordings) into editable, machine-readable MIDI — enabling tasks such as analysis, editing, remixing, or generation of piano music. The authors used this system to build a large-scale classical piano MIDI dataset (see next project), but as a standalone tool it enables researchers, musicians, or hobbyists to transcribe their own piano recordings automatically.

Downloads: 3 This Week

Last Update: 2025-12-02
13

RAVL, Recognition And Vision Library.

General C++ Library, with modules for Computer Vision, Pattern Recognition and much more.

Downloads: 0 This Week

Last Update: 2020-04-22
14

jMIR

Music research software

jMIR is an open-source software suite implemented in Java for use in music information retrieval (MIR) research. It can be used to study music in the form of audio recordings, symbolic encodings and lyrical transcriptions, and can also mine cultural information from the Internet. It also includes tools for managing and profiling large music collections and for checking audio for production errors. jMIR includes software for extracting features, applying machine learning algorithms, applying...

3 Reviews

Downloads: 10 This Week

Last Update: 2018-06-25
15

Modular Audio Recognition Framework

MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.

3 Reviews

Downloads: 39 This Week

Last Update: 2015-10-06
16

Accelerated Feature Extraction Tool

A fast GPU-accelerated feature extraction software for speech analysis

A fast feature extraction software tool for speech analysis and processing. It incorporates standard MFCC, PLP, and TRAPS features. The tool is specially designed to process very large audio data sets. It uses GPU acceleration if a compatible GPU is available (CUDA as well as OpenCL; NVIDIA, AMD, and Intel GPUs are supported). The CPU SSE intrinsic instruction set is used when no compatible GPU is present.

1 Review

Downloads: 0 This Week

Last Update: 2015-05-25
17

millosh's workshop

A collection of software made by Milos Rancic.

Downloads: 1 This Week

Last Update: 2016-09-23
18

Voice Recognition Algorithm

1.) Investigation of the cosine transform and inverse transform algorithm, with some voice recognition code. 2.) Translator: Croatian, English. 3.) 2D-to-3D picture algorithm (principle) and new 2D-to-3D video conversion code with AviSynth video scripting.

Downloads: 0 This Week

Last Update: 2015-12-02
19

Diglo

Diglo is a music information retrieval system based on computer vision and audio spectrum analysis, using algorithmic operations to find emergent patterns in musical performance. It also functions as a low-cost motion capture analysis system.

Downloads: 0 This Week

Last Update: 2015-11-09
20

Feature Extraction plugin API

Easy-to-use platform-independent plugin API for the extraction of low-level features from audio data in PCM format, as required in the context of music information retrieval software.

Downloads: 0 This Week

Last Update: 2013-04-17
21

Segfried

Segfried is an audio feature extraction and segmentation utility

Downloads: 0 This Week

Last Update: 2013-03-27
22

TSSBank

TSSBank is written in C# (.NET 2.0). Its main target group is people with disabilities. The component produces voice and text output (with value/words), plus an experimental voice recognition (VR) system that identifies with more than 80% accuracy without training.

Downloads: 0 This Week

Last Update: 2013-03-20
23

MusiComp

MusiComp is a program whose most important element is an evolutionary algorithm that uses data mining methods as a fitness function to generate monophonic melodies.

Downloads: 0 This Week

Last Update: 2014-06-20
24

evomusic

Evomusic deals with the automatic composition of music (MIDI files) using evolutionary algorithms. We are currently testing multiple approaches for doing this successfully, especially neural networks and/or algorithms based on simple music theory.

1 Review

Downloads: 0 This Week

Last Update: 2013-03-07
25

Kainoa Biometric User Authentication

The purpose of this project is to provide a biometric security solution using voiceprint, fingerprint, and/or facial recognition along with password and/or smart card support, using AES to protect data. Please read the forums if interested.

Downloads: 0 This Week

Last Update: 2015-08-03