Showing 15 open source projects for "ai audio"

  • 1
    Google AI Edge Gallery

    A gallery that showcases on-device ML/GenAI use cases

    Gallery is a curated collection of on-device machine learning examples, demo apps, and model artifacts designed to help developers experiment with and deploy ML at the edge. The project bundles runnable samples that show how to run TensorFlow Lite/Edge TPU models (and similar lightweight runtimes) on mobile and embedded platforms, demonstrating common tasks like image classification, object detection, audio recognition, and pose estimation. Each sample is intended to be both a learning aid...
    Downloads: 669 This Week
  • 2
    Pedalboard

    A Python library for audio

    ...Internally at Spotify, pedalboard is used for data augmentation to improve machine learning models and to help power features like Spotify’s AI DJ and AI Voice Translation. pedalboard also helps in the process of content creation, making it possible to add effects to audio without using a Digital Audio Workstation.
    Downloads: 0 This Week
  • 3
    Open Notebook

    An Open Source implementation of Notebook LM with more flexibility

    Open Notebook is an open-source, privacy-focused alternative to Google’s Notebook LM that gives users full control over their research and AI workflows. Designed to be self-hosted, it ensures complete data sovereignty by keeping your content local or within your own infrastructure. The platform supports 16+ AI providers—including OpenAI, Anthropic, Ollama, Google, and LM Studio—allowing flexible model choice and cost optimization. Open Notebook enables users to organize and analyze multi-modal content such as PDFs, videos, audio files, web pages, and Office documents. ...
    Downloads: 25 This Week
  • 4
    Jina-Serve

    Build multimodal AI applications with cloud-native stack

    ...Jina Serve focuses on making it easier to turn machine learning models into production-ready services without forcing developers to manage complex infrastructure manually. The framework supports many major machine learning libraries and data types, making it suitable for multimodal AI systems that process text, images, audio, and other inputs.
    Downloads: 0 This Week
  • 5
    AudioMuse-AI is an open-source, Dockerized environment that brings automatic playlist generation to your self-hosted music library. Using tools such as Librosa and ONNX, it performs sonic analysis on your audio files locally, allowing you to curate playlists for any mood or occasion without relying on external APIs. Deploy it easily on your local machine with Docker Compose or Podman, or scale it in a Kubernetes cluster (supports AMD64 and ARM64).
    Downloads: 10 This Week
  • 6
    SimpleTuner

    A general fine-tuning kit geared toward image/video/audio diffusion

    SimpleTuner is an open-source toolkit designed to simplify the fine-tuning of modern diffusion models for generating images, video, and audio. The project focuses on providing a clear and understandable training environment for researchers, developers, and artists who want to customize generative AI models without navigating complex machine learning pipelines. It supports fine-tuning workflows for models such as Stable Diffusion variants and other diffusion architectures, enabling users to adapt pretrained models to specialized datasets or creative tasks. ...
    Downloads: 0 This Week
  • 7
    Triton Inference Server

    The Triton Inference Server provides an optimized cloud

    Triton Inference Server is open-source inference serving software that streamlines AI inferencing. Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference across cloud, data center, edge, and embedded devices on NVIDIA GPUs, x86 and ARM CPUs, or AWS Inferentia.
    Downloads: 4 This Week
  • 8
    MuseGAN

    An AI for Music Generation

    MuseGAN is a deep learning research project designed to generate symbolic music using generative adversarial networks. The system focuses specifically on generating multi-track polyphonic music, meaning that it can simultaneously produce multiple instrument parts such as drums, bass, piano, guitar, and strings. Instead of generating raw audio, the model operates on piano-roll representations of music, which encode notes as time-pitch matrices for each instrument track. This representation...
    Downloads: 2 This Week
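The piano-roll representation mentioned in the MuseGAN description can be illustrated with a small sketch. This is not MuseGAN's code: the note tuple format `(track, start_step, duration_steps, midi_pitch)`, the function name `build_piano_rolls`, and the use of binary 0/1 cells over the 128 MIDI pitches are all assumptions made for illustration.

```python
# Minimal sketch of a multi-track piano roll: one time-pitch matrix per track.
# The note format and the binary on/off encoding are illustrative assumptions.
NUM_PITCHES = 128  # MIDI pitch numbers 0..127


def build_piano_rolls(notes, num_steps):
    """Return {track: matrix} where matrix[t][p] == 1 if pitch p sounds at step t."""
    rolls = {}
    for track, start, duration, pitch in notes:
        # Create an all-silent matrix the first time a track appears.
        roll = rolls.setdefault(
            track, [[0] * NUM_PITCHES for _ in range(num_steps)]
        )
        # Mark every time step the note covers, clipped to the roll length.
        for t in range(start, min(start + duration, num_steps)):
            roll[t][pitch] = 1
    return rolls


# Hypothetical input: (track, start_step, duration_steps, midi_pitch)
notes = [
    ("bass", 0, 4, 36),
    ("piano", 0, 2, 60),
    ("piano", 0, 2, 64),  # sounds together with pitch 60 (polyphony)
    ("piano", 2, 2, 67),
]
rolls = build_piano_rolls(notes, num_steps=4)
```

Each track gets its own matrix, so several instruments can be represented side by side, and several pitches can be active in the same time step within one track.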
  • 9
    DeepDetect

    Deep Learning API and Server in C++14 support for Caffe, PyTorch

    The core idea is to remove the error sources and difficulties of Deep Learning applications by providing a safe haven of commoditized practices, all available as a single core. While the Open Source Deep Learning Server is the core element, with REST API, and multi-platform support that allows training & inference everywhere, the Deep Learning Platform allows higher level management for training neural network models and using them as if they were simple code snippets. Ready for applications...
    Downloads: 1 This Week
  • 10
    Jina

    Build cross-modal and multimodal applications on the cloud

    ...Improved engineering efficiency thanks to the Jina AI ecosystem, so you can focus on innovating with the data applications you build.
    Downloads: 0 This Week
  • 11
    Audio AI Timeline

    A timeline of the latest AI models for audio generation

    Audio AI Timeline is a curated project that organizes the development of audio-related artificial intelligence into a structured and accessible historical timeline. Rather than functioning as a model training framework, it serves as an informational resource that maps key papers, systems, models, datasets, and milestones across areas such as speech synthesis, music generation, audio understanding, source separation, and general audio machine learning. ...
    Downloads: 0 This Week
  • 12
    hora

    Efficient approximate nearest neighbor search algorithm collections

    hora is an open-source high-performance vector similarity search library designed for large-scale machine learning and information retrieval systems. The project focuses on approximate nearest neighbor search, a fundamental technique used in modern AI applications such as recommendation systems, image search, and semantic search engines. Hora implements multiple efficient indexing algorithms that allow systems to rapidly search through high-dimensional vectors produced by machine learning models. These vectors are commonly generated by neural networks to represent images, text, audio, or other data types in a mathematical embedding space. ...
    Downloads: 0 This Week
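To make the idea of searching an embedding space concrete, here is a minimal sketch of the exact, brute-force nearest neighbor search that approximate indexes trade accuracy against for speed. It does not use hora's API and does not implement any of its indexing algorithms; the function names, the cosine similarity metric, and the toy three-dimensional embeddings are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, vectors, k=2):
    """Exact search: score every stored vector and return the k best keys."""
    scored = [(cosine_similarity(query, v), key) for key, v in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical embeddings, e.g. produced by a neural network for audio clips.
embeddings = {
    "clip_a": [0.9, 0.1, 0.0],
    "clip_b": [0.8, 0.2, 0.1],
    "clip_c": [0.0, 0.1, 0.9],
}
result = nearest_neighbors([1.0, 0.0, 0.0], embeddings, k=2)
```

This exact scan compares the query against every stored vector, so its cost grows linearly with the collection size. Approximate nearest neighbor indexes aim to avoid that full scan at the price of occasionally missing a true neighbor.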
  • 13
    wav2letter++

    Facebook AI Research's automatic speech recognition toolkit

    First, install Flashlight (using the 0.3 branch is required) with the ASR application. This repository includes recipes to reproduce the following research papers as well as pre-trained models. All results reproduction must use Flashlight <= 0.3.2 for exact reproducibility. At least one of LZMA, BZip2, or Z is required for LM compression with KenLM. It is highly recommended to build KenLM with position-independent code (-fPIC) enabled, to enable python compatibility. After installing, run...
    Downloads: 0 This Week
  • 14
    openSMILE (SMILE = Speech & Music Interpretation by Large Space Extraction) is a fast, real-time (audio) feature extraction utility for automatic speech, music, and paralinguistic recognition research. It was originally developed at TUM within the EU project SEMAINE and is now maintained and supported by audEERING.
    Downloads: 9 This Week
  • 15
    openEAR is the Munich Open-Source Emotion and Affect Recognition Toolkit developed at the Technische Universität München (TUM). It provides efficient (audio) feature extraction algorithms implemented in C++, classifiers, and pre-trained models on well-known emotion databases. It is now maintained and supported by audEERING. Updates will follow soon.
    Downloads: 7 This Week