Showing 17 open source projects for "audio"

  • 1
    AudioCraft

    AudioCraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. ...
    Downloads: 9 This Week
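The "codec as tokenizer" idea in the AudioCraft description can be illustrated with a toy example. The sketch below is not EnCodec and does not use AudioCraft's API: it swaps in plain uniform scalar quantization (the hypothetical helpers `encode` and `decode`) only to show how a waveform becomes a sequence of discrete integer tokens that a sequence model could consume, and how those tokens map back to approximate samples.

```python
import math
from typing import List


def encode(samples: List[float], levels: int = 256) -> List[int]:
    """Map floats in [-1, 1] to integer tokens in [0, levels - 1].

    Toy uniform quantizer, used here as a stand-in for a learned codec.
    """
    tokens = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip out-of-range samples
        tokens.append(min(levels - 1, int((x + 1.0) / 2.0 * levels)))
    return tokens


def decode(tokens: List[int], levels: int = 256) -> List[float]:
    """Map each token back to the centre of its quantization bin."""
    return [(t + 0.5) / levels * 2.0 - 1.0 for t in tokens]


# 10 ms of a 440 Hz sine wave sampled at 16 kHz (illustrative values).
wave = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
tokens = encode(wave)
reconstructed = decode(tokens)
```

With 256 levels, each reconstructed sample differs from the original by at most half a bin width (1/256). A learned codec aims for far better quality at a much lower token rate, which is what makes token-based generation practical.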
  • 2
    Groq Python

    The official Python Library for the Groq API

    Groq Python is the official Python SDK for the Groq REST API, giving Python developers straightforward access to Groq’s LLM, chat, audio, and other AI services. Through this library, you can call Groq’s models from Python code — for example to request chat completions, code generation, transcription, or any supported endpoint — using idiomatic Python syntax. The SDK handles authentication (via environment variable or parameter), defines proper type-safe request/response data types, and supports both synchronous and asynchronous usage patterns depending on your application needs. ...
    Downloads: 12 This Week
  • 3
    GenAI Processors

    GenAI Processors is a lightweight Python library

    GenAI Processors is a lightweight Python library for building modular, asynchronous, and composable AI pipelines around Gemini. Its central abstraction is the Processor, a unit of work that consumes an asynchronous stream of parts (text, images, audio, JSON) and produces another stream, making it natural to chain operations and keep everything streaming end-to-end. Processors can be composed sequentially (to build multi-step flows) or in parallel (to fan-out work and merge results), which makes sophisticated agent behaviors easy to express with simple operators. The library offers built-in processors for classic turn-based Gemini calls as well as Live API streaming, so you can mix “batch” and real-time interactions in the same graph. ...
    Downloads: 1 This Week
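The stream-in, stream-out composition described for GenAI Processors can be sketched with nothing but the standard library. This is not the library's API: `source`, `upper`, `exclaim`, `chain`, and `parallel` are hypothetical names, parts are plain strings, and the parallel helper buffers its input for simplicity instead of streaming it.

```python
import asyncio
from typing import AsyncIterator, Callable, List

# Hypothetical types: a "part" is a string, and a "processor" maps one
# asynchronous stream of parts to another.
Part = str
Processor = Callable[[AsyncIterator[Part]], AsyncIterator[Part]]


async def source(parts: List[Part]) -> AsyncIterator[Part]:
    """Yield parts one at a time, standing in for a live input stream."""
    for part in parts:
        await asyncio.sleep(0)  # let other tasks run between parts
        yield part


async def upper(stream: AsyncIterator[Part]) -> AsyncIterator[Part]:
    async for part in stream:
        yield part.upper()


async def exclaim(stream: AsyncIterator[Part]) -> AsyncIterator[Part]:
    async for part in stream:
        yield part + "!"


def chain(*processors: Processor) -> Processor:
    """Sequential composition: each processor consumes the previous output."""
    def composed(stream: AsyncIterator[Part]) -> AsyncIterator[Part]:
        for processor in processors:
            stream = processor(stream)
        return stream
    return composed


def parallel(*processors: Processor) -> Processor:
    """Fan out the same input to every processor, then merge the results.

    Simplification: the input is buffered first, and outputs are emitted
    grouped by processor rather than interleaved as they arrive.
    """
    async def composed(stream: AsyncIterator[Part]) -> AsyncIterator[Part]:
        buffered = [part async for part in stream]

        async def run(processor: Processor) -> List[Part]:
            return [out async for out in processor(source(buffered))]

        results = await asyncio.gather(*(run(p) for p in processors))
        for outputs in results:
            for out in outputs:
                yield out
    return composed


async def collect(stream: AsyncIterator[Part]) -> List[Part]:
    return [part async for part in stream]


def run_pipeline(processor: Processor, parts: List[Part]) -> List[Part]:
    """Drive a pipeline to completion and return everything it produced."""
    return asyncio.run(collect(processor(source(parts))))
```

For example, `run_pipeline(chain(upper, exclaim), ["a", "b"])` feeds each part through both steps in order, while `run_pipeline(parallel(upper, exclaim), ["a", "b"])` runs the two steps side by side on the same input.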
  • 4
    pyglet

    pyglet is a cross-platform windowing and multimedia library for Python

    Pyglet is a cross-platform windowing and multimedia library for Python, intended for developing games and other visually rich applications. It supports windowing, input event handling, OpenGL graphics, loading images and videos, and playing sounds and music.
    Downloads: 5 This Week
  • 5
    Multimodal

    TorchMultimodal is a PyTorch library

    ...The library provides modular building blocks such as encoders, fusion modules, loss functions, and transformations that support combining modalities (vision, text, audio, etc.) in unified architectures. It includes a collection of ready model classes—like ALBEF, CLIP, BLIP-2, COCA, FLAVA, MDETR, and Omnivore—that serve as reference implementations you can adopt or adapt. The design emphasizes composability: you can mix and match encoder, fusion, and decoder components rather than starting from monolithic models. ...
    Downloads: 0 This Week
  • 6

    lv2gen

    Generate boilerplate for LV2 plugins

    A Python package for describing audio and synth plugins and generating boilerplate code from that description, including LV2’s manifest.ttl and a simplified GUI for the Mod Dwarf.
    Downloads: 0 This Week
  • 7
    AugLy

    A data augmentations library for audio, image, text, and video

    AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-library. These sub-libraries include both function-based and class-based transforms, composition operators, and have the option to provide metadata about the transform applied, including its intensity. AugLy is a great library to utilize for augmenting your data in model training, or to evaluate the robustness gaps of your model! ...
    Downloads: 0 This Week
  • 8
    SVoice (Speech Voice Separation)

    We provide a PyTorch implementation of the paper Voice Separation

    ...The repository includes all necessary scripts for training, dataset preparation, distributed training, evaluation, and audio separation.
    Downloads: 0 This Week
  • 9
    Spleeter

    Deezer source separation library including pretrained models

    ...It makes it easy to train music source separation models (assuming you have a dataset of isolated sources), and provides pretrained state-of-the-art models for performing various flavours of separation. The 2-stem and 4-stem models achieve state-of-the-art performance on the musdb dataset. Spleeter is also very fast: it can separate audio files into 4 stems 100x faster than real time when run on a GPU. We designed Spleeter so you can use it straight from the command line as well as directly in your own development pipeline as a Python library. It can be installed with Conda or pip, or used with Docker.
    Downloads: 40 This Week
  • 10
    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and...
    Downloads: 1 This Week
  • 11
    GoodByeCatpcha

    Solver ReCaptcha v2 Free

    An async Python library that automates solving ReCAPTCHA v2 image and audio challenges using Mozilla's DeepSpeech, PocketSphinx, and the Microsoft Azure, Google Speech, and Amazon Transcribe speech-to-text APIs. It also uses image recognition to detect the object suggested in the captcha. Built with Pyppeteer (a Chrome automation framework similar to Puppeteer), PyDub for easily converting MP3 files into WAV, aiohttp for a minimal async web server, and Python’s built-in asyncio for convenience.
    Downloads: 4 This Week
  • 12
    aeneas

    Automagically synchronize audio and text (aka forced alignment)

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment). aeneas automatically generates a synchronization map between a list of text fragments and an audio file containing the narration of the text. In computer science this task is known as (automatically computing a) forced alignment.
    Downloads: 7 This Week
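To make "synchronization map" concrete, here is a minimal sketch of the data structure: a list of (start, end, text) entries covering the audio. It does not use aeneas and does not reproduce its alignment method. The hypothetical `naive_sync_map` helper never inspects the audio; it only splits a known duration in proportion to fragment length, which is a crude baseline that a real forced aligner improves on by analysing the recording.

```python
from typing import List, Tuple

SyncEntry = Tuple[float, float, str]  # (start seconds, end seconds, text)


def naive_sync_map(fragments: List[str], audio_duration: float) -> List[SyncEntry]:
    """Assign each text fragment a time span proportional to its length.

    Baseline only: assumes a constant speaking rate and no silences.
    """
    total_chars = sum(len(fragment) for fragment in fragments)
    if total_chars == 0:
        return []

    sync_map = []
    start = 0.0
    for fragment in fragments:
        end = start + audio_duration * len(fragment) / total_chars
        sync_map.append((round(start, 3), round(end, 3), fragment))
        start = end
    return sync_map


if __name__ == "__main__":
    # Illustrative text fragments and duration, not real data.
    for begin, end, text in naive_sync_map(["Hello there.", "How are you today?"], 3.0):
        print(f"{begin:7.3f} {end:7.3f}  {text}")
```

Each entry's end time is the next entry's start time, so the map covers the whole recording without gaps.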
  • 13

    Tygamusic

    A pygame music lib.

    This lib was produced while I was programming another program/game. I was tired of pygame's bad system for handling playlists and its management of music in general. With this lib I want to create a layer that lets you interact with the music the way you would expect. Currently featuring:
    - Playlist
    - Normal pausing and resuming (played time isn’t lost when a new song is loaded)
    - Automatic recognition of songs and adding them to a separate list
    Downloads: 0 This Week
  • 14
    RNNLIB is a recurrent neural network library for sequence learning problems. Applicable to most types of spatiotemporal data, it has proven particularly effective for speech and handwriting recognition. Full installation and usage instructions are given at http://sourceforge.net/p/rnnl/wiki/Home/
    Downloads: 0 This Week
  • 15
    Elucidation is a Python module designed to be an extremely powerful backend for audio and video converters. The aim of the module is to do all the heavy lifting while applications using it are little more than interfaces to it.
    Downloads: 0 This Week
  • 16
    These bindings let you use the FMOD Ex sound library from Python through a nice Python API. You can (or, to tell the truth, will be able to) use any feature you like. We are now at a phase where anyone who wants to help would be appreciated.
    Downloads: 0 This Week
  • 17
    JackPy
    Pure Python bindings for JACK Audio
    Downloads: 0 This Week