Showing 62 open source projects for "audio"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    AudioCraft

    AudioCraft

    Audiocraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. ...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 2
    ComfyUI

    ComfyUI

    The most powerful and modular diffusion model GUI, api and backend

    The most powerful and modular diffusion model is GUI and backend. This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, cli, automated testing, and public documentation. Open source AI models will win in the long run against closed models and we are only at the beginning. Our core mission...
    Downloads: 133 This Week
    Last Update:
    See Project
  • 3
    Groq Python

    Groq Python

    The official Python Library for the Groq API

    Groq Python is the official Python SDK for the Groq REST API, giving Python developers straightforward access to Groq’s LLM, chat, audio, and other AI services. Through this library, you can call Groq’s models from Python code — for example to request chat completions, code generation, transcription, or any supported endpoint — using idiomatic Python syntax. The SDK handles authentication (via environment variable or parameter), defines proper type-safe request/response data types, and supports both synchronous and asynchronous usage patterns depending on your application needs. ...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 4
    GenAI Processors

    GenAI Processors

    GenAI Processors is a lightweight Python library

    GenAI Processors is a lightweight Python library for building modular, asynchronous, and composable AI pipelines around Gemini. Its central abstraction is the Processor, a unit of work that consumes an asynchronous stream of parts (text, images, audio, JSON) and produces another stream, making it natural to chain operations and keep everything streaming end-to-end. Processors can be composed sequentially (to build multi-step flows) or in parallel (to fan-out work and merge results), which makes sophisticated agent behaviors easy to express with simple operators. The library offers built-in processors for classic turn-based Gemini calls as well as Live API streaming, so you can mix “batch” and real-time interactions in the same graph. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 5
    txtai

    txtai

    Build AI-powered semantic search applications

    ...Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid pace, models can understand concepts in documents, audio, images and more. Machine-learning pipelines to run extractive question-answering, zero-shot labeling, transcription, translation, summarization and text extraction. Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes). Applications range from similarity search to complex NLP-driven data extractions to generate structured databases. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 6
    pyglet

    pyglet

    pyglet is a cross-platform windowing and multimedia library for Python

    Pyglet is a cross-platform windowing and multimedia library for Python, intended for developing games and other visually rich applications. It supports windowing, input event handling, OpenGL graphics, loading images and videos, and playing sounds and music.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    DocsGPT

    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    DocsGPT is an open-source AI platform for deploying private RAG pipelines, AI agents, and enterprise search on your own infrastructure. Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models. Works with Qdrant, MongoDB, and Elasticsearch and more. Deploy via Docker or Kubernetes with full data sovereignty. Build embeddable chat and search widgets, automate multi-step workflows with AI agents, and integrate via Slack, Telegram, Discord, or REST API. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    DocArray

    DocArray

    The data structure for multimodal data

    DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API. Door to multimodal world: super-expressive data structure for representing complicated/mixed/nested text, image, video, audio, 3D mesh data. The foundation data structure of Jina, CLIP-as-service, DALL·E Flow, DiscoArt etc. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Multimodal

    Multimodal

    TorchMultimodal is a PyTorch library

    ...The library provides modular building blocks such as encoders, fusion modules, loss functions, and transformations that support combining modalities (vision, text, audio, etc.) in unified architectures. It includes a collection of ready model classes—like ALBEF, CLIP, BLIP-2, COCA, FLAVA, MDETR, and Omnivore—that serve as reference implementations you can adopt or adapt. The design emphasizes composability: you can mix and match encoder, fusion, and decoder components rather than starting from monolithic models. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 10
    Jina

    Jina

    Build cross-modal and multimodal applications on the cloud

    ...Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub. Fast deployment to Kubernetes, Docker Compose and Jina Cloud. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Pybris

    Pybris

    B language compiler written in Python targeting RISVM

    ...The compiler emits a variant of Bitmario RISVM assembly. The practical goal of the project is to provide a way to develop digital signal processing (DSP) effects for the Competent Audio library that is a friendlier alternative to writing RISVM assembly by hand. Pybris is written for Python 2.7, but has also been tested to run with Python 3.8.10.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    PyExe - YT DL Mk42 (b) [I.S.A]

    PyExe - YT DL Mk42 (b) [I.S.A]

    PyExe - YouTube Downloader Mark 42 type-B [I.S.A]

    'PyExe - YT DL Mk42 (b)' is an desktop application developed using python 3.6.8 and other add-on libaries. Can download YouTube videos and audios. 'PyExe - YT DL Mk42 (b)' has two parts: 1) Download Video - downloads YouTube video (.mp4) 2) Download Audio - downloads YouTube video (.mp3)
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    ES-DOS A
    ES-DOS is a application for windows that looks like MS-DOS but is not a OS
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Mopidy

    Mopidy

    Mopidy is an extensible music server written in Python

    ...With Mopidy's extension support, you can easily add backends for new music sources. Mopidy is a Python application that runs in a terminal or in the background on Linux computers or Macs that have network connectivity and audio output. Out of the box, Mopidy is an HTTP server. If you install the Mopidy-MPD extension, it becomes an MPD server too. Many additional frontends for controlling Mopidy are available as extensions. You and the people around you can all connect their favorite MPD or web client to the Mopidy server to search for music and manage the playlist together.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    PyExe - YT DL I.S.A

    PyExe - YT DL I.S.A

    PyExe - YT DL [Improved.Simplified.Alternative]

    'PyExe - YT DL' is an desktop application developed using python 3.6.8 and other add-on libaries. Can download YouTube videos and audios. 'PyExe - YT DL' has two parts: 1) Download Video - downloads YouTube video (.mp4) 2) Download Audio - downloads YouTube video (.mp3) Compatible only for windows OS.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Txt-2-Mp3  6.3 Mark 2 [I.S.A]

    Txt-2-Mp3 6.3 Mark 2 [I.S.A]

    Txt-2-Mp3 6.3 Mark 2 [Improved.Simplified.Alternative]

    'Txt2Mp3' an desktop application developed using python 3.6.8 and other add-on libaries. Can convert texts into audio (.mp3) files using gTTS (Google Text-to-speech) api module library. Compatible only for windows OS.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    Debreate - Debian Package Builder

    Debreate - Debian Package Builder

    A utility for creating Debian packages (.deb)

    ...Currently it only supports binary packaging (note that the term "binary package" is used loosely, as such packages can contain scripts & non-code items such as media images, audio, & more) for personal distribution. Plans for using backends such as dh_make & debuild for creating source packages are in the works. But source packaging can be quite different & is a must if you want to get your packages into a distribution's official repositories or a Launchpad Personal Package Archive (PPA). The latter from which Debreate is available.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18

    lv2gen

    Generate boilerplate for LV2 plugins

    A python package for describing audio and synth plugins and generating boilerplate code from that description, including LV2’s manifest.ttl and a simplified GUI for the`Mod Dwarf.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19

    YTD-Downloader

    YTD-Downloader is a GUI-based Desktop Application.

    YTD-Downloader is a GUI-based Desktop Application. Users can download audio and video from YouTube using this software.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    AugLy

    AugLy

    A data augmentations library for audio, image, text, and video

    AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-library. These sub-libraries include both function-based and class-based transforms, composition operators, and have the option to provide metadata about the transform applied, including its intensity. AugLy is a great library to utilize for augmenting your data in model training, or to evaluate the robustness gaps of your model! ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    SVoice (Speech Voice Separation)

    SVoice (Speech Voice Separation)

    We provide a PyTorch implementation of the paper Voice Separation

    ...The repository includes all necessary scripts for training, dataset preparation, distributed training, evaluation, and audio separation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Spleeter

    Spleeter

    Deezer source separation library including pretrained models

    ...It makes it easy to train music source separation models (assuming you have a dataset of isolated sources), and provides already trained state of the art models for performing various flavours of separation. 2 stems and 4 stems models have state of the art performances on the musdb dataset. Spleeter is also very fast as it can perform separation of audio files to 4 stems 100x faster than real-time when run on a GPU. We designed Spleeter so you can use it straight from command line as well as directly in your own development pipeline as a Python library. It can be installed with Conda, with pip or be used with Docker.
    Downloads: 40 This Week
    Last Update:
    See Project
  • 23
    Tensor2Tensor

    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    GoodByeCatpcha

    GoodByeCatpcha

    Solver ReCaptcha v2 Free

    An async Python library to automate solving ReCAPTCHA v2 by images/audio using Mozilla's DeepSpeech, PocketSphinx, Microsoft Azure’s, Google Speech and Amazon's Transcribe Speech-to-Text API. Also image recognition to detect the object suggested in the captcha. Built with Pyppeteer for Chrome automation framework and similarities to Puppeteer, PyDub for easily converting MP3 files into WAV, aiohttp for async minimalistic web-server, and Python’s built-in AsyncIO for convenience.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 25
    GUIDOLib
    The GUIDOLib provides a powerful engine for the graphic rendering of music scores, based on the Guido Music Notation format. It supports Linux, Mac OS X, Windows, Android and iOS operating systems. A Java JNI interface is available as well as a Javascript version of the library. A Web API has also been designed, allowing to deploy the engine as a Web service.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB