Showing 18 open source projects for "ai audio"

  • 1
    ComfyUI

    The most powerful and modular diffusion model GUI, API and backend

    The most powerful and modular diffusion model GUI and backend. This UI lets you design and execute advanced Stable Diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating on and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, CLI, automated testing, and public documentation. Open source AI models will win in the long run against closed models and we are only at the beginning. Our core mission...
    Downloads: 224 This Week
  • 2
    LLM Tornado

    The .NET library to build AI agents with 30+ built-in connectors

    ...It supports multimodal inputs and outputs, including text, images, audio, and documents, making it suitable for a wide range of AI applications. LLMTornado also integrates advanced protocols such as Model Context Protocol and agent-to-agent communication, enabling complex interactions between systems. It includes built-in support for local deployments, enterprise guardrails, and observability features.
    Downloads: 13 This Week
  • 3
    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    DocsGPT is an open-source AI platform for deploying private RAG pipelines, AI agents, and enterprise search on your own infrastructure. Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models.
    Downloads: 4 This Week
  • 4
    Groq TypeScript / Node.s

    The official Node.js / TypeScript library for the Groq API

    ...With this SDK, developers can call Groq’s models, transcribe audio, perform file uploads — all with minimal boilerplate — which streamlines creation of AI-enabled applications in the JavaScript/TypeScript ecosystem.
    Downloads: 0 This Week
  • 5
    BotSharp

    AI Multi-Agent Framework in .NET

    Conversation as a platform (CaaP) is the future, so BotSharp offers .NET developers a complete toolkit for building a CaaP with the BotSharp AI Bot Platform Builder. It opens up as much learning power as possible for your own robots and lets you precisely control every step of the AI processing pipeline. BotSharp is an open source machine learning framework for building AI bot platforms. The project involves natural language understanding, computer vision, and audio processing technologies, and aims to promote the development and application of intelligent robot assistants in information systems. ...
    Downloads: 0 This Week
  • 6
    Groq Python

    The official Python Library for the Groq API

    Groq Python is the official Python SDK for the Groq REST API, giving Python developers straightforward access to Groq’s LLM, chat, audio, and other AI services. Through this library, you can call Groq’s models from Python code — for example to request chat completions, code generation, transcription, or any supported endpoint — using idiomatic Python syntax. The SDK handles authentication (via environment variable or parameter), defines proper type-safe request/response data types, and supports both synchronous and asynchronous usage patterns depending on your application needs. ...
    Downloads: 0 This Week
  • 7
    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings).
    Downloads: 2 This Week
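
The idea of matching by meaning instead of keywords can be illustrated with a minimal sketch. It does not use txtai's API: the three-dimensional vectors below are hand-made stand-ins for embeddings that a real model would produce, and `cosine_similarity` and `search` are hypothetical helper names chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Assumption: toy 3-d "embeddings" written by hand for illustration only.
# A real system would compute these with a machine learning model.
DOCUMENTS = {
    "How to fix a flat bicycle tire": [0.9, 0.1, 0.0],
    "Best pasta recipes for dinner": [0.0, 0.2, 0.9],
    "Repairing a punctured bike wheel": [0.8, 0.3, 0.1],
}


def search(query_vector, documents, limit=2):
    """Return the `limit` document texts whose vectors are closest to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in documents.items()
    ]
    scored.sort(reverse=True)
    return [text for _, text in scored[:limit]]


# Assumption: this vector stands in for the embedding of a bicycle-repair query.
results = search([0.85, 0.2, 0.05], DOCUMENTS)
print(results)
```

Both bicycle-repair documents rank above the pasta document even though their titles share almost no keywords, because their vectors point in nearly the same direction as the query vector.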
  • 8
    GenAI Processors

    GenAI Processors is a lightweight Python library

    GenAI Processors is a lightweight Python library for building modular, asynchronous, and composable AI pipelines around Gemini. Its central abstraction is the Processor, a unit of work that consumes an asynchronous stream of parts (text, images, audio, JSON) and produces another stream, making it natural to chain operations and keep everything streaming end-to-end. Processors can be composed sequentially (to build multi-step flows) or in parallel (to fan-out work and merge results), which makes sophisticated agent behaviors easy to express with simple operators. ...
    Downloads: 0 This Week
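
The stream-in, stream-out pattern described above can be sketched with plain `asyncio` async generators. This is not the GenAI Processors API: `source`, `uppercase`, `exclaim`, `chain`, and `fan_out` are hypothetical names invented for this illustration, and the parts are plain strings instead of multimodal content.

```python
import asyncio


async def source(parts):
    """Turn a list into an asynchronous stream of parts."""
    for part in parts:
        yield part


async def uppercase(stream):
    """Processor: consume a stream of strings, emit them upper-cased."""
    async for part in stream:
        yield part.upper()


async def exclaim(stream):
    """Processor: consume a stream of strings, emit them with a trailing '!'."""
    async for part in stream:
        yield part + "!"


def chain(*processors):
    """Sequential composition: the output of each processor feeds the next."""
    def combined(stream):
        for processor in processors:
            stream = processor(stream)
        return stream
    return combined


async def collect(stream):
    """Drain a stream into a list."""
    return [part async for part in stream]


async def fan_out(parts, *processors):
    """Parallel composition: run each processor on its own copy of the input
    concurrently, then merge the branch outputs in processor order.
    Simplification: each branch is buffered before merging."""
    branches = await asyncio.gather(
        *(collect(processor(source(parts))) for processor in processors)
    )
    return [part for branch in branches for part in branch]


async def main():
    pipeline = chain(uppercase, exclaim)
    sequential = await collect(pipeline(source(["hello", "audio"])))
    parallel = await fan_out(["hi"], uppercase, exclaim)
    return sequential, parallel


sequential, parallel = asyncio.run(main())
print(sequential)  # ['HELLO!', 'AUDIO!']
print(parallel)    # ['HI', 'hi!']
```

Because every processor has the same shape (a stream in, a stream out), sequential and parallel composition reduce to small generic helpers.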
  • 9
    Cookbook (Google Gemini)

    Examples and guides for using the Gemini API

    The Gemini Cookbook is an official repository of examples and guides for using Google’s Gemini API. It provides a structured learning path with quick-start tutorials for beginners and practical examples for advanced users. The repository covers a wide range of Gemini capabilities, including text, images, video, speech, robotics, and multimodal interactions. It highlights newly introduced features such as Gemini 2.5 models (Flash and Pro), Gemini’s native image generation, Veo for video...
    Downloads: 5 This Week
  • 10
    Jina

    Build cross-modal and multimodal applications on the cloud

    ...Improved engineering efficiency thanks to the Jina AI ecosystem, so you can focus on innovating with the data applications you build.
    Downloads: 0 This Week
  • 11

    DuranDuranbot

    Teachable/trainable artificially intelligent music bot

    A teachable/trainable artificially intelligent music bot fundamentally inspired by how the new wave band Duran Duran composes music. The program uses many algorithmic and AI techniques, including machine learning, which let you train it to compose the music you prefer. The foundation of its design is a technique called "bit by bit circular composition", directly inspired by how Duran Duran writes music, and it's...
    Downloads: 0 This Week
  • 12
    AugLy

    A data augmentations library for audio, image, text, and video

    AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-library. These sub-libraries include both function-based and class-based transforms, composition operators, and have the option to provide metadata about the transform applied, including its intensity. AugLy is a great library to utilize for augmenting your data in model training, or to evaluate...
    Downloads: 0 This Week
  • 13

    SynthePG

    Live music compositor with learning based on TPG

    This Android app produces music and sheet music in real time. The AI module can be activated and trained to learn your preferred style and play it. You can also play music together with others using the interconnect mode. Finally, move your phone like a conductor ('chef d'orchestre') and see how it changes the music...
    Downloads: 0 This Week
  • 14
    BayesianCortex

    Simple algorithm for a real-time interactive visual cortex for painting

    ...In this early version, I'm still working on edge detection and its understanding of the same shapes at different brightnesses. This will be a module of the bigger Human AI Net project and will be used for adding realtime intuitive high dimensional intelligence in audio and visual interactions with the user.
    Downloads: 0 This Week
  • 15
    BlueWar is a 3D multiplayer real-time strategy (RTS) game that features futuristic combat on the surfaces of various planets in the universe.
    Downloads: 0 This Week
  • 16
    Framework for Ufo Clones (FUC) is a cross-platform library for constructing multiplayer isometric team warfare games. The platform lets developers skip tedious programming and focus on, e.g., audio, graphics, AI, and game dynamics.
    Downloads: 0 This Week
  • 17
    A simple yet powerful framework built around the OGRE rendering system. Many features are in the works, such as AI, audio, an entity system (much like CEL), and a complete level editor.
    Downloads: 0 This Week
  • 18
    A flexible, modular game engine that can make use of different components (internally developed or existing external libraries) to implement subsystems such as graphics rendering, AI, audio or physics.
    Downloads: 0 This Week