Showing 19 open source projects for "audio samples"

  • 1
    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    Step-Audio-EditX is an open-source, 3-billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, the model uses a dual-codebook tokenizer to convert speech into a sequence of discrete “audio tokens” that combine a linguistic token stream with a semantic (prosody/emotion/style) token stream, which abstracts audio editing into high-level token operations. ...
    Downloads: 0 This Week
  • 2
    ElevenLabs Python

    The official Python SDK for the ElevenLabs API

    elevenlabs-python is the official Python SDK for the ElevenLabs API, giving developers a convenient way to access ElevenLabs’ high-quality, lifelike voices. The library wraps the HTTP API into a typed Python client, so you can perform text-to-speech, streaming, voice cloning, voice management, and agents-related operations with simple method calls. It exposes ElevenLabs’ main models such as Eleven Multilingual v2, Eleven Flash v2.5, and Eleven Turbo v2.5, each targeting different trade-offs...
    Downloads: 3 This Week
  • 3
    OuteTTS

    Interface for OuteTTS models

    OuteTTS is an interface library for running OuteTTS text-to-speech models across a range of backends, making it easier to deploy the same model on different hardware and runtimes. It provides a high-level Interface API that wraps model configuration, speaker handling, and audio generation so you can focus on integrating speech into your application rather than wiring up low-level engines. The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face...
    Downloads: 0 This Week
  • 4
    VibeVoice ComfyUI

    ComfyUI integration for Microsoft's VibeVoice text-to-speech model

    VibeVoice ComfyUI is a comprehensive wrapper that integrates Microsoft’s VibeVoice text-to-speech models directly into ComfyUI workflows. It exposes VibeVoice as a set of custom nodes so you can build single-speaker and multi-speaker voice generation pipelines visually, combining TTS with other audio or generative components. The integration supports high-quality single-speaker synthesis as well as scripted multi-speaker conversations, with optional voice cloning from audio samples for each speaker. It includes advanced control over generation parameters like attention backend, diffusion steps, sampling temperature, guidance scale, and quantization settings, allowing users to tune the trade-offs between quality, VRAM usage, and speed. ...
    Downloads: 2 This Week
  • 5
    Generative AI

    Sample code and notebooks for Generative AI on Google Cloud

    Generative AI is a comprehensive collection of code samples, notebooks, and demo applications designed to help developers build generative-AI workflows on the Vertex AI platform. It spans multiple modalities—text, image, audio, search (RAG/grounding) and more—showing how to integrate foundation models like the Gemini family into cloud projects. The README emphasises getting started with prompts, datasets, environments and sample apps, making it ideal for both experimentation and production-ready usage. ...
    Downloads: 0 This Week
  • 6
    StreamSpeech

    StreamSpeech is a seamless model for offline speech recognition

    StreamSpeech is an “all-in-one” speech model designed to perform offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Developed as part of an ACL 2024 paper, it targets streaming and low-latency scenarios where intermediate results and final translations or synthetic speech must be produced continuously as audio is being received. The model supports eight tasks: offline ASR, speech-to-text translation, speech-to-speech...
    Downloads: 0 This Week
  • 7
    AudiooPy

    Audio manager in Python Object-Oriented Programming

    AudiooPy stands for "Audio Manager in Python Object-Oriented Programming." This library provides a range of useful operations for sound files and audio fragments. It processes audio at the frame level, working with signed integer samples of 8, 16, or 32 bits, stored in byte-like objects. Key features include reading and writing WAV files using Python's standard library.
    Downloads: 2 This Week
  • 8
    SVoice (Speech Voice Separation)

    We provide a PyTorch implementation of the paper Voice Separation

    SVoice is a PyTorch-based implementation of Facebook Research’s study on speaker voice separation as described in the paper “Voice Separation with an Unknown Number of Multiple Speakers.” This project presents a deep learning framework capable of separating mixed audio sequences where several people speak simultaneously, without prior knowledge of how many speakers are present. The model employs gated neural networks with recurrent processing blocks that disentangle voices over multiple...
    Downloads: 2 This Week
  • 9

    pydatascope

    Software oscilloscope using Python and tkinter

    Software oscilloscope using Python and tkinter. Supports multiple sources: socket, file, audio, USB. Displays data by samples, time or frequency. Scales the input automatically or manually.
    Downloads: 1 This Week
  • 10
    Deepvoice3_pytorch

    PyTorch implementation of convolutional neural networks

    An open source implementation of Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning.
    Downloads: 0 This Week
  • 11
    DC-TTS

    TensorFlow Implementation of DC-TTS: yet another text-to-speech model

    DC-TTS is a TensorFlow implementation of the DC-TTS architecture, a fully convolutional text-to-speech system designed to be efficiently trainable while producing natural speech. It follows the “Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention” paper, but the author adapts and extends the design to make it practical for real experiments. The model is split into two networks: Text2Mel, which maps text to mel-spectrograms, and SSRN...
    Downloads: 0 This Week
  • 12
    Pyo Synth

    A GUI to help with pyo synthesizer scripts manipulation.

    Pyo Synth is an open source application that makes pyo scripts easier to manipulate by letting you control them with a MIDI keyboard. The interface lets you set up every control on your keyboard and link it to parameters in your script at runtime. You can also save your progress directly in the pyo script. See the manual for more details on features.
    Downloads: 0 This Week
  • 13

    Distant Speech Recognition

    Beamforming and Speech Recognition Toolkit

    BTK contains C++ and Python libraries that implement speech processing and microphone array techniques such as speech feature extraction, speech enhancement, speaker tracking, beamforming, dereverberation and echo cancellation algorithms. The Millennium ASR provides C++ and Python libraries for automatic speech recognition. It implements a weighted finite state transducer (WFST) decoder as well as training and adaptation methods. These toolkits are meant for facilitating research and...
    Downloads: 2 This Week
  • 14
    Nautic Radio Broadcast Console

    Internet radio broadcast console / darkice frontend

    Nautic Radio Broadcast Console is a frontend for darkice (a shoutcast / icecast source client available at http://darkice.org/ ). It lets you play jingles (audio samples), capture webcam images to send to an FTP server, chat with listeners (through an external webchat site), and monitor your audio and system (CPU, network). The program is developed, and used in production, for www.nauticradio.net / www.beatsnbreaks.nl and is targeted at relatively old computers running Linux. However, it has not been tested on many platforms and is not guaranteed to be stable. ...
    Downloads: 1 This Week
  • 15
    superboucle

    Loop based software with jack transport, record and midi controllable

    SuperBoucle is loop-based software that is fully controllable with any MIDI device. SuperBoucle is also synced with JACK transport. You can use it for live performance or for composition. SuperBoucle is composed of a matrix of samples controllable with an external MIDI device such as a pad controller. SuperBoucle sends information back to the MIDI device (lighting up LEDs). Samples always start and stop on a beat or group of beats. You can adjust the duration of a sample (loop period) and its offset, both in beats. But you...
    Downloads: 0 This Week
  • 16

    pyscope

    Software oscilloscope using Python and tkinter

    Software oscilloscope using Python and tkinter. Supports multiple sources: socket, file, audio, USB. Displays data by samples, time or frequency. Scales the input automatically or manually. It has been renamed "pydatascope" to avoid a name clash with Pyscope, a scoping package on PyPI. See https://sourceforge.net/p/pydatascope/ for the latest code.
    Downloads: 1 This Week
  • 17
    slurry is a simple Python program that plays sounds at random. It is being created primarily for an experimental film screening in June 2010 and will continue to be developed after that.
    Downloads: 0 This Week
  • 18
    Application for the "clipping" of wave files into smaller files. Intended for resizing and creating audio samples.
    Downloads: 0 This Week
  • 19
    The PKSampler is a live-dj tool. It is different from other "live DJ tools" in that it focuses on allowing the user to mix lots of samples at once. The focus is on a simple touchscreen interface that allows quick access to user supplied loops and sampl
    Downloads: 0 This Week
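
Several of the projects listed above work on WAV data at the frame level (AudiooPy) or cut wave files into shorter clips (entry 18). As a rough illustration of that idea only, and not code taken from any of these projects, the sketch below uses Python's standard `wave` module to copy a time range from one WAV file into another. The function name `clip_wav` and its parameters are hypothetical.

```python
import wave


def clip_wav(src, dst, start_s, end_s):
    """Copy the frames between start_s and end_s (in seconds) from src to dst.

    src and dst may be file paths or binary file-like objects.
    Returns the number of frames written.
    """
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        rate = params.framerate
        # Convert the time range to frame indices, clamped to the file length.
        first = min(max(0, int(start_s * rate)), params.nframes)
        count = max(0, int(end_s * rate) - first)
        reader.setpos(first)
        frames = reader.readframes(count)  # raw bytes, may be shorter than requested

    with wave.open(dst, "wb") as writer:
        # Keep the source layout: channel count, sample width in bytes, frame rate.
        writer.setnchannels(params.nchannels)
        writer.setsampwidth(params.sampwidth)
        writer.setframerate(params.framerate)
        writer.writeframes(frames)

    # One frame holds one sample per channel.
    return len(frames) // (params.nchannels * params.sampwidth)
```

A call such as `clip_wav("in.wav", "out.wav", 0.25, 0.75)` (file names assumed) would write the half-second stretch starting a quarter second into the input.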