Showing 23 open source projects for "real time"

  • 1
    LiveAvatar

    Streaming Real-time Audio-Driven Avatar Generation

    LiveAvatar is an open-source research and implementation project that provides a unified framework for real-time, streaming, interactive avatar video generation driven by audio and other control signals. It implements techniques from state-of-the-art diffusion-based avatar modeling to support infinite-length continuous video generation with low latency, enabling interactive AI avatars that maintain continuity and realism over extended sessions.
    Downloads: 0 This Week
  • 2
    PersonaPlex

    PersonaPlex code

    PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional AI assistants typically lack. ...
    Downloads: 0 This Week
  • 3
    Speakr

    Speakr is a personal, self-hosted web application

    Speakr is an open-source, real-time text-to-speech (TTS) web application that allows users to convert written text into natural-sounding speech in just a few clicks. It provides a clean, user-friendly interface where users can input text, choose a voice style or language, and immediately hear the output, making it ideal for accessibility, content creation, and learning applications.
    Downloads: 0 This Week
  • 4
    VCClient

    Software that uses AI to perform real-time voice conversion

    VCClient is a real-time voice conversion system that uses machine learning models to transform a speaker’s voice into another voice with minimal latency. It is designed for live applications such as streaming, gaming, and virtual communication, where immediate feedback is essential. The system supports multiple voice conversion models, including RVC and other neural network-based approaches, allowing users to switch between different voices or customize their output.
    Downloads: 23 This Week
  • 5
    Moshi

    A speech-text foundation model for real time dialogue

    Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio, down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, in a fully streaming manner (latency of 80ms, the frame size), yet performs better than existing, non-streaming, codecs like SpeechTokenizer (50 Hz, 4kbps), or SemantiCodec (50 Hz, 1.3kbps). Moshi models two streams of audio: one corresponds to Moshi, and...
    Downloads: 0 This Week
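
The Mimi figures quoted in the description above are related by simple arithmetic, which the following minimal sketch works through. The function name `codec_frame_stats` is hypothetical and not part of Moshi; the sketch also assumes that 1.1 kbps means 1100 bits per second.

```python
def codec_frame_stats(sample_rate_hz, frame_rate_hz, bitrate_bps):
    """Derive per-frame quantities for a streaming audio codec.

    Hypothetical helper for illustration only; it is not part of Moshi or Mimi.
    """
    return {
        # Duration of one codec frame in milliseconds.
        "frame_ms": 1000 / frame_rate_hz,
        # Number of input audio samples summarized by one frame.
        "samples_per_frame": sample_rate_hz / frame_rate_hz,
        # Bit budget available to encode one frame.
        "bits_per_frame": bitrate_bps / frame_rate_hz,
    }


# Figures from the description: 24 kHz audio, 12.5 Hz frames, 1.1 kbps
# (assumed here to mean 1100 bits per second).
mimi = codec_frame_stats(24_000, 12.5, 1_100)
print(mimi)
```

Under these assumptions, a 12.5 Hz frame rate gives a frame length of 80 ms, which matches the latency quoted in the description, and each frame covers 1920 input samples with a budget of 88 bits.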
  • 6
    OpenPiano — Virtual Piano for Windows

    Desktop piano playable with a PC keyboard, mouse, or MIDI device.

    OpenPiano is a Windows desktop piano application that allows you to play, practice, and record music using your PC keyboard, mouse, or a MIDI device. It supports real-time playback using SoundFonts and provides on-screen piano layouts for visual feedback while playing. OpenPiano is designed to run entirely locally. It does not require accounts, cloud services, or an internet connection for core functionality.

    Project links:
    Website: https://www.justagwas.com/projects/openpiano
    GitHub: https://github.com/Justagwas/openpiano
    Documentation: https://github.com/Justagwas/openpiano/wiki

    The application is fully open source. ...
    Downloads: 128 This Week
  • 7
    SonicDive-8D-Music-Player

    SonicDive 8D Music Player v-1.0

    SonicDive is an immersive audio visualization & effects-based music player designed to deliver a next-level listening experience. It combines dynamic spectrums with advanced spatial audio effects like 3D & 8D sound.

    ✨ Features

    🎵 Audio Visual Spectrums
    SonicDive supports multiple real-time audio visualizations: 💿 Disk Spectrum, 📊 Bars Spectrum, 🌊 Wave Spectrum, 🖼️ Thumbnail Spectrum, ⭕ Circle Spectrum. Each spectrum reacts dynamically to the music’s frequency and intensity.

    🎚️ Audio Effects & Modes
    Choose from a variety of sound profiles to match your mood: 🔊 Flat, 🎧 3D Audio, 🎧 8D Audio, 🎤 Hip-Hop, 🎻 Classic, 🎸 Rock, 🎥 Dolby Effect.
    Downloads: 3 This Week
  • 8
    MagicBox Player
    Magic Box 🎶: The Open-Source Multimedia Player

    Magic Box is a versatile, custom-built media player for desktop environments, blending a classic interface with powerful, modern features. Developed in Python with PyQt5, it supports a wide range of audio and video formats.

    Key Features:
    Dynamic Visualizer: Features a real-time, custom FFT audio spectrum visualizer that monitors system loopback audio, providing vibrant, data-driven feedback (requires manual loopback setup like Stereo Mix/PulseAudio).
    IPTV/Streaming Ready: Easily load and manage M3U/M3U8 playlists for streaming live TV channels or individual online media streams.
    Compact Mini Mode: Switch to a Mini Player for a seamless, space-saving playback experience while you multitask. ...
    Downloads: 0 This Week
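
As a rough illustration of what a spectrum visualizer like the one described above computes for each audio block, here is a minimal sketch. It is not Magic Box's code: the function names `magnitude_spectrum` and `to_bars` are hypothetical, a naive DFT stands in for a real FFT to keep the example dependency-free, and the input is a synthetic tone rather than loopback audio.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Return magnitudes for bins 0..N/2 of a Hann-windowed block (naive DFT)."""
    n = len(samples)
    # A Hann window tapers the block edges to reduce spectral leakage.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, s in enumerate(samples)
    ]
    magnitudes = []
    for k in range(n // 2 + 1):
        # Correlate the block with a complex sinusoid at bin k.
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        magnitudes.append(abs(acc))
    return magnitudes


def to_bars(magnitudes, n_bars):
    """Group adjacent bins into bars, keeping the peak magnitude of each group."""
    n_bars = min(n_bars, len(magnitudes))
    size = len(magnitudes) // n_bars
    # Any leftover bins at the top of the spectrum are dropped.
    return [max(magnitudes[i:i + size]) for i in range(0, size * n_bars, size)]


# Hypothetical input: one 64-sample block holding a tone centred on bin 8.
block = [math.sin(2 * math.pi * 8 * i / 64) for i in range(64)]
spectrum = magnitude_spectrum(block)
bars = to_bars(spectrum, 8)
print(spectrum.index(max(spectrum)), bars.index(max(bars)))
```

A real-time visualizer would repeat this per block of captured audio and redraw the bars each time; an FFT routine would replace the quadratic-time loop used here.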
  • 9
    EnCodec

    State-of-the-art deep learning based audio codec

    ...It employs a convolutional encoder–decoder architecture trained with perceptual loss functions that optimize for human auditory quality rather than raw waveform distance. The model can operate in real time and supports variable bandwidths, bitrates, and multi-band audio. Encodec has applications in speech and music compression, generative modeling, and efficient data transmission for communication systems. The repository includes pretrained checkpoints, PyTorch inference code, and examples for integrating Encodec as a module in downstream generative or streaming systems.
    Downloads: 1 This Week
  • 10
    Spleeter

    Deezer source separation library including pretrained models

    ...It makes it easy to train music source separation models (assuming you have a dataset of isolated sources), and provides already trained state of the art models for performing various flavours of separation. 2 stems and 4 stems models have state of the art performances on the musdb dataset. Spleeter is also very fast as it can perform separation of audio files to 4 stems 100x faster than real-time when run on a GPU. We designed Spleeter so you can use it straight from command line as well as directly in your own development pipeline as a Python library. It can be installed with Conda, with pip or be used with Docker.
    Downloads: 84 This Week
  • 11
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the instructions in the usage docs. ...
    Downloads: 17 This Week
  • 12
    Pyo Synth

    A GUI to help with pyo synthesizer scripts manipulation.

    Pyo Synth is an open source application that makes pyo scripts easier to manipulate by letting you control them with a MIDI keyboard. The interface lets you set up every control on your keyboard and link it to parameters in your script at runtime. You can also save your progress directly in the pyo script. See the manual for more details on the features.
    Downloads: 0 This Week
  • 13

    Distant Speech Recognition

    Beamforming and Speech Recognition Toolkit

    BTK contains C++ and Python libraries that implement speech processing and microphone array techniques such as speech feature extraction, speech enhancement, speaker tracking, beamforming, dereverberation and echo cancellation algorithms. The Millennium ASR provides C++ and Python libraries for automatic speech recognition. The Millennium ASR implements a weighted finite state transducer (WFST) decoder, training and adaptation methods. These toolkits are meant for facilitating research and...
    Downloads: 0 This Week
  • 14
    Strasheela is a constraint-based music composition system. The user defines music theories as sets of compositional rules, and the system creates music that complies with these theories. The user interface is the programming language Oz.
    Downloads: 0 This Week
  • 15

    birdsim

    3d bird calls simulator

    Birdsim is a Python application whose goal is to simulate bird calls in a 3D environment. Birdsim's main use case is to create an ambient, natural soundscape in homes and offices. Try it on a cold rainy day and feel the force of nature!
    Downloads: 0 This Week
  • 16
    Camposer is a real-time music composition program. With your help and feedback, we hope to grow it into a versatile multi-platform, multi-interface beast capable of making music for anyone from the playful to the serious user.
    Downloads: 0 This Week
  • 17
    Drum Count is a simple tool which analyzes sound in real-time to measure the number of strokes played on a drumset.
    Downloads: 1 This Week
  • 18
    Software for live sampling and audio processing. Algorithmic composition and improvised audio manipulation in real time. The audio engine uses Csound, and the composition logic is built with Python.
    Downloads: 0 This Week
  • 19
    Grace is another Csound frontend with standard .csd files as its main building block. Its main focus is controlling Csound in real time from MIDI, both live from keyboards and in a MIDI-enabled sequencer environment.
    Downloads: 0 This Week
  • 20
    pkaudio is a real-time DSP framework written in C++ that uses a high-performance messaging paradigm to allow uninterrupted use from high-level languages like Python (client code provided).
    Downloads: 0 This Week
  • 21
    A modular audio programming language, designed for writing applications quickly. Its main goal is real-time audio processing, but it should be usable for any kind of development.
    Downloads: 0 This Week
  • 22
    MyParo: My Playlist and Repository Organizer, Explorer-like organizer of MP3 repository and playlists, allows access to real-time playstream (past/present/future), controls an external MP3 player (Winamp/XMMS), cross-platform via wxPython
    Downloads: 0 This Week
  • 23
    Real-time sound processing software. It can be used both for general audio processing/editing and as a fuzzbox. Based on a very flexible plug-in system, it is written in Python and currently uses PortAudio for sound input/output.
    Downloads: 0 This Week