Showing 50 open source projects for "model-builder"

  • 1
    OpenVINO AI Plugins for Audacity

    A set of AI-enabled effects, generators, and analyzers for Audacity

    A set of AI-enabled effects, generators, and analyzers for Audacity. These AI features run 100% locally on your PC, no internet connection is necessary. OpenVINO™ is used to run AI models on supported accelerators found on the user's system such as CPU, GPU, and NPU.
    Downloads: 133 This Week
  • 2
    NovaSR

    A lightning fast audio upsampler

    NovaSR is an extremely lightweight and high-performance audio upsampling model that transforms low-quality 16 kHz audio into clearer, high-fidelity 48 kHz audio with remarkable speed and efficiency. At only about 50 KB in size, the model is orders of magnitude smaller than typical audio super-resolution networks, yet it achieves high quality and realtime performance thanks to its compact architecture and efficient convolutional design.
    Downloads: 0 This Week
  • 3
    Moonshine Voice

    Fast and accurate automatic speech recognition (ASR) for edge devices

    ...Moonshine supports multiple platforms including mobile, desktop, and embedded systems, and provides example projects to accelerate integration into real-world products. The toolkit also includes specialized model variants, including monolingual options that improve accuracy for specific languages. Overall, Moonshine serves developers building privacy-conscious, on-device voice interfaces that demand high performance with minimal resource overhead.
    Downloads: 7 This Week
  • 4
    PersonaPlex

    PersonaPlex code

    PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional AI assistants typically lack. ...
    Downloads: 1 This Week
  • 5
    Moshi

    A speech-text foundation model for real time dialogue

    Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi compresses 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps in a fully streaming manner (latency of 80 ms, the frame size), yet performs better than existing non-streaming codecs such as SpeechTokenizer (50 Hz, 4 kbps) or SemantiCodec (50 Hz, 1.3 kbps).
    Downloads: 1 This Week
  • 6

    lbat

    lmms projects builder and tagger

    Wrapper atop "lmms render" that keeps rendered tracks up to date with lmms projects. Creates EBU R128 compliant tracks.
    Downloads: 0 This Week
  • 7
    CommBusTerm

    "CommBusTerm"

    "CommBusTerm" is based on a "Raspberry Pi 4 Model B" or "Raspberry Pi 5", and the "kmicommbusmediator" interface, which provides connectivity to the "PPG Communication Bus" which is used by PPG synthesizers such as the "Wave 2.2", "Wave 2.3", "EVU". Hardware: * "kmicommbusmediator": https://sourceforge.net/projects/kmicommbusmediator/ * "kmiwaveram": https://sourceforge.net/projects/kmi-wave-ram/ Software: * "cbmedimonitor" control tool * "WaveProgEdit" sound-program editor * "WaveBackup" backups * "waveaddsynth" for additive synthesis * "wavetransientutil" for transient sounds Copyright (C) 2022-2026 by Klaus Michael Indlekofer. ...
    Downloads: 1 This Week
  • 8
    kmicommbusmediator

    "KMI Comm. Bus Mediator"

    "KMI Comm. Bus Mediator" ("kmicommbusmediator") The project "kmicommbusmediator" provides connectivity for the "Raspberry Pi 4 Model B" and "Raspberry Pi 5" to the "PPG Communication Bus" which is used by PPG devices such as the "Wave 2.2", "Wave 2.3", "EVU", "PRK", "Waveterm A", "Waveterm B". "kmicommbusmediator" contains hardware and software components: * "kmicommbusmediator-HW": "Comm. Bus" hardware-interface * "CBMEDIBIOS": firmware for "kmicommbusmediator-HW" * "cbmedimonitor": control tool for "Raspberry Pi" Utilities: "wtafloppylist", "wtbfloppylist", ... ...
    Downloads: 1 This Week
  • 9
    VCClient

    Software that uses AI to perform real-time voice conversion

    VCClient is a real-time voice conversion system that uses machine learning models to transform a speaker’s voice into another voice with minimal latency. It is designed for live applications such as streaming, gaming, and virtual communication, where immediate feedback is essential. The system supports multiple voice conversion models, including RVC and other neural network-based approaches, allowing users to switch between different voices or customize their output. It provides both a...
    Downloads: 32 This Week
  • 10

    Ultimate Media Downloader

    An open-source media downloader for video and audio

    ...Whether you're downloading a single YouTube video, extracting audio from Spotify playlists, archiving TikTok content, or batch-processing entire music libraries, UMD handles it all with elegance and efficiency. It consists of: 1. Unified Interface: One command, 1000+ platforms. No tool shopping, no mental model switching. 2. Production-Ready, Zero Friction Installation: Most users go from hearing about the tool to downloading content in under 5 minutes. 3. Active Maintenance: Codeberg hosting (after GitHub suspension) demonstrates commitment to long-term availability.
    Downloads: 0 This Week
  • 11
    Competent Audio

    Machine graph audio engine for computer games

    ...It is written in C, but is designed for interoperability with other languages. Windows and Linux binaries for x86 and amd64 are available. CA uses a machine graph model with support for arbitrary numbers of machines, limited only by the available system resources: - Samplers play back audio clips. - Mixers combine audio signals and optionally perform signal processing. - Sinks send audio signals to an output device. Stereo and mono sound output is supported via a slightly customized version of libsoundio 2.0. ...
    Downloads: 0 This Week
  • 12

    abcCairo

    Extend abcm2ps to support direct generation of PNG, SVG and PDF files

    ...This project takes out the PostScript commands to generate various musical symbols and instead replaces them with Cairo function calls that generate the same musical symbols. The Cairo graphics library is an open-source graphics library offering a similar 2D graphics model to PostScript. The Cairo library can write to a GTK canvas, allowing integration with programs using the GTK+ toolkit, or it can write to an image file in a choice of formats: PNG, SVG or PDF. This means that other programs can have access to the abcm2ps music-rendering capability without having to incorporate a PostScript interpreter.
    Downloads: 0 This Week
  • 13

    NAM-Runner

    Batch file to install and run NAM (neural-amp-modeler) easily.

    A Windows 10 batch file that installs the NAM model trainer (neural-amp-modeler) by Steven Atkinson and runs it straight into the GUI application. Fully automated. Custom one-time installation of everything you need to train neural network models of guitar amps and more for the NAM VST plugin; no Conda required. Runs as a launcher afterwards. Portable installation. The new PyTorch includes the CUDA runtime for fast Nvidia GPU support.
    Downloads: 9 This Week
  • 14
    EnCodec

    State-of-the-art deep learning based audio codec

    ...It employs a convolutional encoder–decoder architecture trained with perceptual loss functions that optimize for human auditory quality rather than raw waveform distance. The model can operate in real time and supports variable bandwidths, bitrates, and multi-band audio. EnCodec has applications in speech and music compression, generative modeling, and efficient data transmission for communication systems. The repository includes pretrained checkpoints, PyTorch inference code, and examples for integrating EnCodec as a module in downstream generative or streaming systems.
    Downloads: 2 This Week
  • 15
    Coqui STT

    The deep learning toolkit for speech-to-text

    Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. Multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure. Effortlessly clone the voices of your talent and have the clone handle the problems...
    Downloads: 3 This Week
  • 16
    VAD

    Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM

    This repository is a voice activity detection (VAD) toolkit that implements multiple models (DNN, bDNN, LSTM, ACAM) for detecting speech versus non-speech in audio. It also provides a dataset recorded in varied real-world settings (e.g. bus stop, construction site, park, room) with ground-truth labels, acoustic feature extraction (multi-resolution cochleagram, MRCG), and post-processing modules (e.g. smoothing, thresholds). The toolkit supports both MATLAB and Python/TensorFlow components (for...
    Downloads: 1 This Week
  • 17
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper, and relies on Google's TensorFlow to make the implementation easier. A pre-trained English model, along with other important inference material, can be downloaded from the DeepSpeech releases page by following the instructions in the usage docs.
    Downloads: 3 This Week
  • 18
    XZVoice

    Free and open source text-to-speech software

    ...Technically, it models multi-level rhythmic pauses to achieve a natural synthesized rhythm, and it combines acoustic and linguistic parameters to build multiple automatic prediction models based on deep learning. The pronunciation model is trained on massive audio data, so the synthesized voice sounds realistic, full, cadenced, and expressive, and its MOS score has reached a professional level for the industry.
    Downloads: 0 This Week
  • 19
    TTS

    Deep learning for text to speech

    ...TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. Released models in PyTorch, TensorFlow and TFLite. Tools to curate Text2Speech datasets under dataset_analysis. Demo server for model testing. Notebooks for extensive model benchmarking. Modular (but not too much) code base enabling easy testing for new ideas. Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech). Speaker Encoder to compute speaker embeddings efficiently. Vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN, WaveGrad, WaveRNN). ...
    Downloads: 1 This Week
  • 20
    Carnatic Music Guru / JRaaga

    Carnatic Music Guru - JRaaga

    VISIT THIS PAGE AS MORE FEATURES ARE BEING ADDED. If you have downloaded 2.03 or above, use Help>Check Updates to download the latest version. If that does not update, download CMGUpdater from here: update/CMGUpdater.jar. Copy it to the update/ folder of your JRaaga installation path, then try Help>Check for updates again. If it still does not work, delete the existing installation, then download the latest version and try again. Carnatic Music Guru is a tutor/player/lesson generator. YOU NEED Java Runtime...
    Downloads: 1 This Week
  • 21
    SoundJS

    JavaScript library for working with audio

    SoundJS is a library to make working with audio on the web easier. It provides a consistent API for playing audio in different browsers, including using a target plugin model to provide an easy way to provide additional audio plugins like a Flash fallback (included, but must be used separately from the combined/minified version). A mechanism has been provided for easily tying in audio preloading to PreloadJS. The core API for playing sounds. Call createjs.Sound.play(sound, ...options), and a sound instance is created that can be used to control the audio, and dispatches events when it is complete, loops, or is interrupted. ...
    Downloads: 0 This Week
  • 22
    RuneAudio

    Free and open source Hi-Fi music player for embedded hardware

    RuneAudio is free and open source software that turns embedded hardware into Hi-Fi music players. We want to make a cheap, low-consumption and silent mini-PC perform as a high-fidelity digital source. RuneAudio features a custom-built Linux distribution (RuneOS) and a web player (RuneUI) that allows remote control of playback and setup options from multiple devices (desktop PC, netbook, tablet, smartphone).
    Downloads: 46 This Week
  • 23
    Airtime

    Open source broadcast automation software for scheduling and playout

    Airtime lets you take total control of your radio station via the web with intelligent archive management, powerful search, an easy playlist builder, a simple scheduling calendar and rock-solid automated playout. Features include Smart Blocks, live assist modes, WAV, FLAC, AAC, MP3 and OGG support, fades, cues, playlists, programme calendar, Icecast, Shoutcast and SoundCloud integration, DJ and station manager roles, jQuery widgets, Liquidsoap playout, and record and rebroadcast functionality. ...
    Downloads: 7 This Week
  • 24

    High-order HMM in Matlab

    Implementation of duration high-order hidden Markov model in Matlab.

    Implementation of duration high-order hidden Markov model (DHO-HMM) in Matlab with application in speech recognition.
    Downloads: 0 This Week
  • 25
    FlashWavRecorder

    Simple flash file for recording audio and saving as a WAV

    FlashWavRecorder is a Flash-based tool that enables recording audio from a user's microphone and exporting it as a WAV file in real time. It uses Flash Player to access audio input and communicate with JavaScript to control recording sessions from the browser. This was especially useful before native browser audio APIs became widely supported and remains relevant in legacy systems requiring Flash-based audio capture.
    Downloads: 0 This Week